I did go through it and ran both embedding models (BGE-small-en-v1.5 and all-MiniLM-L6-v2) with transformers.js in the plugin environment.
My findings: BGE-small takes roughly twice as long to embed the same notes as all-MiniLM-L6-v2. Both models share the same max token limit (512) and output dimension (384), so the quality loss is minimal while the speed gain is very large. (The token limit for MiniLM is a little confusing, since its documentation does not state it explicitly.)
I have added all the images of the results to the GitHub repo README: testing-embedding-model
I have explained it in more detail: here
The architecture does not rule it out, but one thing would need to change: the vector storage. I am currently using Vectra, which stores embeddings as local files on disk. Mobile plugins don't have filesystem access, so that would break. To fix this I can use Joplin's built-in data API, which stores data directly on notes.
Other barrier:
Ollama will not work on mobile, but OpenAI and Gemini will work perfectly.