GSoC 2026 Proposal Draft – Idea 4: Chat with your note collection using AI – Harsh16gupta

I did go through it and ran both embedding models (BGE-small-en-v1.5 and all-MiniLM-L6-v2) with Transformers.js in the plugin environment.
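The timing comparison can be sketched as a small harness (all names here are hypothetical, not from my actual test code). The `embed` callback stands in for whatever does the work; with Transformers.js it would wrap a `feature-extraction` pipeline for the chosen model.

```javascript
// Minimal timing harness (hypothetical names). `embed` is any async
// function mapping a string to a vector; with Transformers.js it would
// wrap a 'feature-extraction' pipeline for BGE-small or MiniLM.
async function benchmarkEmbedder(embed, notes) {
  const start = Date.now();
  const vectors = [];
  for (const note of notes) {
    vectors.push(await embed(note));
  }
  const elapsedMs = Date.now() - start;
  return {
    elapsedMs,
    msPerNote: elapsedMs / notes.length,
    dim: vectors[0].length,
  };
}

// Usage with a dummy embedder (a real run would load the actual model):
const dummyEmbed = async (text) =>
  new Array(384).fill(0).map((_, i) => (i + text.length) % 7);
benchmarkEmbedder(dummyEmbed, ['note one', 'note two']).then((r) => {
  console.log(r.dim); // 384
});
```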

My findings: BGE-small takes roughly twice as long to embed the same notes as all-MiniLM-L6-v2. Both models appear to share the same max token limit (512; this one is a little confusing, as MiniLM's token limit is not clearly documented) and the same output dimension (384), so the quality loss from switching to MiniLM is minimal while the speed gain is very large.
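Since both models output 384-dimensional vectors, either one can back the same retrieval step: rank note chunks against the query embedding by cosine similarity. A minimal sketch (the chunk shape is a hypothetical example, not the plugin's actual data model):

```javascript
// Cosine similarity between two equal-length vectors. With normalized
// embeddings this reduces to a dot product, but it is computed in full here.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the k chunks most similar to the query embedding.
// Each chunk is assumed to carry its precomputed `embedding`.
function topK(queryVec, chunks, k) {
  return chunks
    .map((c) => ({ ...c, score: cosineSimilarity(queryVec, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```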

I have added all the images of the results to the GitHub repo README: testing-embedding-model

I have explained it in more detail here.

The architecture does not rule this out, but one thing would need to change: the vector storage. I am currently using Vectra, which stores embeddings as local files on disk. Mobile plugins don't have filesystem access, so that would break. To fix this I can use Joplin's built-in data API, which stores data directly on notes.
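A rough sketch of how that storage could work: Joplin's plugin data API stores string values per note, so an embedding can be packed into base64 before saving and unpacked on load. This assumes a Node-style `Buffer` is available in the plugin runtime; the `joplin.data.userDataSet` / `userDataGet` calls in the comment are illustrative usage, not tested code.

```javascript
// Pack a Float32Array embedding into a base64 string so it can be
// stored as per-note data, and unpack it again on read.
function encodeEmbedding(vec) {
  return Buffer.from(new Float32Array(vec).buffer).toString('base64');
}

function decodeEmbedding(b64) {
  const buf = Buffer.from(b64, 'base64');
  // Copy the relevant bytes so the Float32Array view is correctly aligned.
  const bytes = buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength);
  return Array.from(new Float32Array(bytes));
}

// In the plugin this would sit behind calls like (hypothetical usage):
//   await joplin.data.userDataSet(ModelType.Note, noteId, 'embedding', encodeEmbedding(vec));
//   const b64 = await joplin.data.userDataGet(ModelType.Note, noteId, 'embedding');
```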

Other barrier:
Ollama will not work on mobile, but OpenAI and Gemini will work fine.
