GSoC 2026: Opportunities for the AI projects

Harsh16gupta · 19 March 2026 18:26

The idea of having a shared embedding index makes sense, but each idea uses embeddings a bit differently. The ideas can be grouped into two types:

Chunk-based projects
Idea 1 (AI Search) and Idea 4 (Chat with notes) need chunk-level embeddings, since they retrieve specific parts of notes.
Note-based projects
Idea 3 (Categorisation) and Idea 2 (Note graphs) mostly compare whole notes, so a single vector per note is enough. This can be created by averaging the chunk embeddings.
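
To make the averaging step concrete, here is a minimal sketch of how a note-level vector could be derived from chunk vectors that are already in the index. The function name `note_embedding` is hypothetical, and the final L2 normalisation is my own assumption (it keeps cosine comparisons between notes well behaved); nothing here is taken from an existing Joplin API.

```python
from math import sqrt


def note_embedding(chunk_embeddings):
    """Mean-pool a note's chunk vectors into one note-level vector.

    chunk_embeddings: non-empty list of equal-length lists of floats.
    The result is L2-normalised (an assumption, not a requirement).
    """
    if not chunk_embeddings:
        raise ValueError("note has no chunk embeddings")
    dim = len(chunk_embeddings[0])
    if any(len(vec) != dim for vec in chunk_embeddings):
        raise ValueError("chunk embeddings must share one dimension")

    count = len(chunk_embeddings)
    # Component-wise mean over all chunks of the note.
    mean = [sum(vec[i] for vec in chunk_embeddings) / count for i in range(dim)]

    # Normalise to unit length; leave an all-zero vector untouched.
    norm = sqrt(sum(x * x for x in mean))
    return mean if norm == 0 else [x / norm for x in mean]
```

With this approach the note-based ideas would not need a second embedding pass: they could reuse whatever the chunk-based ideas have already stored.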

At the same time, the retrieval logic is not the same for every idea. For example, search and chat would use similarity-based retrieval (possibly with reranking and RAG), while categorisation would use clustering.
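
A rough sketch of that split, assuming the shared index simply exposes `(id, vector)` pairs: search and chat would rank chunks by similarity to a query, while categorisation would group note vectors. All names below are hypothetical, and the greedy threshold grouping is only a stand-in for whatever clustering algorithm a project actually picks.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 0.0 if norm_a == 0 or norm_b == 0 else dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_index, k=3):
    """Search/chat path: return ids of the k chunks most similar to the query.

    chunk_index: list of (chunk_id, vector) pairs.
    """
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunk_index]
    scored.sort(reverse=True)  # highest similarity first
    return [chunk_id for _, chunk_id in scored[:k]]


def threshold_clusters(note_index, threshold=0.8):
    """Categorisation path: greedy single-pass grouping of note vectors.

    A note joins the first cluster whose seed note is at least `threshold`
    similar; otherwise it starts a new cluster. Illustrative only.
    """
    clusters = []  # list of (seed_vector, [note_ids])
    for note_id, vec in note_index:
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(note_id)
                break
        else:
            clusters.append((vec, [note_id]))
    return [members for _, members in clusters]
```

Both functions read from the same kind of index, so the storage layer could be shared while each project keeps its own retrieval logic on top.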
