GSoC 2026 Proposal Draft – Idea 4: Chat with your note collection using AI – lakshsarda137

Links:

1. Introduction
  • Background/Studies: I'm Laksh, an undergraduate majoring in Computer Science and Mathematics at Rice University in the United States. I've taken courses on DSA, Data Science & Machine Learning, and Linear Algebra, among others.
  • Programming experience: I've been building projects, alone or with friends, for the past 3-4 years, and along the way have learnt Python, TypeScript, JavaScript, and a number of frameworks across AWS/GCP/Azure.
  • Experience with open source: Until recently I had only forked open-source repositories and customized them for my own needs; I have now started contributing to open-source projects.
2. Project Summary
  • What problem it solves: As a Joplin note collection grows over time, users may struggle to find notes they took on a relevant topic a while ago.

  • What will be implemented: A Joplin plugin that lets users chat with their entire note collection through an LLM-powered conversational interface. The plugin will use Retrieval-Augmented Generation (RAG) to ground LLM responses in the user's actual notes, keeping answers factual and traceable.

  • Expected outcome: Users can open a panel in Joplin, ask natural-language questions about their notes, and receive accurate answers with clear references to the source notes. Follow-up questions refine the conversation. Hallucination is minimized as far as possible: every claim should be backed by retrieved note content.

3. Technical Architecture

Architecture:

The plugin follows a standard RAG pipeline with three core components:

User Query → Embedding + Retrieval → Context Assembly → LLM Generation → Response with Citations

  1. Indexing Pipeline: On the first run (and incrementally thereafter), the plugin reads all notes via the joplin.data API, chunks them, generates embeddings, and stores them in a local vector database.

  2. Retrieval Engine: When the user asks a question, the query is embedded, and the top-k most relevant note chunks are retrieved via similarity search (a sketch follows this list).

  3. Generation with Citations: Retrieved chunks are assembled into a prompt with the user's question, sent to the LLM, and the response includes references to source note IDs/titles that the user can click to navigate to.
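To make the retrieval step concrete, here is a minimal sketch of top-k similarity search over stored chunk embeddings in plain TypeScript. The `Chunk` shape and helper names are hypothetical; in the actual plugin, Vectra's index would perform this search, but the underlying operation is the same.

```typescript
// Hypothetical shape of an indexed chunk: the embedding plus the
// metadata needed to cite the source note later.
interface Chunk {
  noteId: string;
  noteTitle: string;
  text: string;
  embedding: number[];
}

// Standard cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query embedding and keep the top k.
function retrieveTopK(queryEmbedding: number[], chunks: Chunk[], k = 5): Chunk[] {
  return chunks
    .map(chunk => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(({ chunk }) => chunk);
}
```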

Components and libraries/technologies I plan to use:

| Component | Technology | Rationale |
| --- | --- | --- |
| Vector Store | Vectra | Pure TypeScript, no native dependencies, runs in the plugin sandbox, stores the index locally alongside Joplin data |
| Embeddings | OpenAI text-embedding-3-small or a local alternative (e.g. Transformers.js) | User-configurable; OpenAI for quality, local for privacy |
| LLM | OpenAI / Anthropic / Ollama (user-configurable) | Flexibility: users who want privacy can use local Ollama models |
| Chunking | Recursive character text splitter | Splits notes into ~500-token chunks with overlap, preserving note metadata (ID, title) per chunk (sketched below) |
| UI | Joplin Panel API | Chat interface rendered as a sidebar panel using HTML/CSS/JS |
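As a sketch of the chunking row above: a simplified splitter using fixed-size character windows with overlap (roughly 2000 characters ≈ 500 tokens; the sizes are placeholders, and a real recursive splitter would prefer paragraph and sentence boundaries):

```typescript
interface NoteChunk {
  noteId: string;
  noteTitle: string;
  text: string;
}

// Split a note body into overlapping windows so context at chunk
// boundaries is not lost. This sketch uses plain character offsets;
// the real implementation would split recursively on paragraphs first.
function chunkNote(
  noteId: string, noteTitle: string, body: string,
  chunkSize = 2000, overlap = 200,
): NoteChunk[] {
  const chunks: NoteChunk[] = [];
  for (let start = 0; start < body.length; start += chunkSize - overlap) {
    chunks.push({ noteId, noteTitle, text: body.slice(start, start + chunkSize) });
    if (start + chunkSize >= body.length) break; // last window reached the end
  }
  return chunks;
}
```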

Changes to the Joplin Codebase:

This project is implemented as a plugin, so no changes to Joplin core are needed. The plugin uses the following APIs (a skeleton sketch follows the list):

  • joplin.data.get(['notes'], { fields: ['id', 'title', 'body', 'updated_time', 'parent_id'] }) for note access

  • joplin.data.get(['folders']) and joplin.data.get(['tags']) for context

  • joplin.workspace.onNoteChange() for incremental index updates

  • joplin.views.panels.create() for the chat UI

  • joplin.settings.registerSection() / registerSettings() for API key configuration and model selection

  • joplin.commands.register() for keyboard shortcut to open chat
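To show how these APIs fit together, here is a minimal plugin skeleton. The setting keys, panel ID, and command name are placeholders, not final names:

```typescript
import joplin from 'api';
import { SettingItemType } from 'api/types';

joplin.plugins.register({
  onStart: async () => {
    // Settings section for API key and model selection.
    await joplin.settings.registerSection('noteChat', {
      label: 'Note Chat',
      iconName: 'fas fa-comments',
    });
    await joplin.settings.registerSettings({
      'noteChat.apiKey': {
        value: '', type: SettingItemType.String, secure: true,
        section: 'noteChat', public: true, label: 'API key',
      },
      'noteChat.model': {
        value: 'gpt-4o-mini', type: SettingItemType.String,
        section: 'noteChat', public: true, label: 'Model',
      },
    });

    // Sidebar chat panel.
    const panel = await joplin.views.panels.create('noteChatPanel');
    await joplin.views.panels.setHtml(panel, '<div id="chat">Loading…</div>');

    // Incremental index updates as notes change.
    await joplin.workspace.onNoteChange(async event => {
      // event.id identifies the changed note; re-embed only that note.
    });

    // Command (bindable to a keyboard shortcut) to toggle the chat panel.
    await joplin.commands.register({
      name: 'noteChat.toggle',
      label: 'Toggle note chat',
      execute: async () => {
        const visible = await joplin.views.panels.visible(panel);
        await joplin.views.panels.show(panel, !visible);
      },
    });
  },
});
```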

For the technical architecture, I also plan to use several strategies to mitigate problems I foresee:

Incremental Indexing Strategy

Rather than re-indexing all notes on every launch (sketched in code below):

  1. Store updated_time per indexed note

  2. On startup, query notes with updated_time > last_index_time

  3. Re-embed only changed/new notes; remove deleted notes from the index

This keeps the plugin responsive even with thousands of notes.
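A minimal sketch of this strategy, assuming a hypothetical `reembedNote` helper that chunks, embeds, and upserts one note into the vector store (deletion handling is omitted for brevity):

```typescript
import joplin from 'api';

// Hypothetical helper: chunk + embed + upsert one note into the vector store.
async function reembedNote(note: { id: string; title: string; body: string }): Promise<void> {
  // ... chunk the note, embed the chunks, upsert into the index ...
}

// Re-index only notes modified since the last run. `lastIndexTime` is a
// timestamp the plugin persists between launches (e.g. in a setting).
async function updateIndex(lastIndexTime: number): Promise<void> {
  let page = 1;
  while (true) {
    // Page through notes newest-first so we can stop as soon as we
    // reach notes that are already indexed.
    const { items, has_more } = await joplin.data.get(['notes'], {
      fields: ['id', 'title', 'body', 'updated_time'],
      order_by: 'updated_time',
      order_dir: 'DESC',
      page,
    });
    for (const note of items) {
      if (note.updated_time <= lastIndexTime) return; // older notes are already indexed
      await reembedNote(note);
    }
    if (!has_more) return;
    page++;
  }
}
```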

Handling Hallucination

  • Strict grounding prompt: The system prompt instructs the LLM to answer only from the provided context and to say "I don't have information about that in your notes" when the context is insufficient (sketched after this list)

  • Citation enforcement: Each retrieved chunk is tagged with its source note ID; the UI renders citations as clickable links

  • Confidence thresholding: If retrieved chunks have low similarity scores, warn the user that results may be incomplete
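For the strict grounding prompt specifically, here is a sketch of how the instructions and retrieved chunks could be assembled. The exact wording is illustrative; the numbered tags like `[1]` give the UI something to map back to note IDs for clickable citations:

```typescript
// Assemble a grounded prompt from retrieved chunks. Each chunk is tagged
// [n] so the model's citations can be mapped back to note IDs in the UI.
function buildPrompt(
  question: string,
  chunks: { noteId: string; noteTitle: string; text: string }[],
): string {
  const context = chunks
    .map((c, i) => `[${i + 1}] (note: "${c.noteTitle}", id: ${c.noteId})\n${c.text}`)
    .join('\n\n');
  return [
    'Answer the question using ONLY the context below.',
    'After each claim, cite the supporting source as [n].',
    'If the context is insufficient, reply exactly:',
    '"I don\'t have information about that in your notes."',
    '',
    'Context:',
    context,
    '',
    `Question: ${question}`,
  ].join('\n');
}
```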

Potential Challenges:

  • Large note collections: Embedding thousands of notes takes time. Mitigated by incremental indexing and a progress indicator during initial setup.

  • Plugin sandbox constraints: Plugins run in a sandboxed environment. Vectra's pure-JS approach avoids native dependency issues. For local models, Ollama runs as a separate process the plugin communicates with via HTTP.

  • Privacy concerns: Some users won't want notes sent to external APIs. The plugin will support fully local pipelines (Ollama + local embeddings) as an option.

  • Context window limits: Long notes need chunking. The chunking strategy preserves semantic coherence and overlaps chunks to avoid losing context at boundaries.

4. Implementation Plan

Week 1–2

  • Implement note ingestion pipeline: fetch all notes via data API, chunk them with metadata

  • Implement embedding generation (OpenAI API integration)

  • Set up Vectra vector store, store and persist embeddings locally

Week 3–4

  • Implement query pipeline: embed user query, retrieve top-k chunks, assemble context

  • Implement LLM integration for answer generation with citation extraction (only external providers for now)

  • Basic CLI/console-based testing of the full RAG pipeline

Week 5–6

  • Build the chat panel UI using Joplin's Panel API

  • Implement conversation history (multi-turn chat)

  • Add clickable note references that navigate to source notes

Midterm Evaluation

  • Deliverable: Working plugin with basic chat functionality, note indexing, and source citations

Week 7–8

  • Implement incremental indexing (only re-embed changed notes)

  • Add onNoteChange listener for real-time index updates

  • Implement settings panel: API key config, model selection, index management

Week 9–10

  • Add support for local/Ollama models as an alternative to cloud APIs

  • Add support for local embeddings (Transformers.js or Ollama embeddings)

  • Performance optimization: batch embedding (sketched below), caching, lazy loading
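As a sketch of the batch-embedding optimization: the OpenAI embeddings endpoint accepts an array of inputs, so chunks can be embedded many per request instead of one request per chunk (the batch size would be tuned; error handling omitted):

```typescript
// Embed many chunk texts in one request to cut HTTP overhead and latency.
// Uses the OpenAI embeddings endpoint, which accepts an array of inputs
// and returns results in the same order.
async function embedBatch(texts: string[], apiKey: string): Promise<number[][]> {
  const response = await fetch('https://api.openai.com/v1/embeddings', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'text-embedding-3-small', input: texts }),
  });
  const json = await response.json();
  return json.data.map((d: { embedding: number[] }) => d.embedding);
}
```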

Week 11–12

  • Comprehensive testing: unit tests for chunking/retrieval, integration tests for full pipeline

  • Edge case handling: empty notes, very long notes, notes with images/resources

  • Documentation: user guide, developer setup guide, architecture docs

Final Week

  • Final polish, bug fixes from mentor feedback

  • Submit final evaluation

5. Deliverables

  • Joplin plugin (publishable to the Joplin plugin repository) with:

    • Natural language chat interface for querying notes

    • RAG pipeline with local vector storage

    • Clickable source citations linking back to original notes

    • Support for multiple LLM providers (OpenAI, Anthropic, Ollama)

    • Incremental indexing for efficient updates

    • Configurable settings (model, API keys, privacy mode)

  • Tests: Unit tests for chunking, embedding, retrieval, and citation extraction; integration tests for the full query pipeline

  • Documentation: User-facing guide (how to install, configure, and use) and developer documentation (architecture, how to extend)

6. Availability

  • Weekly availability: ~40 hours per week during GSoC

  • Time zone: IST (Indian Standard Time)

  • Other commitments: I will be on summer break from university during the GSoC period, so I can dedicate substantial time to the project. No other major commitments.

I think you should provide more details about how it will work in certain edge cases. Some people have hundreds of thousands of notes, including very large notes. How will your plugin handle this? LLM models have a limited context to work with.

How will it deal with noise like resource IDs, HTML tags, etc.?

As for the model not hallucinating certain answers, I don't think we can claim this. Even if it's based on the actual data, that data has presumably been compacted and may have been misunderstood. But it's good to try to deal with this problem as much as possible.

Also it's great if you can make it work with local AI, but that might be hard. You might have to rely on a cloud API, in which case there may be cost considerations - we don't want to use up all the user's tokens on the first run.

Additionally, you don't discuss the UI much - we'd need information about this as it's not that easy to implement well.

Hey @lakshsarda137, Laurent raised several questions about scale, noise handling, and UI - have you had a chance to look at those? Especially curious how you're thinking about scale.