Chat with Your Note Collection Using AI - A RAG-Powered Joplin Plugin
Links
- Project idea: Idea #4 - Chat with your note collection using AI
- Primary mentor: @shikuz
- Secondary mentors: @HahaBill, @malekhavasi
- Name: Chanrattnak Mong
- GitHub profile: github.com/rattnak
- Forum introduction post: discourse.joplinapp.org
- Pull requests submitted to Joplin:
  - PR #14933 – Fixed a potential TypeError in syncedCount, deletedItemCount, and itemType when selectOne() returns null; corrected the return type of JoplinData.itemType() from Promise<ModelType> to Promise<ModelType | null> (open, under review)
- Other relevant development experience:
- json-schema-studio PR #157 – Hoist keyword set constants to module scope (merged)
- json-schema-studio PR #187 – Add SEO and social media meta tags (merged)
- json-schema-studio PR #200 – Fix NodeDetailsPopup responsive layout on narrow viewports (merged)
- json-schema-studio PR #209 – Fix format toggle editor panel placement (merged)
- OWASP Nest PR #4395 – Remove unused react-router-dom from frontend (merged)
1. Introduction
I am Chanrattnak Mong (Nak), based in Lynn, MA, USA (Eastern Time, UTC-4). I graduated with a B.Sc. in Computer Science from Fort Hays State University in December 2025, Magna Cum Laude (3.89 GPA). I currently work remotely as a Systems & Automation Analyst at FHSU, building ETL pipelines that sync course data across enterprise platforms via REST APIs and scheduled server jobs.
My core stack is TypeScript and React. I have shipped multiple full-stack projects end-to-end: a cross-platform fintech app (React Native, PostgreSQL, Plaid SDK) with 90%+ Jest coverage; a full-stack encryption app (React, TypeScript, Flask); and ETL pipelines processing 100K+ records in Python/SQL/Snowflake. I have completed three fully remote internships across three time zones (Italy, Singapore, USA), which mirrors the async communication model of GSoC mentorship.
My open-source contributions span three repositories (Joplin, JSON Schema Studio, OWASP Nest), giving me practical experience navigating unfamiliar codebases and iterating through PR review cycles.
2. Project Summary
Joplin users often build large, carefully curated note collections over years of research, reading, and web clipping. Despite this investment, the knowledge locked inside those notes is difficult to access: keyword search only works when you already know what to look for, and returns nothing if the words in your query don't match the words in the notes.
This project addresses that gap by treating the note collection as a personal knowledge base that the user can converse with. Rather than searching for terms, users ask questions in plain language and receive answers synthesized from their own content — with the ability to ask follow-up questions to dig deeper or refine the response. The interaction model is familiar (a chat interface), but the knowledge source is entirely the user's own notes, not a generic AI.
This proposal implements that as a Joplin desktop plugin using Retrieval-Augmented Generation (RAG). Notes are chunked, embedded locally, and stored in a SQLite vector index. At query time the most semantically relevant chunks are retrieved and passed to an LLM alongside the user's question. Every answer includes source citations that link back to the original notes, so users can verify and explore rather than just trust the output.
What will be implemented:
- Hybrid chunking pipeline (heading-aware + sliding window)
- Local embedding support via @xenova/transformers (no data leaves the device by default)
- sqlite-vec vector index with incremental sync
- Multi-provider LLM support (OpenAI, Anthropic, Ollama)
- React chat UI with conversational follow-up, streaming responses, source citations, and notebook filtering
Expected outcome: A production-ready Joplin plugin submitted to the official plugin repository, with documentation and a test suite. Estimated project size: 350 hours, matching the official idea scope.
Out of scope: Mobile support, voice input, image OCR, exporting chat as notes (post-GSoC stretch goals).
3. Technical Approach
Architecture
User Query
|
├── Query Embedding (local or API) --> SQLite Vector Index (sqlite-vec)
└── Joplin Search API (BM25 keyword)
|
v
Reciprocal Rank Fusion [score = Σ 1/(k + rank_i), k=60]
|
v
Top-K Relevant Note Chunks
|
v
Prompt Construction (system prompt + chunks + conversation history)
|
v
LLM API (OpenAI / Anthropic / Ollama)
|
v
Chat UI Response (with source citations linking to notes)
Joplin codebase changes
This project is implemented entirely as a plugin — no changes to Joplin core are required. It uses the stable plugin API (joplin.data, joplin.workspace, joplin.views.panels, joplin.settings) and the existing onNoteChange event for incremental sync.
The retrieval layer is designed behind an abstract interface exposing put(note) and query(text, options) so the conversational, prompt, and UI layers have no dependency on a specific storage or embedding implementation. This directly addresses the shared embedding infrastructure @shikuz raised in the "Opportunities for the AI projects" forum thread: if a shared index project is developed separately across Joplin AI plugins, this plugin can migrate to it by swapping the retrieval implementation without touching the chat, prompt construction, or UI layers.
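A minimal sketch of that boundary (type and method names beyond put and query are illustrative, not final):

```typescript
// Illustrative sketch of the retrieval abstraction; names are placeholders.
interface NoteChunk {
  noteId: string;
  noteTitle: string;
  chunkIndex: number;
  headingPath: string[];
  text: string;
}

interface QueryOptions {
  topK?: number;              // default 5
  notebookFilter?: string[];  // optional notebook/tag scoping
}

interface RetrievalIndex {
  // Chunk, embed, and upsert a note; called on initial ingest and onNoteChange.
  put(note: { id: string; title: string; body: string }): Promise<void>;
  // Hybrid search; returns chunks ranked by fused relevance.
  query(text: string, options?: QueryOptions): Promise<NoteChunk[]>;
  // Remove all chunks for a deleted note.
  remove(noteId: string): Promise<void>;
}
```

A future shared-index implementation would only need to satisfy this interface; the layers above it remain unchanged.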
Ingestion pipeline
Notes are fetched via joplin.data.get(['notes'], { fields: ['id', 'title', 'body', 'parent_id', 'updated_time'] }) in batches of 50. Each note body is SHA-256 hashed; unchanged notes are skipped on subsequent runs. Notes are chunked using a hybrid strategy: first split at Markdown heading boundaries (H1–H3) to preserve semantic sections, then subdivide large sections with a 400-token sliding window (50-token overlap), and merge fragments under 100 tokens with adjacent chunks. Each chunk stores its note ID, title, chunk index, heading path, and text. Before embedding, each chunk is prepended with a Contextual Chunk Header containing its note title and heading path — for example, [Q3 Sprint Plan > Backend Tasks] — so the embedding captures both content and structural context. Without this, a chunk like "migrate to the new provider" loses the context that it concerns authentication infrastructure, degrading retrieval precision for structurally similar phrases across unrelated notes.
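A condensed sketch of the change-detection and header-prepending steps described above (the Chunk shape is illustrative; the hybrid splitter itself is assumed, not shown):

```typescript
import { createHash } from 'crypto';

// Hypothetical chunk record produced by the hybrid splitter.
interface Chunk { noteId: string; index: number; headingPath: string[]; text: string; }

// Skip re-indexing when the body hash is unchanged since the last run.
function bodyHash(body: string): string {
  return createHash('sha256').update(body).digest('hex');
}

// Prepend a Contextual Chunk Header so the embedding captures structural
// context, e.g. "[Q3 Sprint Plan > Backend Tasks]\nmigrate to the new provider ...".
function withContextualHeader(noteTitle: string, chunk: Chunk): string {
  const path = [noteTitle, ...chunk.headingPath].join(' > ');
  return `[${path}]\n${chunk.text}`;
}
```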
Embedding and vector storage
Default: all-MiniLM-L6-v2 via @xenova/transformers (384-dimensional, 23 MB, runs fully in-process via ONNX — no data leaves the device). Optional: OpenAI text-embedding-3-small (1536-dimensional, requires API key). Embeddings are stored in a local SQLite database using sqlite-vec, which integrates naturally since Joplin already uses SQLite internally.
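A minimal sketch of the local embedding path, assuming the model is fetched under its hub ID Xenova/all-MiniLM-L6-v2:

```typescript
import { pipeline } from '@xenova/transformers';

let embedder: any;

// Mean-pooled, L2-normalized 384-dimensional embedding from the local ONNX model.
async function embed(text: string): Promise<Float32Array> {
  // Load once; the ~23 MB model is cached on disk after the first download.
  embedder ??= await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
  const output = await embedder(text, { pooling: 'mean', normalize: true });
  return output.data as Float32Array;
}

// sqlite-vec side (sketch): vectors live in a vec0 virtual table whose rowid
// maps back to a chunks table, e.g.
// CREATE VIRTUAL TABLE chunks_vec USING vec0(embedding float[384]);
```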
Retrieval and generation
At query time, two searches run in parallel: a cosine-similarity search against the sqlite-vec index, and a keyword search via joplin.data.get(['search'], { query: userQuestion }) which uses Joplin's built-in BM25 engine. The two ranked lists are merged using Reciprocal Rank Fusion with score = Σ 1/(k + rank_i) where k=60, and the top-K chunks (default K=5) are selected from the fused result. Pure vector search misses exact keyword matches — names, acronyms, specific terms — while BM25 misses semantic similarity; combining both via RRF captures what either alone would miss. An optional notebook/tag filter is applied before the merge, and duplicate chunks from the same note are removed after.
Prompt construction enforces a token budget split: approximately 300 tokens for the system instruction, up to 50% of the remaining budget for retrieved note excerpts (prioritized as grounding data), the remainder for conversation history (oldest turns dropped first), and approximately 1,200 tokens reserved for generation — ensuring retrieved notes always have guaranteed space regardless of conversation length. LLM providers (OpenAI, Anthropic, Ollama) are abstracted behind a unified interface; streaming is supported via each provider's streaming API.
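A minimal sketch of the fusion step above, assuming each retriever returns an ordered list of chunk IDs:

```typescript
// Reciprocal Rank Fusion: score(d) = Σ_i 1 / (k + rank_i(d)), with k = 60.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      const rank = i + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Usage: fuse vector and BM25 results, then keep the top K chunks.
// const topK = reciprocalRankFusion([vectorIds, bm25Ids]).slice(0, 5);
```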
Chat UI
Implemented as a joplin.views.panels React webview with: message thread, source citation links (opens note via joplin.commands.execute('openNoteInNewWindow', noteId)), notebook filter selector, indexing progress indicator, dark mode support aligned with Joplin's theme system, and keyboard accessibility.
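A minimal sketch of the panel wiring, using the stable panels API (the message shapes and the answerStream helper are illustrative):

```typescript
import joplin from 'api';

// Hypothetical entry point into the RAG pipeline, yielding LLM tokens.
declare function answerStream(question: string): AsyncIterable<string>;

joplin.plugins.register({
  onStart: async () => {
    // Create the chat panel and load the bundled React app into it.
    const panel = await joplin.views.panels.create('ragChatPanel');
    await joplin.views.panels.setHtml(panel, '<div id="root"></div>');
    await joplin.views.panels.addScript(panel, './webview/chat.js');

    // Webview -> plugin: receive a question, stream answer tokens back.
    await joplin.views.panels.onMessage(panel, async (message: any) => {
      if (message.type === 'ask') {
        for await (const token of answerStream(message.question)) {
          await joplin.views.panels.postMessage(panel, { type: 'token', token });
        }
      }
    });
  },
});
```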
Libraries and technologies
- @xenova/transformers – local ONNX embeddings
- sqlite-vec – vector similarity search in SQLite
- OpenAI SDK, Anthropic SDK, Ollama REST API – LLM providers
- React – chat UI webview
- Jest – unit and integration tests
Potential challenges
- Loading an ONNX model in Electron's renderer process via Webpack requires careful configuration of wasm loading and asset handling.
- sqlite-vec ships as a native extension; it must be bundled per-platform (Windows, macOS, Linux) and loaded at runtime. If unavailable, the plugin falls back to in-memory brute-force search.
Testing strategy
- Unit tests for the chunking module (edge cases: empty notes, heading-only notes, very long notes), hash-based change detection, and retrieval deduplication logic (see the sketch after this list).
- Integration tests using a controlled set of fixture notes to verify end-to-end retrieval quality and citation accuracy.
- Manual cross-platform testing on Windows, macOS, and Linux.
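A few representative Jest cases for the chunking edge cases listed above (chunkNote is the hypothetical module entry point; exact assertions depend on the final chunking rules):

```typescript
import { chunkNote } from './chunking'; // hypothetical module under test

describe('hybrid chunking', () => {
  it('returns no chunks for an empty note body', () => {
    expect(chunkNote({ id: 'n1', title: 'Empty', body: '' })).toHaveLength(0);
  });

  it('keeps a heading-only note as at most one small chunk', () => {
    const chunks = chunkNote({ id: 'n2', title: 'Headings', body: '# A\n## B' });
    expect(chunks.length).toBeLessThanOrEqual(1);
  });

  it('splits a very long section into overlapping windows', () => {
    const body = '# Long\n' + 'word '.repeat(5000);
    const chunks = chunkNote({ id: 'n3', title: 'Long', body });
    expect(chunks.length).toBeGreaterThan(1);
  });
});
```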
Documentation plan
- User-facing: setup guide, provider configuration, privacy FAQ explaining exactly what data is transmitted under each configuration.
- Developer-facing: architecture overview and instructions for running tests and building the plugin locally.
4. Implementation Plan
Community Bonding (May 1–24)
- Deep-dive into Joplin's plugin architecture and monorepo build system
- Study existing plugins for patterns (especially those using webviews and settings)
- Set up development environment; validate ONNX model loading and sqlite-vec in the Electron/Webpack environment
- Discuss scope adjustments with mentors
Week 1–2
- Plugin scaffold with yo joplin, TypeScript build pipeline
- Note fetching with pagination and SHA-256 change detection
Week 3–4
- Hybrid chunking module (heading-based + sliding window)
- @xenova/transformers local embedding integration
- sqlite-vec schema setup; insert, query, delete operations
- Unit tests for chunking, embedding, and storage layers
Deliverable (end of Phase 1): Notes are fetched, chunked, embedded, and stored in sqlite-vec — verified end-to-end on a test collection.
Week 5–6
- Full ingestion pipeline (all notes → chunks → embeddings → index)
- Incremental sync via onNoteChange listener
Week 7–8
- Retrieval module: vector search + chunk deduplication + source mapping
- LLM provider abstraction (OpenAI and Anthropic); prompt construction with citations
- Streaming response support
- Integration tests with controlled fixture notes
Midterm Evaluation: July 10, 2026
Deliverable (end of Phase 2): Full RAG pipeline working end-to-end. User can query notes and receive grounded, cited answers.
Week 9–10
- React chat UI webview: message thread, source citation links, notebook filter
- IPC wiring between webview and plugin for queries and streaming tokens
Week 11
- Indexing progress indicator and settings panel
- Dark mode and Joplin theme integration
- Keyboard navigation and screen reader accessibility
- Ollama support for fully offline operation
Deliverable (end of Phase 3): Polished, accessible chat interface with multi-provider support.
Week 12–13
- Performance profiling and optimization on large collections (10K+ notes)
- Cross-platform testing (Windows, macOS, Linux)
- Edge case handling (empty notes, very long notes, image-only notes)
Week 14
- End-user documentation: setup guide, provider configuration, privacy FAQ
- Code cleanup, final mentor review iterations
- Prepare plugin for submission to Joplin plugin repository
Final Evaluation: August 24, 2026
5. Deliverables
- A Joplin desktop plugin enabling natural-language chat over the user's note collection
- Local-first RAG pipeline with hybrid chunking, local embeddings, and sqlite-vec vector storage
- Incremental sync that re-indexes only changed notes
- Multi-provider LLM support (OpenAI, Anthropic, Ollama) behind a unified interface
- React chat UI with streaming responses, source citations, and notebook filtering
- Unit and integration test suite
- End-user documentation and privacy guide
6. Availability
Weekly availability: 25–30 hours per week dedicated to GSoC. I will shift my employment hours to evenings and weekends during the coding period, keeping morning blocks (9 am–1 pm Eastern, Monday–Friday) reserved exclusively for GSoC work.
Time zone: US Eastern (UTC-4)
Other commitments: I work remotely full-time as a Systems & Automation Analyst at FHSU. I have no academic commitments (graduated December 2025). I have discussed the GSoC timeline with my manager and have a flexible scheduling arrangement in place. I will communicate proactively with mentors if any conflicts arise.