Links
- Project idea: gsoc/ideas.md at master · joplin/gsoc · GitHub
- GitHub profile: Payel-Manna · GitHub
- Forum introduction: Welcome to GSoC 2026 with Joplin! - #52 by Payel-Manna
- Pull request: #14779 — Desktop: Add search field to Config screen to filter settings (open, all CI checks passing)
1. Introduction
Background / Studies
I am a Computer Science student at Scaler School Of Technology, Bengaluru, India. My academic background includes operating systems, computer networking, classical machine learning, and databases (MySQL, PostgreSQL).
Programming Experience
I primarily work with TypeScript, JavaScript, React, Node.js, Python, and Java.
Some relevant projects:
- EduCompanion — Full-stack MERN application with API design and database integration
- LearnFlow — React-based frontend demonstrating component architecture and UI design
- TrueTribe — Python/Flask backend system
I have also completed a Coursera certification in Agentic AI, covering:
- LLM orchestration
- Memory systems
- RAG pipelines
Open source experience: My Joplin contribution is PR #14779, which adds a real-time search field to the desktop Config screen. This gave me direct experience with Joplin's React component architecture, config-shared.js, the SearchInput API, ESLint conventions, the CI pipeline (12 checks across Ubuntu, macOS, Windows), and the Playwright integration test suite. I also have two merged Hacktoberfest PRs. Before writing this proposal I studied the Jarvis plugin source to understand panel IPC and the constraints of the plugin sandbox. Having already contributed to Joplin and studied its plugin architecture, I can start implementation immediately without a ramp-up phase.
2. Project Summary
The Problem
Joplin users often build large, carefully curated knowledge bases containing hundreds or even thousands of notes. However, the current keyword-based search is limited — it cannot handle semantic queries such as:
“Summarise my notes on distributed systems”
“What did I write about async patterns last month?”
It also cannot surface connections across multiple notes. As a result, users must manually browse and recall where information is stored — a process that becomes increasingly inefficient as the collection grows.
What Will Be Built
I propose to build ARIA (Adaptive Retrieval and Intelligence Assistant) — a Joplin plugin that enables users to interact with their notes through a ChatGPT-style sidebar interface.
With ARIA:
- The user asks a natural-language question
- The system retrieves the relevant notes
- The system synthesises grounded answers across multiple notes, with source citations
The system supports multi-turn conversations, allowing users to ask follow-up questions while the assistant continuously improves based on prior interactions.
What Makes This Different from Basic RAG
ARIA goes beyond a simple retrieve → generate pipeline by introducing three key innovations:
- Persistent Context Memory: remembers past sessions, learns user preferences, and builds a topic graph over time
- Multi-Stage Retrieval Pipeline: combines BM25 keyword search with semantic retrieval, followed by context-aware re-ranking
- Cross-Note Synthesis: identifies overlapping topics across notes and structures responses accordingly
Privacy Model
ARIA is designed with a privacy-first architecture:
- All embeddings and indexing are performed locally on-device
- Ollama enables fully offline operation
- OpenAI-compatible APIs are opt-in only
- Only relevant note excerpts are sent — never the full collection
- All memory (conversation, user preferences, topic graph) is stored in local SQLite
- No data is synced to Joplin Cloud
Why a Plugin
A plugin has direct access to the joplin.data API without requiring the user to run a separate server or configure OAuth. An external application would require either a local API server or Joplin's Web Clipper service to be running, adding friction and a persistent background process.
The existing Jarvis plugin demonstrates that advanced local AI features are feasible within Joplin’s plugin environment. ARIA builds on this foundation with a more structured and scalable RAG + memory architecture.
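To illustrate, here is a minimal sketch of how the plugin can read notes directly through the data API. The pagination pattern follows the documented joplin.data.get interface; the helper function itself is illustrative:

```typescript
import joplin from 'api';

interface NoteStub {
	id: string;
	title: string;
	body: string;
	updated_time: number;
}

// Fetch notes page by page so RAM stays bounded on large collections.
// joplin.data.get returns { items, has_more } for paginated endpoints.
export async function fetchAllNotes(): Promise<NoteStub[]> {
	const notes: NoteStub[] = [];
	let page = 1;
	let hasMore = true;
	while (hasMore) {
		const response = await joplin.data.get(['notes'], {
			fields: ['id', 'title', 'body', 'updated_time'],
			limit: 50,
			page,
		});
		notes.push(...response.items);
		hasMore = response.has_more;
		page += 1;
	}
	return notes;
}
```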
3. Technical Approach
Architecture Overview
The system is organised into five layers:
- Ingestion Layer — Fetches and indexes notes locally from Joplin
- Storage Layer — Stores embeddings and memory in SQLite
- Query Layer — Retrieves relevant chunks and interacts with the LLM
- Memory Layer — Enriches each query with session context and learned signals
- UI Layer — Streams responses to a React-based sidebar panel
Embedding Model
The system uses BGE-small-en-v1.5 (24 MB, MTEB retrieval ~51.7) via ONNX runtime.
- Chosen over BGE-large (335 MB, ~54.3) because a ~3 point quality gain does not justify a ~14× increase in size
- Runs entirely on-device — no external server required
- Cloud embedding via `text-embedding-3-small` is supported as opt-in
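As a sketch, BGE-small could be driven through transformers.js, which executes the model via the ONNX runtime on-device; the exact loader is an implementation detail to settle during the bonding period:

```typescript
import { pipeline } from '@xenova/transformers';

// Lazily initialise the feature-extraction pipeline once; the model file is
// cached locally, so subsequent runs are fully offline.
const embedderPromise = pipeline('feature-extraction', 'Xenova/bge-small-en-v1.5');

// Returns a mean-pooled, L2-normalised 384-dimensional embedding.
export async function embed(text: string): Promise<Float32Array> {
	const embedder = await embedderPromise;
	const output = await embedder(text, { pooling: 'mean', normalize: true });
	return output.data as Float32Array;
}
```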
Chunking Strategy
- Notes are first split at H1 / H2 / H3 heading boundaries
- Each section is further divided using a 400-token sliding window with 50-token overlap
- Sections under 100 tokens are merged with adjacent sections
This hybrid strategy works well for both:
- Structured notes (with headings)
- Unstructured notes (free-form text)
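To make the strategy concrete, here is a minimal sketch of the two-stage chunker. Token counts are approximated by word counts; the real implementation would use the model tokenizer, and merging of sub-100-token sections is omitted for brevity:

```typescript
interface Section { heading: string; text: string }
interface Chunk { heading: string; text: string }

// Stage 1: split the note body at H1/H2/H3 heading boundaries.
function splitAtHeadings(body: string): Section[] {
	const sections: Section[] = [];
	let heading = '';
	let buffer: string[] = [];
	for (const line of body.split('\n')) {
		if (/^#{1,3}\s/.test(line)) {
			if (buffer.length) sections.push({ heading, text: buffer.join('\n') });
			heading = line.replace(/^#{1,3}\s+/, '');
			buffer = [];
		} else {
			buffer.push(line);
		}
	}
	if (buffer.length) sections.push({ heading, text: buffer.join('\n') });
	return sections;
}

// Stage 2: slide a 400-token window over each section, advancing by
// 350 tokens so consecutive chunks share a 50-token overlap.
function windowSection(section: Section): Chunk[] {
	const words = section.text.split(/\s+/).filter(Boolean);
	if (words.length === 0) return [];
	const chunks: Chunk[] = [];
	for (let start = 0; ; start += 350) {
		chunks.push({ heading: section.heading, text: words.slice(start, start + 400).join(' ') });
		if (start + 400 >= words.length) break;
	}
	return chunks;
}

export const chunkNote = (body: string): Chunk[] =>
	splitAtHeadings(body).flatMap(windowSection);
```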
Storage Layer
The system uses sqlite-vec for vector similarity search.
- Fully compatible with existing SQLite usage
- No external services or processes required
Tables:
- `embeddings`: vector representations of chunks
- `chunks`: note text with metadata
- `index_state`: incremental sync tracking via content hashes
- `memory_conversation`, `memory_user`, `memory_topic_graph`: persistent memory stores
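For concreteness, a schema sketch follows. Column sets are illustrative and will be refined during implementation; the embeddings table uses sqlite-vec's vec0 virtual-table syntax:

```typescript
// Illustrative schema; memory_user and memory_topic_graph follow the same
// pattern and are omitted here for brevity.
const schema = `
	CREATE TABLE IF NOT EXISTS chunks (
		id           INTEGER PRIMARY KEY,
		note_id      TEXT NOT NULL,
		heading_path TEXT,
		text         TEXT NOT NULL
	);
	CREATE TABLE IF NOT EXISTS index_state (
		note_id      TEXT PRIMARY KEY,
		content_hash TEXT NOT NULL,  -- SHA-256 of the note body
		indexed_at   INTEGER NOT NULL
	);
	CREATE TABLE IF NOT EXISTS memory_conversation (
		id         INTEGER PRIMARY KEY,
		session_id TEXT NOT NULL,
		question   TEXT NOT NULL,
		answer     TEXT NOT NULL,
		key_facts  TEXT,             -- JSON array of extracted facts
		created_at INTEGER NOT NULL
	);
`;

// sqlite-vec virtual table: one fixed-dimension vector per chunk row.
const vectorSchema = `
	CREATE VIRTUAL TABLE IF NOT EXISTS embeddings USING vec0(
		embedding FLOAT[384]
	);
`;
```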
Native Dependency Risk
sqlite-vec is a native C++ extension and must be validated against Electron’s ABI during community bonding.
- If incompatible, fall back to vectra (a pure-JavaScript, file-backed JSON vector store)
- `hnswlib-node` is not used as a fallback (also native C++, so it carries the same ABI risk)
Even in the worst case where ONNX or vector search is unavailable, the system falls back to BM25-based retrieval with LLM summarisation, ensuring a functional feature is always delivered.
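The fallback decision can be made once at startup. A sketch, assuming better-sqlite3 as the database driver (the choice of store is then threaded through the rest of the storage layer):

```typescript
import Database from 'better-sqlite3';

// Backend chosen at startup: native sqlite-vec when the Electron ABI
// accepts it, otherwise the pure-JS vectra index. (Illustrative sketch.)
type VectorBackend =
	| { kind: 'sqlite-vec'; db: Database.Database }
	| { kind: 'vectra'; index: import('vectra').LocalIndex };

export async function selectVectorBackend(db: Database.Database): Promise<VectorBackend> {
	try {
		const sqliteVec = await import('sqlite-vec');
		sqliteVec.load(db); // registers vec0 and the vector distance functions
		return { kind: 'sqlite-vec', db };
	} catch (error) {
		console.warn('sqlite-vec failed to load; falling back to vectra:', error);
		const { LocalIndex } = await import('vectra');
		const index = new LocalIndex('./aria-vectors');
		if (!(await index.isIndexCreated())) await index.createIndex();
		return { kind: 'vectra', index };
	}
}
```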
RAG Query Flow
1. Load memory context (conversation history, user knowledge, topic graph)
2. Embed the user query via BGE-small ONNX into a 384-dimensional vector
3. Perform hybrid retrieval: BM25 keyword search + semantic vector search → union of ~20 candidates
4. Apply context-aware re-ranking using recency, note importance, prior interactions, and memory signals → select the top-5 diverse chunks
5. Perform cross-note synthesis to detect shared topics and structure the context
6. Stream the LLM response → extract key facts → update the memory stores
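A sketch of steps 3–4 follows. The helpers (bm25Search, vectorSearch, recency, memoryBoost) are hypothetical, the blend weights are illustrative, and both retrievers are assumed to return scores normalised to [0, 1]:

```typescript
interface Scored { chunkId: number; score: number }

// Hypothetical retrieval helpers, each returning normalised scores.
declare function bm25Search(query: string, k: number): Scored[];
declare function vectorSearch(vector: Float32Array, k: number): Promise<Scored[]>;
declare function recency(chunkId: number): number;      // 0..1, newer = higher
declare function memoryBoost(chunkId: number): number;  // 0..1, from memory stores

export async function retrieve(query: string, queryVector: Float32Array): Promise<Scored[]> {
	// Step 3: hybrid retrieval, union of ~20 candidates keyed by chunk id.
	const candidates = new Map<number, Scored>();
	for (const hit of [...bm25Search(query, 20), ...await vectorSearch(queryVector, 20)]) {
		const prev = candidates.get(hit.chunkId);
		if (!prev || hit.score > prev.score) candidates.set(hit.chunkId, hit);
	}

	// Step 4: context-aware re-ranking, blending retrieval score with
	// recency and memory signals (weights are illustrative), then keep top-5.
	return [...candidates.values()]
		.map(c => ({
			chunkId: c.chunkId,
			score: 0.6 * c.score + 0.2 * recency(c.chunkId) + 0.2 * memoryBoost(c.chunkId),
		}))
		.sort((a, b) => b.score - a.score)
		.slice(0, 5);
}
```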
Incremental Delivery Strategy
The system is designed to be delivered in stages to ensure a functional result at all times.
- By midterm, a complete hybrid RAG pipeline (BM25 + semantic retrieval + answer generation) will be fully operational
- Memory and synthesis layers are added incrementally on top of this stable core
- If advanced components are delayed, the system gracefully degrades to a fully working retrieval + summarisation pipeline
This ensures that a usable and valuable feature is always delivered regardless of complexity trade-offs.
Memory Layer
Three SQLite-backed memory stores:
- `memory_conversation`: stores past Q&A and extracted key facts for multi-turn continuity
- `memory_user`: stores inferred expertise and preferences, e.g. "topic:os_scheduling" → "advanced"
- `memory_topic_graph`: stores relationships between topics discovered across sessions
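Illustrative shapes for the three stores (field names are assumptions to be refined during implementation):

```typescript
interface ConversationTurn {
	sessionId: string;
	question: string;
	answer: string;
	keyFacts: string[];   // extracted after each answer, reused in later turns
	createdAt: number;
}

interface UserSignal {
	key: string;          // e.g. "topic:os_scheduling"
	value: string;        // e.g. "advanced"
	updatedAt: number;
}

interface TopicEdge {
	topicA: string;
	topicB: string;
	weight: number;       // reinforced when both topics co-occur in a session
}
```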
All memory is:
- Stored locally
- User-clearable
- Never sent to external APIs
Performance Architecture
Targets are based on preliminary benchmarking of BGE-small ONNX during prototype validation. Actual performance may vary with hardware; these targets represent expected performance on mid-range systems.
- Indexing runs in a Node.js worker thread → zero UI blocking
- ONNX batch inference (32 chunks per pass) → ~20× faster than sequential
- Incremental sync uses SHA-256 hashing → unchanged notes skipped
- 2-second debounce on `note.onChange` events
- Notes fetched in pages of 50 → stable RAM usage
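For example, the debounce could sit directly on Joplin's note-change event. joplin.workspace.onNoteChange is the documented hook; reindexNote is a hypothetical entry point into the worker thread:

```typescript
import joplin from 'api';

declare function reindexNote(noteId: string): Promise<void>; // runs in the worker thread

// Collapse bursts of edits into at most one re-index per note every 2 s.
const pending = new Map<string, ReturnType<typeof setTimeout>>();

export async function watchNoteChanges(): Promise<void> {
	await joplin.workspace.onNoteChange(async (event) => {
		const timer = pending.get(event.id);
		if (timer) clearTimeout(timer);
		pending.set(event.id, setTimeout(() => {
			pending.delete(event.id);
			void reindexNote(event.id);
		}, 2000));
	});
}
```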
Performance Targets
| Metric | Target |
|---|---|
| First-run index (1000 notes) | ~40 seconds |
| Incremental update per note | < 1 second |
| RAM usage (idle) | < 150 MB |
| Retrieval latency | < 200 ms |
| UI thread blocking | Zero |
Key Challenges and Mitigations
| Challenge | Mitigation |
|---|---|
| sqlite-vec ABI mismatch in Electron | Validate in bonding period; fallback to vectra (pure JS) |
| ONNX runtime in plugin sandbox | Validate early; fallback to Ollama nomic-embed-text |
| Context window overflow | Token budget — trim oldest turns, then lowest-ranked chunks |
| Ollama not installed | Auto-detect on launch; onboarding guidance |
| Topic extraction quality | Use heading paths (human-curated) + stop-word filtering |
| Incorrect cross-note associations | Conservative thresholds + hedged prompt language |
The primary success criterion is a robust and usable RAG pipeline integrated into Joplin.
All advanced features (memory, synthesis, re-ranking improvements) are layered enhancements and will not compromise core delivery.
4. Implementation Plan
Each phase ends with a working, testable feature to ensure continuous integration and feedback.
Community bonding (May 1–26): Validate sqlite-vec and the ONNX runtime inside Electron. Build a minimal working prototype (index 20 notes, retrieve the top 3, confirm the vectra fallback) and share it in a public GitHub repository. Align with mentors on scope and priorities.
Week 1–2 (May 27–Jun 9) — Foundation: Plugin scaffold, TypeScript build, settings page, SQLite database with all six tables, basic React chat panel with IPC. Deliverable: Plugin installs, database initialises, notes fetch correctly.
Week 3–4 (Jun 10–Jun 23) — Ingestion pipeline: Worker thread indexer, SHA-256 incremental sync, 2-second debounce on note.onChange, heading-first chunking, BGE-small batch ONNX embedding, WAL bulk writes, progress bar in panel. Deliverable: 1000 notes indexed in ~40 seconds, incremental updates under 1 second.
Week 5–6 (Jun 24–Jul 7) — RAG query engine: BM25 index, semantic vector search, hybrid union, basic re-ranking, grounding prompt, OpenAI and Ollama providers with streaming, source citation cards with openNote. Deliverable: End-to-end RAG working, user receives streamed answer with clickable citations.
Week 7–8 (Jul 8–Jul 21) — Memory layer: Conversation memory store, key fact extraction, user knowledge profiling, topic graph construction, memory-enriched prompt assembly, token budget management, memory settings panel. Deliverable: ARIA remembers sessions, adapts to user. Midterm target.
Week 9–10 (Jul 22–Aug 4) — Synthesis, filtering, resilience: Cross-note synthesiser, full re-ranking with all four signals, notebook and tag filtering, note.onDelete handler, full error handling matrix, Joplin theme integration. Deliverable: Second brain synthesis, notebook filtering, all failure cases handled gracefully.
Week 11–12 (Aug 5–Aug 19) — Testing, performance, documentation: Full unit and integration test suite, cross-platform testing (Windows/macOS/Linux), performance benchmarks logged and committed, four user guides (installation, Ollama, OpenAI, privacy FAQ), developer guide, code cleanup, release tag v0.1.0-gsoc. Deliverable: All performance targets met, full test suite passing, plugin documented and released.
The system is designed incrementally — a fully functional hybrid RAG pipeline will be completed by midterm, with memory and synthesis layers added progressively if time permits.
Weekly progress updates and incremental demos will be shared with mentors to ensure alignment and early feedback.
Timeline Summary
| Period | Dates | Key Deliverable |
|---|---|---|
| Community Bonding | May 1 – May 26 | Prototype on GitHub |
| Week 1–2 | May 27 – Jun 9 | Plugin foundation |
| Week 3–4 | Jun 10 – Jun 23 | Ingestion pipeline |
| Week 5–6 | Jun 24 – Jul 7 | RAG query engine |
| Week 7–8 | Jul 8 – Jul 21 | Memory layer (midterm) |
| Week 9–10 | Jul 22 – Aug 4 | Synthesis + resilience |
| Week 11–12 | Aug 5 – Aug 19 | Testing + documentation + release |
5. Deliverables
Core (Guaranteed by Midterm)
- End-to-end RAG pipeline:
- Note ingestion and chunking
- Hybrid retrieval (BM25 + semantic search)
- Context-aware re-ranking
- Streaming LLM responses with source citations
- Local-first embedding using BGE-small via ONNX (no external dependency required)
- Incremental sync using SHA-256 hashing with event-driven updates
- Support for:
- Ollama (fully offline)
- OpenAI-compatible APIs (opt-in)
- Notebook and tag-based filtering
- Responsive React sidebar chat panel integrated into Joplin
- Graceful fallback mechanisms (BM25-only retrieval if embeddings unavailable)
Extended (Planned, Delivered After Core Stabilisation)
- Persistent context memory:
- Conversation history
- User knowledge signals
- Topic graph
- Cross-note synthesis highlighting relationships across notes
- Advanced re-ranking signals (interaction history, memory-aware boosting)
Quality
- Unit and integration tests covering all core pipeline components
- Cross-platform validation (Windows, macOS, Linux)
- Performance benchmarks recorded and documented
- User documentation:
- Installation guide
- Ollama setup
- OpenAI setup
- Privacy and data handling FAQ
- Clean, production-quality code:
- No `any` types
- Full JSDoc coverage
- Zero ESLint warnings
6. Availability
- Weekly hours: 40+ hours per week throughout GSoC
- Time zone: UTC+5:30 (IST, Bengaluru) — available for overlap with UTC and CET mentor windows
- Exam overlap: Possible conflict with early community bonding (late April–mid May). Minimum 20 hours/week during that period, full time from May 27
- Communication: Weekly Monday forum update (completed / planned / blockers), daily commits on public fork, available for mentor calls UTC 06:00–14:00 weekdays