Hi @laurent,
Thanks for the heads up! It looks like the system won't let me edit my original post at the top of the thread anymore. I've also completely revamped the architecture to a 100 percent offline model to eliminate the need for any cloud API keys and ensure strict privacy.
LINKS

- Link to the project idea: gsoc/ideas.md at master · joplin/gsoc · GitHub
- GitHub profile: chimzyfire-ship-it
- Forum introduction post: Welcome to GSoC 2026 with Joplin! - #19 by chimzyfire-ship-it
- Pull requests you have submitted to Joplin:
  - Merged: Desktop: Fix accessibility issue with Mermaid charts (#14617)
  - Open: Desktop: Fixes #13119: Throw error for joplin.views.panels.isActive (#14735) (currently discussing the JSDoc deprecation approach with a maintainer)
1. Introduction

- Background / studies: Computer Science student at Benson Idahosa University (Nigeria) with a strong focus on software engineering, modern web architectures, and the React ecosystem.
- Programming experience: Solid experience with JavaScript, Node.js, React Native, React.js, CSS, and API integrations.
- Experience with open source: Active contributor to the Joplin desktop application. I have navigated the frontend stack to merge UI fixes and am currently addressing state management bugs within the `joplin.views.panels` API, giving me direct experience with the infrastructure required for this project.
2. Project Summary

- What problem it solves: Currently, users cannot query their notes using natural language without sending personal data to a third-party cloud API (like OpenAI), which inherently violates Joplin's privacy-first, offline-capable philosophy.
- What will be implemented: A 100% offline, local-first Retrieval-Augmented Generation (RAG) architecture. It uses local embedding models, a local vector database, and a hardware-aware "Silent Switch" inference engine to generate answers entirely on-device.
- Expected outcome: A seamless, highly responsive chat panel where users can ask questions and receive synthesized answers backed by citations to their local notes, with zero configuration and zero data leaving their machine.
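To make the retrieval half of the RAG pipeline concrete, here is a minimal sketch. The names (`Chunk`, `cosineSimilarity`, `retrieveTopK`) are illustrative, not part of any existing API, and a real implementation would delegate ranking to the vector store rather than scanning embeddings in memory:

```typescript
// Sketch: rank note chunks against a query embedding and keep the best k.
// Embeddings would come from a local model in practice; here they are
// plain number arrays for illustration.

interface Chunk {
  noteId: string;    // the Joplin note the chunk came from (used for citations)
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function retrieveTopK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort((x, y) =>
      cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k);
}
```

The retrieved chunks would then be passed to the local LLM as context, with each chunk's `noteId` surfaced as a citation in the chat panel.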
3. Technical Approach

- Architecture or components involved: The system relies on a Background Document Processor for chunking Markdown, a Local Vector Store for embeddings, a Hardware-Aware Inference Engine for generation, and a React UI Chat Panel.
- Changes to the Joplin codebase: Implementation will be structured as a Joplin plugin built on the `joplin.views.panels` API, with heavy asynchronous IPC (Inter-Process Communication) to ensure LLM inference never blocks the main Electron UI thread.
- Libraries or technologies you plan to use: LanceDB (or SQLite VSS) for local, serverless vector storage, and Transformers.js (or a WASM-ported llama.cpp) for on-device embedding and LLM inference entirely within the Node.js environment.
- Potential challenges: Running LLMs locally risks excluding users on older hardware. Instead of relying on external cloud API keys (BYOK), which introduce UX friction and privacy risks, I will implement a Two-Tier "Silent Switch". On initialization, the system will profile the hardware (RAM/GPU). High-end devices will silently load a highly capable quantized model (e.g., Llama-3-8B), while older devices will gracefully degrade to a highly optimized, sub-2-billion-parameter micro-model (e.g., Qwen-1.5B/TinyLlama), ensuring 100% offline functionality for all users.
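A rough sketch of the Silent Switch tier selection, assuming RAM is the deciding signal (the threshold value and function names here are placeholders; real cut-offs would come from benchmarking, and GPU detection would be layered on top):

```typescript
import * as os from "os";

type ModelTier = "tier1" | "tier2";

// Assumed threshold: enough headroom to keep a quantized ~8B model resident.
const TIER1_MIN_RAM_GB = 12;

// Tier 1: capable quantized model (e.g. Llama-3-8B) on high-end machines.
// Tier 2: sub-2B micro-model (e.g. Qwen-1.5B) everywhere else.
function pickTier(totalRamGb: number): ModelTier {
  return totalRamGb >= TIER1_MIN_RAM_GB ? "tier1" : "tier2";
}

// Profile the current machine once at plugin start-up.
function profileHardware(): ModelTier {
  const totalRamGb = os.totalmem() / 1024 ** 3;
  return pickTier(totalRamGb);
}
```

Because the switch is decided once at start-up and never surfaced as a setting, users on any hardware get a working offline assistant with zero configuration.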
4. Implementation Plan

- Weeks 1–2: Analyze Markdown structures to build intelligent document processing/chunking logic and prototype local embedding generation.
- Weeks 3–5: Integrate the local vector store (LanceDB) and implement the background indexing workflow to populate embeddings seamlessly.
- Weeks 6–7: Integrate the local inference engine and build the hardware-profiling logic to switch smoothly between Tier 1 (GPU) and Tier 2 (CPU/micro) models without user intervention.
- Weeks 8–9: Design and implement the React UI chat panel using `joplin.views.panels`, establishing asynchronous IPC messaging between the UI and the generation engine.
- Weeks 10–12: Optimize the memory footprint for low-end devices, conduct extensive integration testing on large note collections, and finalize user/developer documentation.
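For the Weeks 1–2 chunking work, one plausible starting point is heading-aware splitting, so each chunk stays semantically coherent and retains its section context. This is a simplified sketch (names are hypothetical; the real processor would also cap chunk sizes and handle code blocks, tables, and front matter):

```typescript
// Sketch: split a Markdown note on ATX headings, keeping each chunk's
// nearest enclosing heading so it can be embedded alongside the text.

interface MarkdownChunk {
  heading: string; // nearest enclosing heading ("" for the preamble)
  text: string;
}

function chunkByHeadings(markdown: string): MarkdownChunk[] {
  const chunks: MarkdownChunk[] = [];
  let heading = "";
  let buffer: string[] = [];

  const flush = () => {
    const text = buffer.join("\n").trim();
    if (text.length > 0) chunks.push({ heading, text });
    buffer = [];
  };

  for (const line of markdown.split("\n")) {
    const match = /^(#{1,6})\s+(.*)$/.exec(line); // ATX heading, e.g. "## Title"
    if (match) {
      flush();          // close the previous section's chunk
      heading = match[2];
    } else {
      buffer.push(line);
    }
  }
  flush();
  return chunks;
}
```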
5. Deliverables

- Implemented features: A background note indexer and a 100% offline, hardware-aware AI chat panel integrated into the desktop application.
- Tests: A robust suite of unit and integration tests covering Markdown chunking, retrieval accuracy, and graceful model degradation on restricted hardware.
- Documentation: Clear developer guides on the IPC/RAG architecture and user-facing documentation highlighting the zero-config privacy features.
6. Availability

- Weekly availability during GSoC: 40 hours per week.
- Time zone: West Africa Time (WAT) / UTC+1 (Lagos, Nigeria).
- Any other commitments during the programme: None. I have no summer classes or other employment; GSoC with Joplin will be my sole full-time focus.