Google Summer of Code — Draft Proposal
Idea 1: AI-supported search for notes
Author: Ashutosh Singh
Google Docs link (proposal): Google Summer of Code Draft
Prototype
GitHub repo: Ashutoshx7/ai-search-for-notes-plugin-Joplin
Pull Requests Submitted to Joplin
| PR # | Title | Category | Status |
|---|---|---|---|
| #14684 | Fix/ctrl wheel zoom | Bug Fix | Open |
| #14582 | Desktop: Fixes #9673: Frontmatter export — Include notebook icon | Export/Import | Merged |
| #14664 | Desktop: Fixes #14659: Reuse master password dialog (E2EE) | Bug Fix | Merged |
| #14653 | All: Resolves #12037: Remove JSDOM from Turndown renderer | Refactor | Merged |
| #14577 | Fixes #12793: Prevent failing plugin from blocking others | Bug Fix | Merged |
| #14549 | Desktop: Fixes #14079: Improve updater error on rate limit | Bug Fix | Merged |
| #14527 | Desktop: Fixes #14500: Fix zh_TW locale detection on start | Bug Fix | Merged |
| #14443 | Desktop: Resolves #9336: Add editor & sync target to About | Enhancement | Merged |
| #14398 | Add waving hand emoji to welcome notebook | Database | Merged |
| #14636 | Desktop: Show splash screen during app startup | Feature | Closed |
| #14496 | Desktop: Add Favorites & Note List Preview as default plugins | Feature | Closed |
AI Disclosure & How AI Was Used
This project involved a collaborative process between my own thinking and AI assistance.
I began by writing down initial points and ideas based on my own research and planning. From there, I used AI to help build a prototype, using prompts to progressively reach each milestone. As Jonas (creator of Debian Pure Blends) advised me: no matter how much research or planning you do, there are always things you won't anticipate until you're actually in the middle of building. That's not a failure; that's the nature of the process.
As I built the prototype, new problems and solutions surfaced that I hadn't originally considered. I documented these as I went, and by the end of the prototyping phase, I had a much richer and more grounded set of points than I started with, informed by real experience rather than just upfront planning.
AI was then used to help me articulate and write up these learnings clearly. The ideas, discoveries, and direction remained my own; AI served as a tool to help express them effectively.
Just as a car doesn't choose the destination (the driver does), AI here was simply the vehicle that helped me get there faster. It accelerated my workflow, but the thinking, the decisions, and the vision behind this project were entirely mine.
Introduction
My name is Ashutosh Singh and I am currently pursuing a Bachelor's degree in Computer Science at the Indian Institute of Information Technology (IIIT) Lucknow. I secured admission through the Joint Entrance Examination (JEE Main), one of the most competitive engineering entrance examinations in India, achieving an All India Rank of approximately 8,500 among more than 800,000 candidates.
More about me can be found at the end of this document.
Why Joplin
Every tool I use daily, from my browser (Zen) to my IDE (Zed), is open source. That's not a coincidence; it's a deliberate choice. Better note-taking and scheduling have always been priorities for me, and I've tried plenty of options: Notion, Sunsama, Todoist. They all worked, but none of them were open source, and none of them ever will be.
When I came across Joplin in September 2025, I downloaded the mobile app immediately. What struck me first was something simple: it runs everywhere. But the more I used it, the more I appreciated what it actually stood for: your notes, your machine, your data.
Open Source & Development Experience
Extralit — Open Source Contribution (v0.4.0 Release)
- Contributed to the official v0.4.0 release of Extralit — credited as a key contributor alongside the project maintainer.
- Co-authored PR #57 — a comprehensive overhaul of the Extralit CLI, migrating the entire command structure from Argilla V1 to V2 using Typer, with modular command modules for datasets, users, workspaces, schemas, files, and documents.
- Implemented full CRUD support for workspace schemas including Pandera-based serialization, versioning, and dataset sharing via CLI and Python API.
Tech Stack: Python, Typer, Argilla V2, Pandera
v0.4.0 Release PR #57
Vercel Open Source Program: Vengeance UI
- Selected for the Vercel Open Source Program, Winter 2026 cohort
- Engineered reusable React + TypeScript components and an MDX-based documentation platform with interactive previews.
- Scaled to 15,000+ monthly users and grew the project to 600+ GitHub stars, with external community contributions (37 Forks).
- Backed by Vercel's Open Source Program, recognizing the project's impact and community adoption.
Tech Stack: TypeScript, Next.js, Tailwind CSS, Framer Motion, Model Context Protocol
KDE
- Built QML/JavaScript-based dataset editors for multiple GCompris activities, enabling creation and validation of fixed and randomized datasets.
- Refactored legacy dataset formats into a unified, extensible schema, reducing parsing complexity and long-term maintenance overhead.
- Designed reusable QML UI components and implemented editor-level validation.
Tech Stack: QML, JavaScript
Industry & Research Experience
C4GT/DMP 2025 — Beckn (May 2025 – Aug 2025)
- Unified vector databases for 100K+ embeddings, enabling sub-150ms semantic search latency.
- Improved query accuracy by 70% through ranking optimization and intent recognition.
- Built an ETL pipeline processing 10K+ records/hour with 85% noise reduction.
- Developed an AI platform to track 100+ Indian Constitution amendments using NLP-based summarization.
Tech Stack: Python, NLP, Vector Databases, ETL
SuperKalam (YC 23) (September 2025 – December 2025)
- Improved retrieval and semantic search quality by refining ElasticSearch indexing, embedding generation, and Qdrant schemas, resulting in a 20% lift in search relevance metrics.
- Reduced inference costs by 35% through systematic model migration from OpenAI to Vertex AI Gemini.
- Built and maintained LLM evaluation systems to benchmark response quality, grounding accuracy, latency, and cost tradeoffs.
Tech Stack: Python, TypeScript, ElasticSearch, Qdrant, Vertex AI, OpenAI
Project Summary
The Problem
When I started researching this project, I typed "how to restart postgres" into Joplin's search bar, expecting it to find my note titled "PostgreSQL service management." It didn't. Zero results. The note doesn't contain the word "restart"; it says "systemctl start" and "service recovery." FTS4 searches for exact tokens, and there was no lexical overlap between my query and my note.
I then built a fully working prototype and tested it on 503 real notes. Searching "chicken recipe" instantly returns recipe notes ranked at 97% relevance with keyword highlighting, even though the notes have titles like "Recipe - Pasta Carbonara" and the word "chicken" only appears in the ingredient list. Searching "how much did I spend" surfaces budget review notes from a year ago. This is the experience Joplin users deserve.
The idea page gives three examples that show this exact problem:
- "It's a note about a meeting with a company from Germany in 2020 or maybe 2019"
- "I need the list of tasks I wrote for the website redesign"
- "Look for the note with the poetry lines about the moon"
None of these contain actionable keywords for FTS4. All of them are about meaning, not exact words. This is what I will fix.
What I'll Build
A desktop Joplin plugin that adds an AI-supported search mode using local embedding models and hybrid retrieval. The plugin supplements Joplin's existing search; it does not replace it.
The user opens a sidebar, types a natural-language query, and receives ranked result cards. Each card shows the note title, the best-matching passage with highlighted terms, and the heading path within the note. Clicking a card navigates to that note. All inference runs locally by default.
How This Differs from Jarvis
I studied Jarvis carefully before starting this project. Jarvis is an excellent AI assistant, but it solves a different problem:
| Dimension | Jarvis | This Project |
|---|---|---|
| Primary function | AI assistant (summarize, chat, complete) | Note retrieval (find the right note/section) |
| Retrieval | Embeddings only | Hybrid: FTS4 + vector, RRF fusion |
| Chunking | Whole-note or fixed-size | Structure-aware: heading-aligned |
| Default runtime | Requires API key | Local-first: no API key needed |
| Result UX | Inline assistant responses | Dedicated search sidebar with ranked cards |
| Scope | Broad (write, chat, summarize, search) | Narrow (search only, but done well) |
The key technical difference is hybrid retrieval. I experimented with pure semantic search and found it misses exact keywords that FTS4 handles perfectly. Someone searching for iptables expects their iptables note to appear; pure semantic search might rank a "firewall configuration" note higher. Combining both signals via Reciprocal Rank Fusion (RRF) consistently outperforms either approach alone.
Technical Approach
Why a Plugin (Not a Core Change)
This was the first design decision I had to make. I could either modify Joplin's existing search bar (SearchEngine.ts) or build a separate plugin with its own sidebar.
I chose a plugin because:
- joplin.views.panels provides full sidebar UI with HTML/CSS/JS
- JoplinData provides note access without auth tokens
- joplin.plugins.dataDir() gives persistent local storage for the vector database
- joplin.settings.registerSection() with secure: true handles API keys in the OS keychain
Zero core changes. The plugin can be installed or removed without affecting Joplin. If the approach doesn't work well, the user just uninstalls it.
Architecture
Chunking Strategy
Before I explain the chunking, I want to explain why it's necessary. BGE-small has a 512-token limit (roughly 380 words). If a note is 1500 tokens long, the model silently ignores everything after token 512. There is no error, no warning. For search, this means any important information in the second half of a long note is completely invisible.
At first I tried the simplest approach: embed the entire note as one piece. The problem was that long notes contain multiple topics. A note with sections on "backup configuration" and "monitoring alerts" produces one vector that represents neither topic cleanly. When the user searches for "backup", this note gets a mediocre score instead of a strong match on the backup section.
So I split on heading boundaries (# through ######). Each chunk gets the heading breadcrumb prepended to its text.
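To illustrate the breadcrumb idea (the function and field names here are my own sketch, not the prototype's actual API):

```typescript
// Build the text that gets embedded for one chunk: the heading
// breadcrumb is prepended so the model sees the chunk's context.
// Names are illustrative, not the prototype's exact code.
interface Chunk {
  breadcrumb: string[]; // e.g. ["Configuration", "Backup Schedule"]
  body: string;
}

function embeddableText(noteTitle: string, chunk: Chunk): string {
  const path = [noteTitle, ...chunk.breadcrumb].join(" > ");
  return `${path}\n\n${chunk.body}`;
}

// A chunk under "## Configuration" > "### Backup Schedule" in a note
// titled "PostgreSQL" embeds as:
// "PostgreSQL > Configuration > Backup Schedule\n\n<chunk body>"
```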
Why heading-aligned, not fixed-size:
Fixed 400-word chunks split mid-paragraph. A sentence about backup configuration might end up in a chunk that starts with monitoring code. The embedding captures neither topic. Heading-aligned chunks keep topics intact, and the breadcrumb prefix gives the embedding model context: "configuration" in the context of PostgreSQL embeds differently than "configuration" in the context of Nginx.
Parameters and why I chose them:
| Parameter | Value | Why |
|---|---|---|
| Max tokens per chunk | 350 | BGE-small's limit is 512. Title prefix averages ~30% of context. 350 prevents silent truncation. |
| Overlap | 50 words | Cross-section context without wasting tokens. Less → missed boundary concepts. More → too much duplication. |
| Min chunk size | 3 tokens | Filters empty headings and whitespace-only sections. |
| Content hash | djb2 (fast) | Change detection for incremental re-indexing. Not security — just "has this chunk changed?" |
Structural rules I implemented (from 12 core tests):
- Never split mid-code-block (tracks ``` open/close state)
- Preserves markdown table content as one unit
- Strips markdown syntax (#, *, **) but keeps the semantic text
- Handles all 6 heading levels with proper breadcrumb nesting
What the tests cover (12 passing in packages/lib/services/embedding/ChunkingEngine.test.ts):
✓ returns empty for empty note
✓ single chunk for short note
✓ splits at heading boundaries
✓ preserves heading hierarchy in breadcrumb
✓ strips markdown formatting
✓ content hash is deterministic
✓ filters very short chunks
✓ handles code blocks without splitting inside them
✓ includes notebook id
✓ handles multiple heading levels
✓ RRF fusion formula
✓ hybrid docs score higher than single-source
Embedding Model (What I Tried and Rejected)
Before settling on BGE-small, I evaluated several models and rejected most for concrete reasons:
all-MiniLM-L6-v2 (23 MB, 384-dim): My first instinct was to use it: small, fast, and widely used. But it has a 256-token maximum. Everything after token 256 is silently truncated. No warning, no error; it just ignores the text. Since my chunks are up to 350 tokens (heading prefix + body), MiniLM would lose roughly 30% of every chunk's content. I verified this by generating embeddings for a 400-word passage and comparing them against a 200-word passage; the embeddings were identical, confirming truncation.
BGE-large-en-v1.5 (1.3 GB, 1024-dim): Better quality +4 MTEB retrieval points over BGE-small. But 10× larger. A 1.3 GB model download on first run would deter most users, especially those on limited bandwidth. The quality gain did not justify the size increase.
nomic-embed-text (274 MB, 768-dim, via Ollama): Excellent model, but requires Ollama running as a separate process. Cannot be bundled as WASM. I made it an opt-in provider, not the default.
| Model | MTEB Retrieval ndcg@10 | Size | Token Limit | Verdict |
|---|---|---|---|---|
| all-MiniLM-L6-v2 | 41.95 | 23 MB | 256 | ✗ Silent truncation at 256 tokens |
| BGE-small-en-v1.5 | 51.68 | 127 MB | 512 | Selected |
| BGE-large-en-v1.5 | 54.29 | 1.3 GB | 512 | ✗ 10× too large for default install |
| nomic-embed-text | N/A (Ollama) | 274 MB | 8192 | Opt-in (requires Ollama) |
Why MTEB Retrieval score, not overall MTEB
The MTEB leaderboard averages 8 task types (retrieval, clustering, classification, STS, etc.). A model that is excellent at summarization but poor at retrieval can still rank highly overall. For this project — which is about finding notes by meaning — only the retrieval ndcg@10 score matters. I filtered MTEB specifically for retrieval tasks, not the overall leaderboard.
Embedding Providers (The Abstraction Layer)
After choosing the model, I needed to decide how to run inference. I built three providers behind a common interface:
| Provider | How it works | Speed | Privacy |
|---|---|---|---|
| Ollama (default) | HTTP POST to localhost:11434 | ~120 notes/sec | Local |
| Local WASM | Transformers.js in-process | ~47 notes/sec* | Fully offline |
| OpenAI | HTTPS API | ~200 notes/sec | ✗ Cloud |
Adding a new provider means implementing 4 methods. No pipeline changes required.
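A sketch of what that common interface could look like (the method names are my guess at the shape, not the prototype's exact API):

```typescript
// Hypothetical shape of the provider abstraction: each backend
// (Ollama HTTP, local WASM, OpenAI) implements these four methods,
// so adding a provider never touches the indexing pipeline.
interface EmbeddingProvider {
  name(): string;                                       // id used in settings
  dimensions(): number;                                 // vector size, e.g. 384
  embedBatch(texts: string[]): Promise<Float32Array[]>; // index-time documents
  embedForQuery(text: string): Promise<Float32Array>;   // query-time (may add a prefix)
}

// Minimal stub showing the contract; a real Ollama provider would
// POST to localhost:11434 inside embedBatch.
class StubProvider implements EmbeddingProvider {
  name() { return "stub"; }
  dimensions() { return 384; }
  async embedBatch(texts: string[]) {
    return texts.map(() => new Float32Array(this.dimensions()));
  }
  async embedForQuery(_text: string) {
    return new Float32Array(this.dimensions());
  }
}
```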
The WASM memory problem
WASM throughput degrades from ~47 notes/sec to ~15 notes/sec after around 80 calls. This happens because WebAssembly linear memory grows during inference but never shrinks, so the allocated memory keeps increasing with each call. I measured this degradation firsthand during prototype testing.
My solution is already implemented in the prototype at embeddings.ts line 64:
Pipeline Recycling Strategy
The pipeline is recycled every 80 calls to maintain performance and stability. Since all intermediate results are persisted in the vector store, no data is lost during recycling.
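The recycling wrapper can be sketched as follows (class and variable names are mine; the prototype's embeddings.ts implements the same idea):

```typescript
// Rebuild the inference pipeline every RECYCLE_EVERY calls so the
// WASM linear memory backing the old instance can be reclaimed.
// Intermediate embeddings are already persisted, so nothing is lost.
const RECYCLE_EVERY = 80;

class RecyclingPipeline {
  private calls = 0;
  private pipeline: unknown | null = null;

  constructor(private createPipeline: () => unknown) {}

  // Returns a usable pipeline, recreating it when the call budget
  // is exhausted.
  acquire(): unknown {
    if (this.pipeline === null || this.calls >= RECYCLE_EVERY) {
      this.pipeline = this.createPipeline();
      this.calls = 0;
    }
    this.calls++;
    return this.pipeline;
  }
}
```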
To determine the optimal recycling interval, I will conduct benchmarks during the Community Bonding phase. This will involve measuring throughput across 500 notes using different intervals (40, 60, 80, 100, 120) and selecting the configuration that provides the best performance.
BGE-small Query Prefix Optimization
The BGE-small model recommends adding a query-specific prefix to improve retrieval performance. Accordingly, the embedForQuery() method prepends the following string to each query:
"Represent this sentence for searching relevant passages: "
This detail is not prominently documented and was identified by reviewing the model card on Hugging Face directly.
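In code form (a sketch; the prefix string is exactly the one above, the function name is illustrative):

```typescript
// BGE-small retrieval prefix, applied to queries only; documents are
// embedded without it, per the model card.
const BGE_QUERY_PREFIX =
  "Represent this sentence for searching relevant passages: ";

function queryText(query: string): string {
  return BGE_QUERY_PREFIX + query;
}
```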
Vector Storage with sql.js
The plugin needs somewhere to store embeddings persistently. I evaluated several options:
| Option | Type | Desktop? | Mobile? | Scalability |
|---|---|---|---|---|
| sqlite-vec | Native module | Possible* | ✗ No native modules | Great |
| Vectra | JSON flat files | | | ✗ Poor (O(n) full scan, ~15 MB for 1K notes) |
| FAISS/ChromaDB | External server | ✗ Requires separate process | ✗ | Great |
| sql.js | WASM SQLite | Pure WASM | WASM | Good |
I originally considered sqlite-vec for proper vector indexing. After investigating PluginRunner.ts (line 121-127), I confirmed that desktop plugins have nodeIntegration: true so native modules ARE possible on desktop. However, on mobile, sqlite3 and fs-extra aren't available. Using native modules would limit the plugin to desktop only.
I chose sql.js (pure WASM) because it works on both desktop and mobile without any native compilation. This keeps the door open for mobile support in the future, even though this GSoC focuses on desktop.
Custom cosine similarity function (from vectorStore.ts):
Since sql.js doesn't have built-in vector operations, I registered a custom SQL function for similarity scoring.
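The core of that function is a plain dot product (which equals cosine similarity once vectors are unit-normalized); this sketch shows the idea, with registration via sql.js's `db.create_function` indicated in a comment (the prototype's exact code may differ):

```typescript
// Dot product over two float32 vectors. With unit-normalized vectors
// this equals cosine similarity, so no magnitudes are computed at
// query time.
function dot(a: Float32Array, b: Float32Array): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}

// Registration sketch (sql.js passes BLOB columns as Uint8Array):
// db.create_function("cosine_sim", (blob: Uint8Array, queryJson: string) => {
//   const v = new Float32Array(blob.buffer, blob.byteOffset, blob.byteLength / 4);
//   return dot(v, Float32Array.from(JSON.parse(queryJson)));
// });
```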
The schema (from my working prototype at vectorStore.ts)
All vectors are pre-normalized to unit length before storage, so dot product equals cosine similarity. This avoids computing magnitudes at query time.
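The exact DDL from vectorStore.ts isn't reproduced here; a plausible shape, with column names that are my own guesses, would be:

```typescript
// Assumed schema sketch: one row per chunk, embedding stored as a
// float32 BLOB, content hash kept for incremental re-indexing.
const SCHEMA = `
CREATE TABLE IF NOT EXISTS chunks (
  id INTEGER PRIMARY KEY,
  note_id TEXT NOT NULL,
  notebook_id TEXT,
  breadcrumb TEXT,               -- "Title > Heading > Subheading"
  body TEXT NOT NULL,
  content_hash INTEGER NOT NULL, -- djb2 hash for change detection
  embedding BLOB NOT NULL        -- float32[dim], unit-normalized
);
CREATE INDEX IF NOT EXISTS idx_chunks_note ON chunks(note_id);
`;
```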
Storage estimates (validated with real prototype):
| Notes | Chunks (~3/note) | Embedding storage | Total DB | Verified? |
|---|---|---|---|---|
| 100 | ~300 | 450 KB | ~1 MB | Estimated |
| 503 | ~1,500 | ~4.5 MB | 6.1 MB | Measured |
| 1,000 | ~3,000 | 4.5 MB | ~10 MB | Estimated |
| 5,000 | ~15,000 | 22.5 MB | ~50 MB | Estimated |
The 503-note measurement is from my working prototype: 108 technical notes + 395 personal notes (meetings, recipes, journal entries, fitness logs, finance reviews, travel plans) indexed through Ollama nomic-embed-text in ~45 seconds, producing a 6.1 MB SQLite database.
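The table's figures are consistent with raw float32 storage, i.e. chunks × dimensions × 4 bytes; a quick check:

```typescript
// Raw float32 embedding size in bytes: one 4-byte float per dimension.
function embeddingBytes(chunks: number, dims: number): number {
  return chunks * dims * 4;
}

// 300 chunks at 384 dims (BGE-small): 460,800 bytes = exactly 450 KB,
// matching the 100-note row. 1,500 chunks at 768 dims (nomic):
// 4,608,000 bytes ≈ 4.5 MB, matching the 503-note measurement.
```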
Persistence: The database lives in-memory (WASM) and flushes to disk every 30 seconds and on shutdown. The file is stored at joplin.plugins.dataDir(). On the next launch, the database is loaded from disk; no re-embedding needed.
Hybrid Retrieval Pipeline
This is the core contribution of this project, and the thing I spent the most time getting right.
Why RRF, not score normalization
My first approach was to normalize both scores to [0,1] and do a weighted sum. The problem is that cosine similarity and BM25 have completely different score distributions. Cosine ranges from 0.3 to 0.9 for text, and everything clusters near 0.7. BM25 scores are unbounded and depend on collection size. Any fixed normalization scheme breaks when the collection grows. RRF (Reciprocal Rank Fusion) avoids this entirely. It uses rank position, not score values:
score(note) = Σ 1/(k + rank_i) where k = 60
A note that appears at rank 3 in vector search and rank 5 in keyword search gets: 1/(60+3) + 1/(60+5) = 0.0159 + 0.0154 = 0.0313. A note in only one list gets only one term. Notes in both lists are naturally boosted no tuning needed.
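The fusion step above is a few lines of code (sketch; ranks are 1-based positions, k = 60 per Cormack et al. 2009):

```typescript
// Reciprocal Rank Fusion over any number of ranked id lists.
// Items appearing in several lists accumulate score from each,
// so hybrid hits are naturally boosted without tuning.
function rrfFuse(lists: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, i) => {
      const rank = i + 1; // 1-based rank
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}

// A note at rank 3 in vector search and rank 5 in keyword search
// scores 1/63 + 1/65 ≈ 0.0313, as in the worked example above.
```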
Query-adaptive routing
Not all queries benefit equally from both paths. Short exact terms like "iptables" should weight FTS4 higher, while natural language queries like "how do I manage firewall rules" should weight vectors higher. I will implement a lightweight query classifier:
- Contains Joplin syntax (tag:, notebook:, created:) → pass through to FTS4 directly
- 1–2 exact tokens → boost FTS4 weight in RRF (k=30 for FTS4, k=60 for vector)
- 3+ words, natural language → standard RRF (k=60 for both)
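The routing rules above can be sketched as a small classifier (names and the exact regex are mine; the thresholds are the proposal's):

```typescript
// Lightweight query router: Joplin operators bypass the hybrid
// pipeline entirely; short queries weight FTS4 higher via a lower k.
type Route =
  | { mode: "fts-passthrough" }
  | { mode: "hybrid"; kFts: number; kVector: number };

function routeQuery(q: string): Route {
  // Joplin search syntax goes straight to FTS4.
  if (/\b(tag:|notebook:|created:)/.test(q)) return { mode: "fts-passthrough" };
  const tokens = q.trim().split(/\s+/).filter(Boolean);
  // 1-2 exact tokens: boost keyword ranks (lower k = stronger weight).
  if (tokens.length <= 2) return { mode: "hybrid", kFts: 30, kVector: 60 };
  // Natural-language query: standard symmetric RRF.
  return { mode: "hybrid", kFts: 60, kVector: 60 };
}
```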
I chose k=60 because that's the standard value from the original RRF paper (Cormack et al., SIGIR 2009). It's been validated across many benchmarks and there's no strong reason to change it for this use case.
RSE (Relevant Segment Extraction)
After RRF ranks the notes, I merge adjacent high-scoring chunks from the same note into coherent passages. If a user searches for "backup" and chunks 3, 4, and 6 all score well, RSE merges 3 and 4 into one segment and keeps 6 separate (chunk 5 is missing, so it is not adjacent). The user sees the full backup procedure instead of sentence fragments.
The merge logic is at retrieval.ts line 91 in the prototype.
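A sketch of the merge (illustrative, not the actual retrieval.ts code; treating only directly adjacent chunk indices as mergeable reproduces the 3+4 / 6 example):

```typescript
// Merge runs of adjacent chunk indices into segments; non-adjacent
// high-scoring chunks stay separate passages.
function mergeSegments(indices: number[], maxGap = 1): number[][] {
  const sorted = [...indices].sort((a, b) => a - b);
  const segments: number[][] = [];
  for (const i of sorted) {
    const last = segments[segments.length - 1];
    if (last && i - last[last.length - 1] <= maxGap) {
      last.push(i); // extend the current segment
    } else {
      segments.push([i]); // start a new segment
    }
  }
  return segments;
}

// mergeSegments([3, 4, 6]) → [[3, 4], [6]]
```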
Incremental Indexing
The first time the plugin runs, it embeds all notes. That could take a few minutes for a large collection. After that, re-embedding everything every time would be a terrible experience.
I implement incremental indexing using three sync triggers that I discovered by reading Joplin's source code:
- joplin.workspace.onNoteChange: immediate re-index when the user edits a note. But there is an important caveat. I discovered by reading JoplinWorkspace.ts (lines 115-128) that this event only fires for the currently selected note. It does not fire when notes are changed by sync or by other means. This almost derailed my sync design.
- joplin.workspace.onSyncComplete: catches changes from other devices. When Joplin finishes a sync cycle, I check for new or modified notes.
- Data API /events polling: the reliable catch-all. The /events endpoint with cursor-based pagination gives me every change that happened since the last check. This is the mechanism that ensures nothing is missed.
Every chunk is stored with a content hash. On re-index, I hash the current note body and compare it to the stored hash. If they match, I skip that note entirely. On a typical day, less than 1% of notes change between syncs, so re-indexing is near-instant.
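The djb2 hash and the skip check are simple enough to show in full (the hash is the standard algorithm; the helper name is mine):

```typescript
// djb2 content hash: fast change detection, not cryptographic.
// h = h * 33 + charCode, kept in 32 bits.
function djb2(text: string): number {
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = ((h << 5) + h + text.charCodeAt(i)) | 0;
  }
  return h >>> 0; // unsigned 32-bit result
}

// Re-index decision: unchanged notes are skipped entirely.
function needsReindex(body: string, storedHash: number): boolean {
  return djb2(body) !== storedHash;
}
```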
Plugin Settings
| Setting | Type | Default | Description |
|---|---|---|---|
| Provider | Dropdown | "ollama" | Options: "ollama" / "local" / "openai" |
| Ollama Endpoint | String | "http://localhost:11434" | URL for Ollama server |
| Ollama Model | String | "nomic-embed-text" | Embedding model name |
| API Key | Secure | "" | Stored in OS keychain via secure: true |
| Max Results | Integer | 10 | Results shown in sidebar |
| Auto-Reindex | Boolean | true | Re-embed on note change/sync |
Error Handling
I thought about what can actually go wrong and planned for each case:
| Failure | What the user sees | Why this approach |
|---|---|---|
| Ollama not running | Warning banner: "Start Ollama to enable AI Search" + install link | Don't crash, guide the user |
| Model not pulled | One-click: "Pull nomic-embed-text" button | Reduce friction |
| API key invalid (OpenAI) | Test on save → immediate error | Fail fast, not at index time |
| Large collection (5K+ notes) | Progress bar + cancel button + persistent partial progress | Never lose work |
| No results | "No semantic matches found. Try Joplin's search →" + link | Graceful handoff |
| Corrupt note content | Skip note, continue pipeline, list in "Skipped" section | Never stall on one bad note |
| DB corruption | Detected on startup → one-click "Rebuild Index" | Rebuild rather than recover |
| Non-English notes | | Joplin has a global user base |
Privacy & E2EE Handling
Everything is local by default. Nothing leaves the machine unless the user explicitly configures OpenAI.
When a remote provider is selected, the settings page shows a persistent warning: "Note content will be sent to [provider]. API keys are stored in your system keychain."
Embeddings are stored in the plugin's own data directory at joplin.plugins.dataDir() and are never synced to Joplin's main database. This matches Joplin's local-first philosophy.
E2EE notes are automatically excluded. The indexer checks the encryption_applied field on every note and skips encrypted notes entirely — the plugin never attempts to read encrypted content. When notes are decrypted after a sync cycle, the Events API cursor detects them as modified, and they are indexed normally. This is already implemented in the prototype's indexer.ts (line 66: if (note.encryption_applied) return;).
Practical Constraints
Indexing Time Estimates (validated with real prototype)
| Notes | WASM (BGE-small) | Ollama (nomic) | OpenAI | DB Size |
|---|---|---|---|---|
| 100 | ~8 sec | ~3 sec | ~2 sec | ~1 MB |
| 503 | — | ~45 sec ✓ | — | 6.1 MB ✓ |
| 1,000 | ~80 sec | ~30 sec | ~10 sec | ~10 MB |
| 5,000 | ~7 min | ~2.5 min | ~1 min | ~50 MB |
The 503-note measurement was taken on my local machine with Ollama running nomic-embed-text. Search latency is ~200ms per query (embed + vector search + FTS4 + RRF fusion + RSE).
UI responsiveness: Desktop plugins run inside a hidden BrowserWindow (packages/app-desktop/services/plugins/PluginRunner.ts, line 121: show: false, with nodeIntegration: true and contextIsolation: false). This means the plugin process is already separate from the main UI; plugin code does not block the Joplin interface. Web Workers are also available inside this BrowserWindow context.
This changes the approach: instead of a setTimeout(0) yield trick, a proper Web Worker will be used for embedding inference during GSoC. The worker handles the heavy WASM computation, and the main plugin thread stays responsive for panel UI and API calls. During Community Bonding, Worker availability in the plugin BrowserWindow will be validated and the performance improvement measured.
Known Challenges
- WASM memory degradation during large batch embedding: Already mitigated with pipeline recycling every 80 calls (see §3.6). Will validate exact interval during Community Bonding.
- ONNX/Transformers.js in Electron: SOLVED. This was identified as a risk in my initial prototype. The solution: Ollama HTTP is the default provider, which runs entirely outside the Electron sandbox via a simple fetch() call, with zero sandbox issues. The WASM provider (Transformers.js) is an opt-in alternative for fully offline use. My prototype successfully indexes 503 notes through Ollama in ~45 seconds with zero compatibility issues. This is not an open risk; it is a solved problem.
- sql.js WASM loading: SOLVED. Already working in the prototype. sql.js loads its WASM binary correctly in the plugin’s BrowserWindow context. The 6.1 MB embedding database persists and reloads across Joplin restarts.
- Web Worker for embedding: Since desktop plugins run in a BrowserWindow with nodeIntegration: true, new Worker() is available. I will move embedding inference into a Worker so the heavy WASM computation runs on a separate thread, posting results back via postMessage.
Shared Infrastructure Compatibility
I am also proposing the Shared Infrastructure project (Idea 3a). If both projects are accepted:
- This plugin initially uses its own bundled pipeline
- The EmbeddingProvider and VectorStore interfaces match the shared API shape
- Migration = swap imports from ./embeddings to joplin.embedding.*
- No UI changes, no retrieval logic changes
- Per the GSoC dependency rules: “Can migrate some functionality to APIs provided by the ‘shared infrastructure’ project in the future”
If only this project is accepted, the plugin works completely standalone.
Agent-Based Search (Stretch Goal)
Beyond embedding-based retrieval, @shikuz proposed a fundamentally different approach: an LLM agent that reasons about which search tools to use rather than relying purely on vector similarity.
The idea: give the LLM descriptions of Joplin’s existing search capabilities as callable tools:
Example flow: User asks: “notes about the Berlin meeting from 2020”
- LLM reasons: this has both a semantic concept (“Berlin meeting”) and a date constraint (“2020”)
- LLM calls query_embeddings("Berlin meeting") → gets 10 candidates
- LLM calls filter_by_date("2020-01-01", "2020-12-31") → filters to 3 results
- LLM returns the filtered, ranked results.
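The tool descriptions handed to the LLM could look like this (tool names follow the Week 7-8 plan; the parameter schemas are my own illustration):

```typescript
// Sketch of the tool registry exposed to the agent. A real
// implementation would use the provider's function-calling format;
// these schemas are illustrative placeholders.
const tools = [
  {
    name: "query_embeddings",
    description: "Semantic search over note chunks; returns top-K candidates.",
    parameters: { query: "string", topK: "number" },
  },
  {
    name: "search_notes",
    description: "Joplin's built-in FTS4 keyword search.",
    parameters: { query: "string" },
  },
  {
    name: "filter_by_date",
    description: "Keep only candidates created or updated within [from, to].",
    parameters: { from: "string (ISO date)", to: "string (ISO date)" },
  },
  {
    name: "filter_by_tags",
    description: "Keep only candidates carrying all of the given tags.",
    parameters: { tags: "string[]" },
  },
];
```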
This requires an LLM for reasoning (Ollama for local, OpenAI for cloud), but only for agent mode; basic search works without any LLM. I will implement this as an optional "Smart Search" toggle in the sidebar during Weeks 7–8, using the same provider abstraction that already supports Ollama/OpenAI.
Why this matters: This directly addresses @shikuz’s observation that “embedding-based retrieval” and “agent-based search” are complementary, not competing approaches. Combining both in one plugin makes this project uniquely comprehensive.
UX Plan
Sidebar Layout
The search panel is a sidebar opened via the toolbar button or Ctrl+Shift+F.
Four states (all implemented and working):
| State | Display |
|---|---|
| Not Indexed | “Build a semantic index of your notes” with Build Index button |
| Indexing | Animated spinner + progress bar: “Indexed 342 of 503 notes” with real-time percentage |
| Ready | Query input at top, results below with score %, source badges, keyword highlighting |
| No Results | “No matches found. Try rephrasing your query” |
Key UX decision (learned from prototype): The HTML state is set server-side at startup: if the index already exists, the search bar appears immediately without any loading screen. This eliminates the webview message-passing race condition that plagued early versions.
Result Cards
Each card shows (all implemented and working in prototype):
- Note title (bold, clickable → opens the note) with colored left border (green = high relevance, yellow = medium, gray = low)
- Relevance score as a percentage badge (e.g., 97%, 74%, 45%)
- Source badge: Semantic, Keyword, or Hybrid
- Section path (Configuration > Backup Schedule)
- Best passage snippet with keyword highlighting — matching query terms are highlighted with a warm yellow background
- Results header: shows count of results found (e.g., “5 results found”)
Interaction Flow
- User types query → 300ms debounce (avoids unnecessary embedding calls)
- Embed query → parallel FTS4 + vector search
- RRF fusion → RSE → render cards (~80ms total)
- Click card → joplin.commands.execute('openNote', noteId) → navigates to note
Theme Support
CSS uses Joplin's theme variables:
[image: sidebar CSS using Joplin theme variables]
This means the sidebar automatically matches whatever theme the user has selected: light, dark, or custom.
Accessibility
The sidebar includes ARIA attributes (role="search", aria-label, aria-live="polite" for result updates) to support screen readers. Keyboard navigation: Tab through results, Enter to open a note, Escape to clear the search. All interactive elements have unique, descriptive IDs.
First-Run Experience
- Plugin loads → sidebar (collapsed) appears, toolbar button added
- User opens sidebar → “Not Indexed” state with estimated indexing time
- Clicks “Build Index” → progress bar, cancel button, Joplin stays responsive
- Indexing completes → search input appears → user types first query
- Next launch → index loaded from disk → sidebar opens ready immediately (no loading screen, no message-passing delay; HTML is generated server-side with the correct state)
Timeline
Community Bonding
- Optimize WASM recycling interval: benchmark 500 notes at intervals (40, 60, 80, 100, 120)
- Set up Web Worker for embedding inference (already confirmed Worker is available in plugin BrowserWindow)
- Discuss agent-based search design with @malekhavasi and @shikuz
Week 1–2: Core Hardening & Edge Cases
- Expand chunker for edge cases: LaTeX blocks, nested lists, code blocks with language tags
- Add cancel button for long-running index builds
- Implement query classifier (Joplin syntax → FTS4, natural language → hybrid)
- Increase test coverage from 42 → 55 tests
Week 3–4: Cross-Platform & Performance
- Cross-platform testing: Windows, macOS, Linux (prototype validated on Linux only)
- Performance benchmarks at 500, 2000, 5000 notes
- Web Worker integration for non-blocking embedding
- Memory profiling and optimization
Week 5–6: Reranking & Query Decomposition
- Cross-encoder reranking: re-score top-K results with a more precise model (planned per @shikuz's retrieval improvements)
- Query decomposition: split multi-part queries into sub-queries, merge results
- Integration tests on 503-note corpus with gold-standard queries
Week 7–8: Agent-Based Search
- Implement LLM tool-use interface:
- search_notes
- query_embeddings
- filter_by_date
- filter_by_tags
- Ollama-based local agent (no API key needed for basic agent mode)
- "Smart Search" toggle in sidebar UI
- Test with complex queries:
- "notes about Berlin from 2020"
- "tasks for the website redesign"
Week 9–10: Evaluation & Benchmarks
- Build 50-query gold set: exact terms, synonyms, paraphrases, vague descriptions
- Measure Recall@5, MRR, Precision@5, P50/P95 latency
- Compare: FTS4 only vs semantic only vs hybrid vs agent-based
Week 11–12: Documentation & Polish
- Plugin README: installation, configuration, privacy model
- Inline JSDoc for all public functions
- Demo screencast showing all search modes (basic, hybrid, agent)
- Submit to Joplin plugin repository
More about me
My motivation for this project comes from my personal journey and the experiences that shaped how I think about technology, learning, and design.
It started with a broken computer my school was discarding. I repaired it just to play games, but in doing so I unknowingly got my first real lesson in how hardware and systems work. That curiosity never left me. Growing up with an artist mother added a different dimension altogether. Being around her work shaped my instinct for creativity, visual design, and user experience in ways I did not fully realise until I started building software.
During school, a genuine interest in biology, particularly in understanding the human brain, eventually led me to artificial intelligence and machine learning. Learning that neural networks are inspired by how the brain processes information felt like two of my biggest interests finally making sense together.
When I began building and contributing to software, all of these interests started working together naturally. My systems knowledge helped me think about architecture and constraints, while my design instincts kept me focused on usability and the learner's experience. Over time, I developed a real appreciation for the constructionist philosophy: the idea that people learn best by making things.
This project sits right at the intersection of AI, software systems, and UI/UX, exactly the space I have been growing into. It is a direct expression of that philosophy, and feels like a natural continuation of the path I have been on.