Hi @shikuz,
Thank you so much for the detailed review! These are excellent questions, especially regarding the Electron sandbox constraints. Here is a breakdown of how I plan to handle each of those areas:
1. LanceDB, WASM, and the Electron Plugin Environment
You are completely right to call this out. Relying on native Node/Rust binaries inside Joplin's sandboxed plugin architecture is a recipe for cross-platform packaging issues. I just reviewed @HahaBill's summarization plugin and saw how they used Webpack's CopyPlugin to bundle the WebWorker/WASM files directly into the plugin's installationDir/dataDir, working around Electron's strict sandbox restrictions.
I plan to replicate this exact Webpack bundling strategy for LanceDB's WASM bindings.
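For reference, here is a rough sketch of that CopyPlugin setup; the source path is a placeholder, since it depends on how the LanceDB WASM build actually publishes its assets:

```ts
// Excerpt to merge into the plugin's existing webpack.config.js.
const CopyPlugin = require('copy-webpack-plugin');

module.exports = {
  // ...existing Joplin plugin webpack settings...
  plugins: [
    new CopyPlugin({
      patterns: [
        {
          // Hypothetical source path; the real one depends on the
          // published package layout of the LanceDB WASM build.
          context: 'node_modules/@lancedb/lancedb-wasm/dist',
          from: '**/*.wasm',
          // Emit next to the bundled plugin script so the files land
          // inside the plugin's installation directory at runtime.
          to: 'wasm/',
        },
      ],
    }),
  ],
};
```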
• Fallback Plan: If LanceDB's WASM build still proves brittle even with the Webpack workaround, my architectural fallback is to pivot the vector store to Orama (pure TypeScript, built to run anywhere JavaScript does) or a WASM-compiled SQLite build with a vector extension. Either way, end users never face native-compilation headaches. A minimal sketch of the Orama path follows below.
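To make the fallback concrete, here is a minimal sketch of the Orama path, assuming 384-dimension MiniLM embeddings; the schema, field names, and the embed helper are illustrative placeholders, not final API decisions:

```ts
import { create, insert, search } from '@orama/orama';

// `embed` is a placeholder for the Transformers.js embedding function.
async function searchNotes(embed: (text: string) => Promise<number[]>) {
  const db = await create({
    schema: {
      noteId: 'string',
      text: 'string',
      embedding: 'vector[384]', // all-MiniLM-L6-v2 produces 384-dim vectors
    } as const,
  });

  await insert(db, {
    noteId: 'abc123',
    text: 'Example note chunk',
    embedding: await embed('Example note chunk'),
  });

  // Pure vector search over the stored embeddings, top 3 results.
  const results = await search(db, {
    mode: 'vector',
    vector: { value: await embed('my query'), property: 'embedding' },
    limit: 3,
  });
  return results.hits;
}
```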
2. Embedding Model & Storage
• The Model: I plan to use Xenova/all-MiniLM-L6-v2 via Transformers.js. It is lightweight (under 100MB), fast enough for on-device WASM execution, and works well for semantic search over note-sized chunks.
• Storage & Download: Transformers.js handles fetching from the Hugging Face Hub. I will override its default cache directory to point at the plugin's isolated storage via await joplin.plugins.dataDir(), so the model lives inside Joplin's standard data structure and persists across app restarts (sketched below).
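As a rough sketch of that cache override (the exact env flags can shift between Transformers.js versions, so treat this as illustrative):

```ts
import joplin from 'api';
import { pipeline, env } from '@xenova/transformers';

async function initEmbedder() {
  // Redirect the Transformers.js model cache into the plugin's isolated
  // data directory so the downloaded model survives app restarts.
  env.cacheDir = await joplin.plugins.dataDir();

  // The first call downloads Xenova/all-MiniLM-L6-v2 from the Hugging Face
  // Hub; subsequent calls load it from the cache directory above.
  return pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
}

// Usage: mean pooling + normalization yields a 384-dim sentence vector.
// const embedder = await initEmbedder();
// const output = await embedder('some note text', { pooling: 'mean', normalize: true });
```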
3. The 8B Model Tier (Size & Installation UX)
• Size: A 4-bit quantized 8B model (e.g., Meta-Llama-3-8B-Instruct.Q4_K_M.gguf) is roughly 4.9GB on disk.
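As a back-of-envelope check (assuming the commonly cited ~4.85 bits per weight for Q4_K_M): 8B parameters × 4.85 bits ÷ 8 bits per byte ≈ 4.9GB, which lines up with the published file size.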
• Installation: Because silently downloading a ~5GB file in the background is poor UX and could eat user bandwidth, the "Silent Switch" hardware check will only determine capability; the actual installation requires a one-time user confirmation. The React UI panel will display: "Your hardware supports High-Performance AI. Click here to download the local model (4.9GB)." The download will stream directly into the plugin's dataDir with a visible progress bar (a streaming sketch follows below).
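A minimal sketch of that streamed download, assuming Node 18+ fetch is available to the plugin process; the URL, file name, and progress callback are placeholders:

```ts
import * as fs from 'fs';
import * as path from 'path';

// Placeholder URL; the real source would be a pinned GGUF release.
const MODEL_URL = 'https://example.com/Meta-Llama-3-8B-Instruct.Q4_K_M.gguf';

async function downloadModel(dataDir: string, onProgress: (pct: number) => void) {
  const res = await fetch(MODEL_URL);
  if (!res.ok || !res.body) throw new Error(`Download failed: HTTP ${res.status}`);

  const total = Number(res.headers.get('content-length') ?? 0);
  const dest = fs.createWriteStream(path.join(dataDir, 'model.gguf'));
  const reader = res.body.getReader();
  let received = 0;

  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    dest.write(value); // stream each chunk straight to disk, never buffering 5GB in memory
    received += value.length;
    if (total > 0) onProgress(Math.round((received / total) * 100));
  }
  dest.end();
}
```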
4. Validating Retrieval (Testing Strategy)
This is a great catch. Standard unit tests aren't enough for RAG. To validate that the retrieval is actually returning the right context, I will build an automated evaluation pipeline:
• The Golden Dataset: I will create a dummy collection of ~50 diverse Joplin notes (testing edge cases like long code blocks, nested lists, and Markdown tables) and map a set of predefined test questions to the exact note chunks that contain the answers.
• The Metrics: The test script will index the Golden Dataset, run the queries, and score the results using Hit Rate (is the expected chunk in the top k=3 results?) and Mean Reciprocal Rank (MRR). If a tweak to the Markdown chunking algorithm pushes MRR below a set threshold, the test suite fails. A sketch of the metric computation follows below.
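Here is a sketch of that computation, assuming a search() function that returns ranked chunk IDs; all names are illustrative:

```ts
interface GoldenExample {
  question: string;
  expectedChunkId: string; // the chunk known to contain the answer
}

async function evaluateRetrieval(
  golden: GoldenExample[],
  search: (query: string, k: number) => Promise<string[]>,
  k = 3,
) {
  let hits = 0;
  let reciprocalRankSum = 0;

  for (const ex of golden) {
    const ranked = await search(ex.question, k);
    const rank = ranked.indexOf(ex.expectedChunkId); // 0-based position, -1 if absent
    if (rank !== -1) {
      hits += 1;
      reciprocalRankSum += 1 / (rank + 1); // MRR uses the 1-based rank
    }
  }

  return {
    hitRate: hits / golden.length, // fraction of queries with the chunk in the top k
    mrr: reciprocalRankSum / golden.length, // mean reciprocal rank over all queries
  };
}
```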
I will update the main proposal draft above to explicitly include this retrieval testing strategy and the WASM/UX clarifications. Thanks again for pointing me toward the summarization plugin repository!