Hey, I'm Yahya (yahya94812 on GitHub).
I built semantic search for Joplin: yahya94812/Semantic-Search on GitHub, a project for searching documents and notes semantically rather than by traditional text matching.
It lets you find relevant notes by meaning rather than exact keywords. Here's how it works:
- Generates embeddings for all notes using all-MiniLM-L6-v2
- Stores them in a vector database
- At search time, embeds the query and retrieves the most similar notes via vector similarity (rough sketch below)
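Minimal sketch of that pipeline (not the exact repo code; it assumes the sentence-transformers package and a plain in-memory index rather than a real vector store):

```python
from sentence_transformers import SentenceTransformer, util

# Same model is used at index time and at query time.
model = SentenceTransformer("all-MiniLM-L6-v2")

notes = {
    "grocery list": "buy milk, eggs and bread",
    "db migration": "steps to migrate the backend database to postgres",
}

# Index time: embed every note once and keep the vectors around.
ids = list(notes)
note_embeddings = model.encode([notes[i] for i in ids], normalize_embeddings=True)

def search(query: str, top_k: int = 3):
    # Query time: embed the query and rank notes by cosine similarity.
    query_embedding = model.encode(query, normalize_embeddings=True)
    scores = util.cos_sim(query_embedding, note_embeddings)[0]
    ranked = sorted(zip(ids, scores.tolist()), key=lambda x: x[1], reverse=True)
    return ranked[:top_k]

print(search("how do I move data to a new database?"))
```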
Would love to hear if there are better approaches worth exploring!
This by itself would be a great improvement if implemented in Joplin's internal search.
For even better results, we should utilize the Markdown structure for text chunking. Each chunk should additionally carry contextual information: how it contributes to the upper heading hierarchy within the note, how it contributes to a summary of the whole note, and a short summary of the chunks/links/images from the same or other notes that it references. These “rich” pieces of text then go into the vector DB.
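In code terms, the enrichment can be as simple as prepending a contextual header before embedding (a hypothetical sketch; the summary strings would come from an LLM or be written by hand, and all the names here are made up):

```python
def build_rich_chunk(chunk_text, note_title, heading_path,
                     note_summary, related_summaries):
    """Prepend context so the embedding captures not just the chunk itself
    but its role in the note and in the material it references."""
    context_lines = [
        f"Note: {note_title}",
        f"Section: {' > '.join(heading_path)}",
        f"Note summary: {note_summary}",
    ]
    # Short summaries of chunks/links/images this chunk references,
    # possibly from other notes.
    context_lines += [f"References: {s}" for s in related_summaries]
    return "\n".join(context_lines) + "\n\n" + chunk_text

rich_text = build_rich_chunk(
    chunk_text="Run the migration script before deploying.",
    note_title="Backend release checklist",
    heading_path=["Deployment", "Database"],
    note_summary="Steps for releasing the backend service.",
    related_summaries=["Link to the 'DB migration' note describing the script."],
)
# rich_text is what gets embedded and stored in the vector DB.
```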
During search we apply sparse+dense retrieval, then a re-ranker LLM to sort and filter by relevance, and finally a regular LLM to decide which subset of results should be shown to the user.
This approach is not as demanding as GraphRAG or other existing architectures, but it has worked amazingly well in my local RAG setup.
I think that for implementing an LLM-based reranker, we would need to use third-party LLMs through API keys. This can certainly be implemented, but I think it would be useful to keep it as an additional or optional functionality.
Embedding-based search is great for running semantic search locally, especially for typical users who may not want to rely on external APIs.
Your idea of implementing Markdown-conscious chunks is great, as it helps maintain the semantic meaning and metadata of blocks.
So I’m looking forward to implementing the experimental Markdown-conscious chunking feature.
Hey @executed — really appreciated your feedback! I've gone ahead and implemented most of what you suggested. Here's what's changed:
Markdown-aware chunking
Instead of blindly splitting by token count, the indexer now parses the heading hierarchy first (H1 → H2 → H3). Each section becomes its own chunk, and only falls back to overlapping token windows if a section is too long. This preserves the semantic meaning of each block.
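Something along these lines (a simplified sketch rather than the exact indexer code; it only handles ATX headings, ignores code fences, and approximates tokens by whitespace words):

```python
import re

MAX_TOKENS = 256  # assumed section-size limit, purely illustrative

def chunk_markdown(text: str):
    """Split a note into (breadcrumb, section_text) pairs by heading hierarchy,
    falling back to overlapping windows for oversized sections."""
    breadcrumb, sections, current = [], [], []
    for line in text.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)", line)
        if match:
            if current:
                sections.append((" > ".join(breadcrumb), "\n".join(current)))
                current = []
            level, title = len(match.group(1)), match.group(2).strip()
            # Keep only the ancestors above the new heading level.
            breadcrumb = breadcrumb[: level - 1] + [title]
        else:
            current.append(line)
    if current:
        sections.append((" > ".join(breadcrumb), "\n".join(current)))

    chunks = []
    for crumb, body in sections:
        words = body.split()
        if len(words) <= MAX_TOKENS:
            chunks.append((crumb, body))
        else:
            # Oversized section: overlapping token windows (50% overlap).
            step = MAX_TOKENS // 2
            for start in range(0, len(words), step):
                chunks.append((crumb, " ".join(words[start:start + MAX_TOKENS])))
    return chunks
```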
Rich context metadata per chunk
Every chunk now carries:
note_title — the note it belongs to
breadcrumb — full heading path (e.g. Project X > Backend > Database)
notebook — the parent folder
This means search results now tell you exactly where inside a note the match was found, not just which file.
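For example, a stored chunk ends up looking roughly like this (illustrative values only):

```python
chunk = {
    "note_title": "Project X",
    "breadcrumb": "Project X > Backend > Database",
    "notebook": "Work",
    "text": "We decided to use SQLite for the first release...",
}
# A hit can then be shown as "Work / Project X > Backend > Database"
# instead of just the note title.
```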
SQLite instead of pickle
Switched the storage backend from pickle to SQLite — safer, inspectable, and supports incremental re-indexing (unchanged files are skipped on re-runs).
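The incremental part boils down to a content hash per note, so re-runs can skip anything unchanged (a simplified sketch with hypothetical table and column names, not the exact schema):

```python
import hashlib
import sqlite3

conn = sqlite3.connect("index.db")
conn.execute("""
CREATE TABLE IF NOT EXISTS chunks (
    note_id    TEXT,
    note_hash  TEXT,
    breadcrumb TEXT,
    text       TEXT,
    embedding  BLOB
)""")

def needs_reindex(note_id: str, body: str) -> bool:
    """Skip notes whose stored content hash matches the current body."""
    new_hash = hashlib.sha256(body.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT note_hash FROM chunks WHERE note_id = ? LIMIT 1", (note_id,)
    ).fetchone()
    return row is None or row[0] != new_hash
```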
Better ranking
max_similarity is now the primary ranking key instead of avg_similarity. This prevents long, broadly-relevant notes from outranking short, highly-specific ones with a single perfect chunk.
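In other words, chunk scores are aggregated per note and the single best chunk decides the note's rank (a small sketch, assuming per-chunk similarities have already been computed):

```python
from collections import defaultdict

def rank_notes(chunk_hits):
    """chunk_hits: iterable of (note_id, similarity) pairs, one per chunk.
    Rank notes by their best chunk rather than the average, so one highly
    relevant section is enough to surface a note."""
    best = defaultdict(float)
    for note_id, sim in chunk_hits:
        best[note_id] = max(best[note_id], sim)
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)

hits = [
    ("specific-note", 0.95), ("specific-note", 0.30), ("specific-note", 0.20),
    ("broad-note", 0.62), ("broad-note", 0.60), ("broad-note", 0.58),
]
# max ranks specific-note (0.95) above broad-note (0.62);
# averaging would have ranked broad-note (0.60) above specific-note (0.48).
print(rank_notes(hits))
```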
On your point about sparse+dense retrieval and LLM reranking — I fully agree those would push quality even further. My plan is to keep the embedding-only pipeline as the local default (no external dependencies), and add BM25 hybrid search + optional LLM reranking as an opt-in for users who have API access.
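For the hybrid part, one low-dependency way it could look (a sketch, not settled code; it assumes the rank_bm25 package for the sparse side and min-max normalizes both score ranges before a weighted sum):

```python
from rank_bm25 import BM25Okapi
import numpy as np

def normalize(scores):
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    return (scores - scores.min()) / span if span else np.zeros_like(scores)

def hybrid_scores(query, docs, dense_scores, alpha=0.5):
    """Blend BM25 (lexical) and embedding (semantic) scores.
    dense_scores: cosine similarities for `docs`, in the same order.
    alpha: weight of the dense side; 0.5 is an illustrative default."""
    bm25 = BM25Okapi([d.lower().split() for d in docs])
    sparse = bm25.get_scores(query.lower().split())
    return alpha * normalize(dense_scores) + (1 - alpha) * normalize(sparse)
```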
Would love to hear your thoughts on the chunking approach, especially the breadcrumb strategy for deeply nested notes!
Source: yahya94812/Semantic-Search on GitHub
Including the title and full heading path is definitely a good improvement.
Regarding sparse+dense search: having an optional sparse+dense+re-ranker pipeline will definitely help.
As I already stated:
Each chunk should additionally carry contextual information: how it contributes to the upper heading hierarchy within the note, how it contributes to a summary of the whole note, and a short summary of the chunks/links/images from the same or other notes that it references. These “rich” pieces of text then go into the vector DB.
Given that you want to make AI optional, what's stated above should also be optional, because it will require an LLM.
Summaries of hyperlinked page content injected right into the chunk would help a lot, provided the summary explains how that link contributes to this specific chunk.
Utilizing a vision-capable LLM for image descriptions, with a short summary of how the image contributes to the chunk, would be even better.
Anthropic's contextual-retrieval suggestion gives a very basic idea of the improvement all of this brings. That suggestion alone is not enough by any means, per my understanding; it should definitely be combined with semantic sectioning, retrieval-stage merging of neighboring chunks based on their original order, etc.
Regarding semantic sectioning: MD-heading-based sectioning is good, but letting an LLM dissect sections, and even group multiple sections that articulate the same or adjacent ideas, is superior to using only MD headings or delimiters, especially if the document/note is not well structured, as many documents are in the early stages of development.
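To make the retrieval-stage neighbor merging concrete, something like this would do (a trivial sketch; it assumes each retrieved chunk carries its note id and its position in the original note):

```python
def merge_neighbors(hits):
    """hits: list of (note_id, position, text) for retrieved chunks.
    Chunks that sit next to each other in the original note are stitched
    back together so the reader (or an LLM) sees contiguous passages."""
    hits = sorted(hits, key=lambda h: (h[0], h[1]))
    merged = []
    for note_id, pos, text in hits:
        if merged and merged[-1][0] == note_id and pos == merged[-1][1] + 1:
            merged[-1] = (note_id, pos, merged[-1][2] + "\n" + text)
        else:
            merged.append((note_id, pos, text))
    return merged
```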
Here's another good article from dsRAG that explains additional techniques.
I know the “Jarvis” plugin already has some standard RAG implemented and working in Joplin; you should definitely check it out, as it is bound to have some Joplin-specific or otherwise interesting techniques.
Also, subscribe to your “competitor” thread here.
This is a solid baseline, especially for local-first semantic search.
One thing I’ve been focusing on is how to balance retrieval quality with Joplin’s constraints (local-first, no mandatory external APIs).
Instead of relying on LLM-heavy pipelines, I think a strong middle-ground is:
- structure-aware chunking (aligned with Markdown sections rather than fixed tokens),
- lightweight contextual metadata (note title, notebook, section path),
- and hybrid retrieval (lexical + semantic) with a simple re-ranking layer.
This already improves recall significantly while keeping the system efficient and fully local by default.
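For the simple re-ranking layer, even plain reciprocal rank fusion over the lexical and semantic result lists works with no model at all (a sketch; k=60 is the commonly used constant):

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """rankings: e.g. [lexical_ids, semantic_ids], each a list of doc ids
    ordered best-first. Returns ids ordered by fused score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

print(reciprocal_rank_fusion([["a", "b", "c"], ["c", "a", "d"]]))
```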
More advanced approaches like LLM-based reranking or contextual summaries seem useful, but I’d keep them optional layers rather than part of the core search pipeline.