Fuzzy search using spellfix
Here’s how I think fuzzy search should work.
For each word, we’re going to use the spellfix extension to find its (three?) closest matches.
So if the search query is
important invention
spellfix(“important”) -> “important”, “importance”, “implement”
spellfix(“invention”) -> “invention”, “investigate”, “investment”
Note the increasing “fuzziness” from left to right.
Spellfix finds the closest match from the words indexed by FTS (giving frequently occurring words higher priority).
So the fuzzy search query becomes
important(0) importance(1) implement(2) invention(0) investigate(1) investment(2)
Where (n) represents the fuzziness level. So n==0 would mean an exact match.
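To make the expansion step concrete, here is a minimal Python sketch. It assumes a hypothetical `suggest` callable standing in for the spellfix lookup (with the spellfix1 extension loaded, that would presumably be a query along the lines of `SELECT word FROM vocab WHERE word MATCH ? AND top=3`, where `vocab` is an assumed table name). The names `expand_query`, `fake_suggest` and `CANDIDATES` are illustrative only, and the candidate lists are hard-coded from the example above.

```python
from typing import Callable, List, Tuple


def expand_query(
    query: str,
    suggest: Callable[[str], List[str]],
    top: int = 3,
) -> List[Tuple[str, int]]:
    """Return (term, fuzziness) pairs for every word in the query.

    The fuzziness level is simply the rank of the candidate in the
    list returned by `suggest` (0 = closest match).
    """
    expanded = []
    for word in query.split():
        for fuzziness, candidate in enumerate(suggest(word)[:top]):
            expanded.append((candidate, fuzziness))
    return expanded


# Hypothetical stand-in for spellfix: candidates copied from the example.
CANDIDATES = {
    "important": ["important", "importance", "implement"],
    "invention": ["invention", "investigate", "investment"],
}


def fake_suggest(word: str) -> List[str]:
    # Fall back to the word itself when no candidates are known.
    return CANDIDATES.get(word, [word])


terms = expand_query("important invention", fake_suggest)
for term, fuzziness in terms:
    print(f"{term}({fuzziness})")
```

Keeping the fuzziness level next to each term means the sorting step can use it later without calling spellfix again.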
Sorting
Once we get the notes satisfying this query, we need to sort them.
There are many ways to go about this.
Here’s my current plan. We sort the notes by “min fuzziness.”
So if a note contains an exact match for any query word, its “min fuzziness” would be 0 (the best we can get),
and so it should go above notes with “min fuzziness” 1 or 2 (notes that contain no exact match).
For the notes having the same “min fuzziness,” we sort based on the relevance score from Okapi BM25.
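A rough sketch of this ordering in Python, under a few assumptions: each note comes back with the list of expanded terms it matched, a BM25 score is already available, and a higher BM25 score means more relevant (some implementations, such as FTS5's built-in `bm25()`, use the opposite sign). `Note`, `min_fuzziness` and `sort_notes` are made-up names for illustration.

```python
from typing import Dict, List, NamedTuple


class Note(NamedTuple):
    title: str
    matched_terms: List[str]  # expanded query terms found in this note
    bm25: float               # relevance score; assumed higher is better


def min_fuzziness(note: Note, fuzziness_of: Dict[str, int]) -> int:
    # Every note returned by the query matched at least one term,
    # so matched_terms is assumed to be non-empty.
    return min(fuzziness_of[term] for term in note.matched_terms)


def sort_notes(notes: List[Note], fuzziness_of: Dict[str, int]) -> List[Note]:
    # Primary key: min fuzziness, ascending (exact matches first).
    # Secondary key: BM25, descending within the same fuzziness level.
    return sorted(
        notes,
        key=lambda n: (min_fuzziness(n, fuzziness_of), -n.bm25),
    )


fuzziness_of = {
    "important": 0, "importance": 1, "implement": 2,
    "invention": 0, "investigate": 1, "investment": 2,
}

notes = [
    Note("A", ["importance", "investment"], 9.0),
    Note("B", ["important"], 1.5),
    Note("C", ["invention", "implement"], 3.0),
    Note("D", ["implement"], 7.0),
]

ranked = [note.title for note in sort_notes(notes, fuzziness_of)]
print(ranked)  # ['C', 'B', 'A', 'D']
```

Note that A ends up below B and C despite its much higher BM25 score, because it has no exact match. That is the intended trade-off of sorting by “min fuzziness” first.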
Also, we won’t be fuzzifying phrase searches.
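One possible way to keep phrases out of the fuzzy expansion is to split the query up front. The sketch below assumes phrases are delimited by straight double quotes, which is an assumption about the query syntax rather than something settled here; `split_query` is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# Either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def split_query(query: str) -> Tuple[List[str], List[str]]:
    """Separate quoted phrases (searched as-is) from single words
    (candidates for fuzzy expansion)."""
    phrases, words = [], []
    for phrase, word in TOKEN.findall(query):
        if phrase:
            phrases.append(phrase)
        elif word:
            words.append(word)
    return phrases, words


phrases, words = split_query('"steam engine" important invention')
print(phrases)  # ['steam engine']
print(words)    # ['important', 'invention']
```

Only the entries in `words` would then be passed to spellfix; the phrases go into the query unchanged.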
Thoughts?