Weekly Update 1: Laying the Pipeline Foundation (Chunking & Vector Caching)

Harsh16gupta · 1 June 2026 00:07

Hello everyone! Hope you all had a great week. Here's a quick update on where we're at with the note categorization plugin.

Progress from last week:

As planned last week, I worked on 3 PRs:

Chunking (#5): Token-based chunking and the WebGPU/WASM fallback selection are both done.
Vector Aggregation (#6): Opened the PR for averaging chunk vectors, filtering out generic titles, and blending titles with body vectors.
Caching: The local cache implementation using vectra and SHA-256 hashing is complete. I'm just finishing up testing and will open a PR soon.
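
For anyone curious what the aggregation step in #6 roughly looks like, here is a minimal sketch. It is not the code from the PR: the function names, the list of generic titles, and the 0.3 title weight are all assumptions made up for illustration. It averages the chunk vectors into one body vector, skips the title when it is generic, and otherwise blends the title vector in.

```typescript
// Illustrative sketch only. All names, the generic-title list, and the
// default titleWeight are assumptions, not the plugin's actual values.
type Vector = number[];

/** Element-wise mean of equally sized chunk vectors. */
function averageVectors(vectors: Vector[]): Vector {
  if (vectors.length === 0) throw new Error("need at least one vector");
  const dim = vectors[0].length;
  const sum = new Array<number>(dim).fill(0);
  for (const v of vectors) {
    if (v.length !== dim) throw new Error("dimension mismatch");
    for (let i = 0; i < dim; i++) sum[i] += v[i];
  }
  return sum.map((x) => x / vectors.length);
}

/** Scale a vector to unit length; a zero vector is returned unchanged. */
function normalize(v: Vector): Vector {
  const norm = Math.sqrt(v.reduce((acc, x) => acc + x * x, 0));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

// Hypothetical list of titles that carry no topical signal.
const GENERIC_TITLES = new Set(["untitled", "new note", "notes"]);

function isGenericTitle(title: string): boolean {
  return GENERIC_TITLES.has(title.trim().toLowerCase());
}

/** Build one note vector from chunk vectors, optionally blended with the title. */
function noteVector(
  title: string,
  titleVec: Vector | null,
  chunkVecs: Vector[],
  titleWeight = 0.3,
): Vector {
  const body = normalize(averageVectors(chunkVecs));
  // Generic or missing titles are ignored so they do not pull notes together.
  if (titleVec === null || isGenericTitle(title)) return body;
  const t = normalize(titleVec);
  const blended = body.map((x, i) => (1 - titleWeight) * x + titleWeight * t[i]);
  return normalize(blended);
}
```

Normalizing both vectors before blending keeps the title weight meaningful even when the title and body embeddings have different magnitudes.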

Plan for this week:

Local Vector Caching & Incremental Indexing: Get the caching PR (currently in testing) opened, reviewed, and merged.
UMAP Integration with DruidJS: Set up DruidJS to project our averaged note vectors down into UMAP coordinates, ensuring we use a fixed random seed so the output coordinates are stable and reproducible.
UMAP Pipeline Verification: Run and test the full pipeline (embedding → note vector → UMAP) across different note collection sizes (small, medium, large) to check for stability and performance.
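
The reproducibility check planned above could be sketched roughly like this. To keep the example self-contained it does not call DruidJS at all: `standInProjection` is a hypothetical placeholder (a seeded random linear projection, not UMAP) standing where the real projection call would go, and `syntheticVectors` and `maxDrift` are likewise made-up helper names. The idea is just to run the projection twice with the same seed on small, medium, and large inputs and confirm the coordinates do not move.

```typescript
// Illustrative sketch only. standInProjection is NOT UMAP; it is a
// placeholder for the real projection step so the check itself can run.
type Point = [number, number];

/** mulberry32: a small seeded PRNG returning floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** Placeholder projection: seeded random linear map from dim to 2 coordinates. */
function standInProjection(vectors: number[][], seed: number): Point[] {
  const rand = mulberry32(seed);
  const dim = vectors.length > 0 ? vectors[0].length : 0;
  const basis: Point[] = Array.from(
    { length: dim },
    (): Point => [rand() - 0.5, rand() - 0.5],
  );
  return vectors.map((v): Point => {
    let x = 0;
    let y = 0;
    v.forEach((value, i) => {
      x += value * basis[i][0];
      y += value * basis[i][1];
    });
    return [x, y];
  });
}

/** Largest absolute coordinate difference between two layouts. */
function maxDrift(a: Point[], b: Point[]): number {
  if (a.length !== b.length) throw new Error("layouts differ in size");
  let worst = 0;
  a.forEach((p, i) => {
    worst = Math.max(worst, Math.abs(p[0] - b[i][0]), Math.abs(p[1] - b[i][1]));
  });
  return worst;
}

/** Fake note vectors, seeded so the test data is reproducible too. */
function syntheticVectors(count: number, dim: number, seed: number): number[][] {
  const rand = mulberry32(seed);
  return Array.from({ length: count }, () =>
    Array.from({ length: dim }, () => rand()),
  );
}

// Small, medium, and large collections; sizes are arbitrary examples.
for (const size of [10, 100, 1000]) {
  const notes = syntheticVectors(size, 8, 1);
  const started = Date.now();
  const first = standInProjection(notes, 42);
  const second = standInProjection(notes, 42);
  const elapsedMs = Date.now() - started;
  console.log(`${size} notes: max drift ${maxDrift(first, second)}, ${elapsedMs} ms`);
}
```

With the real UMAP step swapped in, a small tolerance instead of an exact comparison may be more practical, since floating-point results can vary slightly between backends.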

No major problems came up this week!
