Weekly Update 6: User-centric summarization feature -> Letting users control summaries

HahaBill · 8 July 2024 00:52

1. Progress

UI/UX

In week 6, I decided to focus fully on the UI/UX of the AI summarisation feature. If you have any feedback or input, feel free to share!

1.2 Created note summarization dialog

When users right-click on a note, a new dialog pops up. It lets users control the length of the summary, the choice of algorithm, and the content of the summary, as shown in the video.

1.3 Created notebook summarization dialog

When users right-click on a notebook, a new dialog pops up. It lets users control the length of the summary, the choice of algorithm, and the number of notes to be summarised.

Users have the option to:

Summarise only the notes directly inside the selected notebook (immediate children only)
Summarise all notes in a selected notebook
(in the future: have an option to allow users to pick which notes to summarise in a notebook)
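
The two scopes above can be sketched roughly as follows. This is only an illustration over in-memory data: the `Note` and `Notebook` shapes, the `Scope` names, and `collectNotes` are hypothetical, and it assumes that "all notes" also covers notes in nested sub-notebooks.

```typescript
// Hypothetical in-memory shapes, not the plugin's actual data model.
interface Note { id: string; parentId: string; title: string; }
interface Notebook { id: string; parentId: string; title: string; }

type Scope = 'immediateChildren' | 'allNotes';

function collectNotes(
	notebookId: string,
	scope: Scope,
	notebooks: Notebook[],
	notes: Note[],
): Note[] {
	const notebookIds = new Set<string>([notebookId]);

	if (scope === 'allNotes') {
		// Breadth-first walk that adds every nested sub-notebook
		// (assumption: "all notes" includes sub-notebooks).
		const queue: string[] = [notebookId];
		while (queue.length > 0) {
			const current = queue.shift() as string;
			for (const nb of notebooks) {
				if (nb.parentId === current && !notebookIds.has(nb.id)) {
					notebookIds.add(nb.id);
					queue.push(nb.id);
				}
			}
		}
	}

	// Keep the notes whose parent is one of the collected notebooks.
	return notes.filter(note => notebookIds.has(note.parentId));
}
```

With `'immediateChildren'` only the selected notebook's own notes are returned, while `'allNotes'` walks the sub-notebooks first.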

In the upcoming weeks, I will implement a feature that lets users click on the individual notes they want to summarise and edit them one by one if they wish.

Ideas:
-> project-tree like structure to visualise notes and notebooks in notebook dialog
-> or folder-like structure to visualise notes and notebooks in notebook dialog

AI

1.4 Dimensionality reduction

In the weekly updates post, a few users (@BioFacLay and @Imperial_Squid) recommended applying dimensionality reduction to the word2vec embeddings, since it can improve clustering performance on high-dimensional data.

There are several dimensionality reduction techniques:

UMAP (umap-js - npm)
t-SNE (tsne-js - npm)
PCA (GitHub - mljs/pca: Principal component analysis)

It is good to understand each of the dimensionality reduction techniques and their weaknesses.
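
To make the idea concrete, here is a minimal hand-rolled PCA sketch using power iteration with deflation. It is not the API of any of the libraries listed above, and the names (`pcaReduce`, `topComponent`, and so on) are made up for illustration. It assumes the embeddings are plain arrays of numbers of equal length.

```typescript
type Vector = number[];

function dot(a: Vector, b: Vector): number {
	let sum = 0;
	for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
	return sum;
}

function normalise(v: Vector): Vector {
	const norm = Math.sqrt(dot(v, v));
	return norm === 0 ? v : v.map(x => x / norm);
}

// Subtract the per-dimension mean so the data is centred on the origin.
function centre(data: Vector[]): Vector[] {
	const dims = data[0].length;
	const mean = new Array<number>(dims).fill(0);
	for (const row of data) {
		for (let j = 0; j < dims; j++) mean[j] += row[j] / data.length;
	}
	return data.map(row => row.map((v, j) => v - mean[j]));
}

// Dominant direction of centred data, found by power iteration on X^T X
// without building the covariance matrix explicitly.
function topComponent(centred: Vector[], iterations = 200): Vector {
	const dims = centred[0].length;
	// Deterministic, non-uniform start vector (an arbitrary choice).
	let w = normalise(centred[0].map((_, j) => 1 + j * 0.01));
	for (let it = 0; it < iterations; it++) {
		const next = new Array<number>(dims).fill(0);
		for (const row of centred) {
			const projection = dot(row, w);
			for (let j = 0; j < dims; j++) next[j] += projection * row[j];
		}
		w = normalise(next);
	}
	return w;
}

// Project the data onto its first nComponents principal directions.
function pcaReduce(data: Vector[], nComponents: number): Vector[] {
	const centred = centre(data);
	let residual = centred;
	const components: Vector[] = [];
	for (let k = 0; k < nComponents; k++) {
		const w = topComponent(residual);
		components.push(w);
		// Deflation: remove the direction just found before the next round.
		residual = residual.map(row => {
			const projection = dot(row, w);
			return row.map((v, j) => v - projection * w[j]);
		});
	}
	return centred.map(row => components.map(w => dot(row, w)));
}
```

For example, `pcaReduce(embeddings, 2)` would turn each high-dimensional vector into a 2-dimensional one before clustering. One weakness worth noting is that PCA only captures linear structure, which is part of why UMAP and t-SNE are also on the list.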

1.5 More clustering algorithms

Furthermore, it was also recommended that I explore more clustering algorithms (especially density-based and hierarchical ones):

HDBSCAN
Mean Shift (used in computer vision, might not be good for textual data)
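
As a rough feel for the density-based family, below is a sketch of plain DBSCAN, the simpler algorithm that HDBSCAN builds on. It is not HDBSCAN itself, and the function names and the `eps` / `minPts` values are illustrative assumptions only.

```typescript
type Point = number[];

const NOISE = -1;
const UNVISITED = -2;

function euclidean(a: Point, b: Point): number {
	let sum = 0;
	for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
	return Math.sqrt(sum);
}

// Indices of all points within eps of points[index] (including itself).
function neighbours(points: Point[], index: number, eps: number): number[] {
	const result: number[] = [];
	for (let i = 0; i < points.length; i++) {
		if (euclidean(points[index], points[i]) <= eps) result.push(i);
	}
	return result;
}

// Returns one label per point: a cluster index (0, 1, ...) or NOISE (-1).
function dbscan(points: Point[], eps: number, minPts: number): number[] {
	const labels = new Array<number>(points.length).fill(UNVISITED);
	let cluster = 0;

	for (let i = 0; i < points.length; i++) {
		if (labels[i] !== UNVISITED) continue;

		const seeds = neighbours(points, i, eps);
		if (seeds.length < minPts) {
			labels[i] = NOISE; // may later be claimed as a border point
			continue;
		}

		// points[i] is a core point: start a new cluster and expand it.
		labels[i] = cluster;
		const queue = seeds.filter(j => j !== i);
		while (queue.length > 0) {
			const j = queue.shift() as number;
			if (labels[j] === NOISE) labels[j] = cluster; // border point
			if (labels[j] !== UNVISITED) continue;
			labels[j] = cluster;
			const reachable = neighbours(points, j, eps);
			if (reachable.length >= minPts) queue.push(...reachable);
		}
		cluster++;
	}
	return labels;
}
```

Unlike k-means, no cluster count is given up front, and isolated sentences end up as noise instead of being forced into a cluster. HDBSCAN goes further by not requiring a single fixed `eps`.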

1.6 Topic Modelling

A technique for uncovering hidden thematic structures in a large collection of documents.
-> it helps identify and categorise the main topics in a text (e.g., LDA). Might be useful?

Sources:

Universal Web Worker in Joplin

In the past, there was a problem with running Tesseract.js, which uses .wasm instances, in a plugin. It was resolved by integrating Tesseract.js into the main Joplin app as a web worker:

github.com/laurent22/joplin/blob/62d514463c92e9828f2b6879cd5382f698251dc2/packages/app-desktop/app.ts#L356-L382

    private setupOcrService() {
        if (Setting.value('ocr.enabled')) {
            if (!this.ocrService_) {
                // eslint-disable-next-line @typescript-eslint/no-explicit-any -- Old code before rule was applied
                const Tesseract = (window as any).Tesseract;

                const driver = new OcrDriverTesseract(
                    { createWorker: Tesseract.createWorker },
                    `${bridge().buildDir()}/tesseract.js/worker.min.js`,
                    `${bridge().buildDir()}/tesseract.js-core`,
                );

                this.ocrService_ = new OcrService(driver);
            }

            void this.ocrService_.runInBackground();
        } else {
            if (!this.ocrService_) return;
            void this.ocrService_.stopRunInBackground();
        }

This file has been truncated.

Since the Bonding period, I have had a problem running Transformers.js and ONNX in a plugin. Both rely on .wasm instances, so the approach of creating another web worker in the main app might work.
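
A minimal sketch of what a request/response flow around such a worker could look like is shown below. Everything here is hypothetical: `WorkerLike`, `SummarizerClient`, the message shapes, and `FakeWorker` are made-up names, and this is not how Joplin or Tesseract.js implement their workers. The fake worker only stands in for a real one so the flow can be tried without loading a model.

```typescript
// Structural type covering only the parts of a worker this sketch relies on.
interface WorkerLike {
	postMessage(message: unknown): void;
	onmessage: ((event: { data: unknown }) => void) | null;
}

// Hypothetical message shapes exchanged with the worker.
interface SummaryRequest { id: number; text: string; }
interface SummaryResponse { id: number; summary: string; }

// Wraps the message passing in a promise-based call, matching each
// response to its request through a numeric id.
class SummarizerClient {
	private nextId = 0;
	private pending = new Map<number, (summary: string) => void>();

	constructor(private worker: WorkerLike) {
		worker.onmessage = event => {
			const response = event.data as SummaryResponse;
			const resolve = this.pending.get(response.id);
			if (!resolve) return; // unknown or already handled id
			this.pending.delete(response.id);
			resolve(response.summary);
		};
	}

	summarize(text: string): Promise<string> {
		const id = this.nextId++;
		return new Promise<string>(resolve => {
			this.pending.set(id, resolve);
			const request: SummaryRequest = { id, text };
			this.worker.postMessage(request);
		});
	}
}

// Stand-in worker: "summarises" by returning the first sentence.
// A real worker would run the model (e.g. via Transformers.js) instead.
class FakeWorker implements WorkerLike {
	onmessage: ((event: { data: unknown }) => void) | null = null;

	postMessage(message: unknown): void {
		const request = message as SummaryRequest;
		const response: SummaryResponse = {
			id: request.id,
			summary: request.text.split('.')[0] + '.',
		};
		// Reply asynchronously, as a real worker would.
		setTimeout(() => {
			if (this.onmessage) this.onmessage({ data: response });
		}, 0);
	}
}
```

If the flow were generalised for other contributors, the client side could stay the same while the worker side swaps in a different model or library.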

Evaluation research / survey

This is still under discussion with @Daeraxa. We will come to a conclusion soon and upload the survey in Week 7.

Others

added logs with electron-log
if we are using unsupervised methods for extractive summarization, there is no need for UI pop-up windows to notify users about the summarization process. Those might be useful for abstractive summarization and for summarizing multiple notes or notebooks.

2. Plans

Firstly

creating another dialog that opens when right-clicking on 'Summarise the note' in EditorContextMenu with text selected in the editor.
in the notebook dialog, visualize all the notes and notebooks so that users can click on them and edit them
- check how long it takes to summarise all the notes in a notebook and store the summaries in a JSON file -> a UI pop-up window that notifies users about the summarization process might be useful
improving styles for dialogs
update [PENDING from week 6] GitHub README -> and then release the plugin
finish and submit midterm evaluation
uploading the survey for evaluation research

Afterwards

looking into creating a minimal web worker in Joplin to use Transformers.js, ONNX, etc.
- figure out whether we can generalise 'web worker' flow for other contributors
- it seems that Tesseract.js has a built-in web worker. Therefore, we have to create our own 'web worker' flow
looking into dimensionality reduction and HDBSCAN and other clustering algorithms to understand the theory

3. Problems

[RESOLVED] Problem of getting the dialog to communicate with the plugin, since there are no onMessage and postMessage events for Joplin plugin dialogs. Other users have come across this problem before: Dialog <-> Plugin Communication
- Solution: pre-populate summaries in JSON file and create batch predictions to improve the performance
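
A rough sketch of the pre-population idea: compute the summary for every combination of note, algorithm, and length up front, and serialise the result as JSON so the dialog only has to look values up. The names here (`buildSummaryCache`, `cacheKey`, `leadSentences`) and the key format are assumptions for illustration, and the placeholder summarizer just takes the first sentences instead of running a real algorithm.

```typescript
// A summarizer takes the note text and a target number of sentences.
type Summarizer = (text: string, sentenceCount: number) => string;

interface SummaryCache { [key: string]: string; }

// Hypothetical key format: one entry per note, algorithm and length.
function cacheKey(noteId: string, algorithm: string, length: number): string {
	return `${noteId}:${algorithm}:${length}`;
}

// Computes every combination in one batch so nothing has to be
// requested from the plugin while the dialog is open.
function buildSummaryCache(
	notes: { id: string; body: string }[],
	summarizers: Record<string, Summarizer>,
	lengths: number[],
): SummaryCache {
	const cache: SummaryCache = {};
	for (const note of notes) {
		for (const algorithm of Object.keys(summarizers)) {
			for (const length of lengths) {
				const summary = summarizers[algorithm](note.body, length);
				cache[cacheKey(note.id, algorithm, length)] = summary;
			}
		}
	}
	return cache;
}

// Placeholder summarizer for the sketch: keeps the first N sentences.
const leadSentences: Summarizer = (text, sentenceCount) =>
	(text.match(/[^.]+\./g) || [])
		.slice(0, sentenceCount)
		.map(sentence => sentence.trim())
		.join(' ');
```

The JSON string from `JSON.stringify(cache)` is what would be stored, and the dialog side would read it back with `JSON.parse` and look up `cacheKey(noteId, algorithm, length)` whenever the user changes an option. The trade-off is more work up front in exchange for instant updates in the dialog.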

muzak · 8 July 2024 17:42

Good progress on this feature! Here are some ideas:

Features:

Explore a persistent panel option like the Search & Replace plugin for comparison with original note and/or repeated summarizing
Add text input for new note summary title
Add options to create a new note with summarized note or original note text in addition to in place summarizing
Make Ctrl clicking multiple notes prefill the multi note select option

Visualizing contents in notebook dialog:

The Inline Tag Navigator plugin uses badges.
The VS Code-style Search plugin uses a folder-like structure.
This article shows a number of different approaches (including the above): Improving the usability of multi-selecting from a long list | by Zina Szőgyényi | Tripaneer Techblog | Medium

Layout:

Add overall title for feature at top of interface
Move selected note or notebook name below overall title to give room for long names
For tall enough viewports or a persistent panel, move note summary section under options to reduce dead space

Other:

Expose keyboard shortcuts for all commands and show them in menu items
Check if or how multi note select with Ctrl clicking would work in All notes

HahaBill · 8 July 2024 18:25

Hi! Thank you so much for your input! Those links and ideas are super helpful!
