Tech Spec: Generic Web Worker in Joplin

1. Overview

Web workers allow concurrent processing in JavaScript by running scripts in separate threads. Their advantage is that heavy computations can run in the background without blocking the application's main thread.

In this tech spec, we introduce an implementation of a generic web worker in Joplin that is easily extendable and library-agnostic.

NOTE: This is not the final solution, so feel free to give your input.
FROM THIS TOPIC: Implementing a Web Worker functionality in Joplin

2. Problem description

The main problem is running .wasm instances or using node-loader in a Joplin plugin. For the AI Summarization project, the problem lies in using Transformers.js, the JavaScript implementation of Hugging Face's Python transformers library.

3. Solution

In the past, this problem was solved for OCR by running Tesseract.js in a web worker fed by a task queue.

Whenever users upload resources (.pdf, .jpeg, etc.), a message containing the resource ID is pushed to the task queue, and the resource is processed by OCR when the queue is polled.

Inspired by that established solution, we can create a generic web worker template that we can extend, override and modify. The idea is that a Worker Controller Manager initialises, checks and terminates worker controllers. Each worker controller has its own worker, which it sends data to and receives data from. Furthermore, it pushes request messages from plugins or the application into its own task queue. Depending on the concurrency setting, the task queue is eventually polled and the messages are processed in callback functions, which send the requests to the worker.
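To make the queue-plus-worker idea concrete, here is a minimal sketch of a concurrency-limited queue feeding a worker. It does not use Joplin's TaskQueue: `SimpleTaskQueue`, `SketchWorkerController` and `WorkerLike` are hypothetical names made up for illustration, and the concurrency of 2 is just an example value.

```typescript
// Minimal stand-in for the part of the Worker interface used here (assumption).
interface WorkerLike {
	postMessage(message: unknown): void;
}

type Task = () => Promise<void>;

// Hypothetical, simplified queue: runs at most `concurrency` tasks at a time.
class SimpleTaskQueue {
	private waiting: Task[] = [];
	private running = 0;
	private concurrency: number;

	public constructor(concurrency: number) {
		this.concurrency = concurrency;
	}

	// Queues a task; the returned promise settles when the task has finished.
	public push(task: Task): Promise<void> {
		return new Promise<void>((resolve, reject) => {
			this.waiting.push(() => task().then(resolve, reject));
			this.poll();
		});
	}

	// Starts waiting tasks until the concurrency limit is reached.
	private poll(): void {
		while (this.running < this.concurrency && this.waiting.length > 0) {
			const next = this.waiting.shift() as Task;
			this.running++;
			void next().then(() => {
				this.running--;
				this.poll();
			});
		}
	}
}

// Hypothetical controller: each request becomes a queued task that posts to the worker.
class SketchWorkerController {
	private queue = new SimpleTaskQueue(2);
	private worker: WorkerLike;

	public constructor(worker: WorkerLike) {
		this.worker = worker;
	}

	public request(id: string, payload: unknown): Promise<void> {
		return this.queue.push(async () => {
			this.worker.postMessage({ id, payload });
		});
	}
}
```

In this sketch a task counts as finished as soon as the message has been posted. A real controller would probably keep the task open until the worker has replied, so that the concurrency limit applies to the actual processing.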

4. Technical Solution

4.1 Flow Diagram

4.2 Worker Controller Template

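// BaseWorkerController.ts

import TaskQueue from '<PATH>/TaskQueue';
import Logger from '@joplin/utils/Logger';

// Template for worker controllers. Subclasses create the worker and override the methods below.
export default class BaseWorkerController {
	// Protected rather than private, so that subclasses can use them.
	protected workerPath: string;
	protected workerControllerTaskQueue: TaskQueue;
	protected logger;

	public constructor(workerPath: string, taskQueueName: string, loggerName = 'JoplinWorker') {
		// Only subclasses may be instantiated.
		if (new.target === BaseWorkerController) {
			throw new Error('Cannot instantiate an abstract class.');
		}
		this.workerPath = workerPath;
		this.logger = Logger.create(loggerName);
		this.workerControllerTaskQueue = new TaskQueue(taskQueueName, this.logger);
	}

	// Pushes a request message onto the task queue.
	public async addMessageTaskQueue(..._args: unknown[]): Promise<void> {
		throw new Error('Not implemented');
	}

	public async terminateWorker(): Promise<void> {
		throw new Error('Not implemented');
	}

	// Builds the callback that the task queue runs for one message.
	protected makeQueueAction(..._args: unknown[]): () => Promise<void> {
		throw new Error('Not implemented');
	}

	// Registers the handler for data coming back from the worker.
	protected receiveWorkerData(): void {
		throw new Error('Not implemented');
	}
}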

4.3 Worker Controller Manager

// WorkerControllerManager.ts

import Setting from '@joplin/lib/models/Setting';

class WorkerControllerManager {
	private static instance_: WorkerControllerManager = null;

	// Private constructor: use WorkerControllerManager.instance() instead of `new`.
	private constructor() {}

	public static instance() {
		if (!this.instance_) this.instance_ = new WorkerControllerManager();
		return this.instance_;
	}

	// Starts every worker controller that is enabled in the settings.
	public async start() {
		const workerControllers: string[] = Setting.value('workerControllers');
		for (const currWorkerController of workerControllers) {
			if (Setting.value(`${currWorkerController}.enabled`)) {
				this.startJoplinWorker(currWorkerController);
			}
		}
	}

	private startJoplinWorker(workerName: string) {
		switch (workerName) {
			case '<YOUR_WEB_WORKER>': {
				// const example = ExampleWorkerController.instance(...);
				break;
			}
			default: {
				throw new Error(`Unknown worker: ${workerName}`);
			}
		}
	}
}

4.4 Example of a Custom Worker Controller

// TransformersWorkerController.ts

import BaseWorkerController from '<PATH>/BaseWorkerController';

class TransformersWorkerController extends BaseWorkerController {

	public readonly name: string = 'TransformersWorkerController';
	private static workerControllerInstance: TransformersWorkerController = null;
	private transformersWorker: Worker;

	// Private constructor: use TransformersWorkerController.instance() instead of `new`.
	private constructor(workerPath: string, taskQueueName: string, loggerName = 'JoplinWorker') {
		super(workerPath, taskQueueName, loggerName);
		this.workerControllerTaskQueue.setConcurrency(2);
		this.workerControllerTaskQueue.keepTaskResults = false;
		this.transformersWorker = new Worker(workerPath);
		this.receiveWorkerData();
	}

	// The arguments are only used on the first call, when the singleton is created.
	public static instance(workerPath: string, taskQueueName: string, loggerName = 'JoplinWorker') {
		if (this.workerControllerInstance) return this.workerControllerInstance;
		this.workerControllerInstance = new TransformersWorkerController(workerPath, taskQueueName, loggerName);
		return this.workerControllerInstance;
	}

	public async addMessageTaskQueue(id: string, dataType: string, predictType: string, objectData: any) {
		await this.workerControllerTaskQueue.pushAsync(id, this.makeQueueAction(id, dataType, predictType, objectData));
	}

	// Returns the callback that the task queue runs when it reaches this message.
	protected makeQueueAction(id: string, dataType: string, predictType: string, objectData: any) {
		return async () => {
			this.transformersWorker.postMessage({ id, dataType, predictType, objectData });
		};
	}

	protected receiveWorkerData() {
		this.transformersWorker.onmessage = (event: MessageEvent) => {
			// More code implementations...
		};
	}

	public async terminateWorker() {
		this.transformersWorker.terminate();
	}

}

4.5 Sending requests to Worker Controllers

For the AI summarisation project, we can use the userData attribute of the Joplin Data API, which lets us store metadata about notes or notebooks. For example, a request is sent to the worker controllers when predictType equals 'summarisation', and after processing the value can be set back to null or an empty string.
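A rough sketch of that flow could look like the following. It is written against a hypothetical `UserDataStore` interface rather than the real Data API, so it does not show the exact calls a plugin would make. The key name `predictType`, the `dataType` value `'note'` and the `send` callback (standing in for a worker controller) are assumptions made for illustration.

```typescript
// Hypothetical storage interface; in a plugin it could wrap the Data API's userData calls.
interface UserDataStore {
	get(noteId: string, key: string): Promise<string | undefined>;
	set(noteId: string, key: string, value: string): Promise<void>;
}

// Shape of a request; the field names follow the controller example in section 4.4.
interface WorkerRequest {
	id: string;
	dataType: string;
	predictType: string;
	objectData: unknown;
}

const PREDICT_TYPE_KEY = 'predictType'; // hypothetical userData key

// Reads the pending predictType for a note, hands a request to `send`,
// then resets the flag to an empty string once `send` has finished.
async function processPendingRequest(
	store: UserDataStore,
	noteId: string,
	noteBody: string,
	send: (request: WorkerRequest) => Promise<void>,
): Promise<boolean> {
	const predictType = await store.get(noteId, PREDICT_TYPE_KEY);
	if (!predictType) return false; // nothing requested for this note

	await send({ id: noteId, dataType: 'note', predictType, objectData: noteBody });
	await store.set(noteId, PREDICT_TYPE_KEY, '');
	return true;
}
```

Resetting the flag only after `send` resolves means a failed request stays marked as pending and can be retried later.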

5. Testing Plan

We can use the existing testing framework that Joplin uses, namely Jest. We can test the flow in the diagram in section 4.1.


Thanks for the spec, it makes sense. Could you also please put some code example of how this plugin API would be used?


Alright, I will!!

On desktop, it should be possible to create a web worker with the existing plugin API using something similar to

const worker = new Worker((await joplin.plugins.installationDir()) + '/path/to/worker.js');

where path/to/worker.js is generated from a TypeScript file included in the extraScripts entrypoint list.
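Once such a worker exists, the plugin still needs a way to match replies to requests. Below is a minimal sketch of one possible approach, assuming every message carries an `id` that the worker echoes back. `WorkerClient`, `WorkerLike` and the `{ id, payload }` / `{ id, result }` message shapes are hypothetical and not part of the plugin API.

```typescript
// Subset of the Worker interface used below, so the sketch can also run against a fake.
interface WorkerLike {
	postMessage(message: unknown): void;
	onmessage: ((event: { data: any }) => void) | null;
}

// Assumed reply shape: the worker echoes the request id next to its result.
interface WorkerResponse {
	id: string;
	result: unknown;
}

// Hypothetical helper: matches each response to its request through the `id` field.
class WorkerClient {
	private pending = new Map<string, (result: unknown) => void>();
	private worker: WorkerLike;

	public constructor(worker: WorkerLike) {
		this.worker = worker;
		this.worker.onmessage = (event) => {
			const response = event.data as WorkerResponse;
			const resolve = this.pending.get(response.id);
			if (!resolve) return; // unknown or already handled id
			this.pending.delete(response.id);
			resolve(response.result);
		};
	}

	// Posts a message and resolves once the response with the same id arrives.
	public request(id: string, payload: unknown): Promise<unknown> {
		return new Promise((resolve) => {
			this.pending.set(id, resolve);
			this.worker.postMessage({ id, payload });
		});
	}
}
```

On desktop, the object passed to `WorkerClient` would be the worker created above, possibly through a small adapter. Timeouts and error replies are left out to keep the sketch short.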

I'll try to create an example plugin and report back. Edit: Example plugin and a related pull request.

Note: While the above approach should work on desktop, mobile plugins are run within about:srcdoc iframes. This may prevent them from loading workers in the installationDir. As such, for mobile and web, an API similar to the one presented above would be necessary.


Amazing! Thank you so much for giving me a hand and testing the worker in a plugin! I really appreciate that!

I've been trying to use your implementation to run Transformers.js in my plugin, but I get this error:

plugin_org.joplinapp.plugins.AISummarisation.js:2 Uncaught TypeError: n.basename is not a function
    at Object.isDev (plugin_org.joplinapp.plugins.AISummarisation.js:2:401889)
    at e (plugin_org.joplinapp.plugins.AISummarisation.js:2:403177)
    at 20329 (plugin_org.joplinapp.plugins.AISummarisation.js:2:403812)
    at o (plugin_org.joplinapp.plugins.AISummarisation.js:2:11065329)
    at 22958 (plugin_org.joplinapp.plugins.AISummarisation.js:2:1742864)
    at o (plugin_org.joplinapp.plugins.AISummarisation.js:2:11065329)
    at 28156 (plugin_org.joplinapp.plugins.AISummarisation.js:2:1733660)
    at o (plugin_org.joplinapp.plugins.AISummarisation.js:2:11065329)
    at plugin_org.joplinapp.plugins.AISummarisation.js:2:11066077
    at plugin_org.joplinapp.plugins.AISummarisation.js:2:11066091

I think it is related to this:

{
  "extraScripts": ["ui/panel_react/index.tsx", "worker.ts"],
  "webpackOverrides": {
    "target": "web"
  }
}

with


const baseConfig = {
  mode: "production",
  target: "node",
  stats: "errors-only",
  module: {
    rules: [
      {
        test: /\.tsx?$/,
        use: "ts-loader",
        exclude: /node_modules/,
      },
    ],
  },
  ...userConfig.webpackOverrides,
};

However, your approach gave me the idea of doing word2vec!! I will try it; it should be easier since it does not depend on node-loader. More to come on ways to use a child process to execute C programs.

Setting target: "web" for just the extraScripts might also work (if NodeJS libraries are needed for the main script). I think this would be done in the extraScriptsConfig object in webpack.config.js.
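To illustrate the difference between the two options, here is a small sketch of how the spread order in the baseConfig quoted earlier decides which target wins. The `BundleConfig` type and `mergeConfig` function are simplified stand-ins invented for this example. The name `extraScriptsConfig` is taken from the suggestion above and has not been checked against the actual webpack.config.js.

```typescript
// Simplified config shape for illustration; real webpack configs have many more fields.
interface BundleConfig {
	mode: string;
	target: string;
}

// Same idea as the spread in baseConfig: the overrides come last, so they win.
function mergeConfig(defaults: BundleConfig, overrides: Partial<BundleConfig>): BundleConfig {
	return { ...defaults, ...overrides };
}

const defaults: BundleConfig = { mode: 'production', target: 'node' };

// Option 1: putting target in webpackOverrides changes it for every bundle,
// including the main plugin script.
const baseWithOverride = mergeConfig(defaults, { target: 'web' });

// Option 2: leave the base config alone and override only the extra scripts.
const mainConfig = mergeConfig(defaults, {});
const extraScriptsConfig = mergeConfig(defaults, { target: 'web' });
```

With option 2 the main script keeps target "node", so it can still use NodeJS libraries, while the worker bundle is built for the web.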

Note: Be careful using this as a permanent solution! Modifying webpack.config.js can make it difficult to update the plugin framework.


It works!!

Q1: However, I always need to download .wasm files with copyAssets.js. I am thinking about downloading them beforehand and using CopyPlugin. Or is it possible to efficiently execute copyAssets.js after bundling?

Q2: Why does modifying webpack.config.js make it difficult to update the plugin framework? Is it because it gets overridden once there is a new webpack.config.js?

I don't think this will be the permanent solution, though, since I plan to create a generic web worker to load the Transformers.js and other libraries in the app. I still need to figure out how to best implement the plugin API for that so developers can easily create their own web workers.

Anyway, thank you for providing your implementation!! I really appreciate that!!
