Generic Web Worker in Joplin
1. Overview
Web workers allow concurrent processing in JavaScript by running scripts in threads separate from the main thread. Their advantage is that heavy computations can run in the background without blocking the application's main thread.
This tech spec introduces an implementation of a generic web worker in Joplin that is easily extensible and library-agnostic.
NOTE: This is not the final solution; feel free to give your input.
FROM THIS TOPIC: Implementing a Web Worker functionality in Joplin
2. Problem description
The main problem is running .wasm instances or using node-loader in a Joplin plugin. For the AI Summarization project, the problem lies in using Transformers.js, the JavaScript implementation of HuggingFace's Python transformers library.
3. Solution
In the past, this problem was solved for OCR using a web worker and a task queue with Tesseract.js.
It works as follows: whenever a user uploads a resource (.pdf, .jpeg, etc.), a message containing the resource ID is pushed to the task queue, and the resource is processed by OCR when the queue is polled.
Inspired by this established solution, we can create a generic web worker template that can be extended, overridden and modified. The idea is that a Worker Controller Manager initialises, checks and terminates worker controllers. Each worker controller owns a worker that it sends data to and receives data from. It also pushes request messages from plugins or the application into its own task queue. Depending on the configured concurrency, the task queue is eventually polled and each message is processed in a callback function, which sends the request to the worker.
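The polling behaviour described above can be sketched with a small stand-alone queue. This is only an illustration and not Joplin's TaskQueue: the names SimpleTaskQueue, QueueTask, push, running and pending are made up for this example, and the real class may behave differently.

```typescript
// A task is any function that returns a promise.
type QueueTask<T> = () => Promise<T>;

// Hypothetical queue that runs at most `concurrency` tasks at once.
class SimpleTaskQueue {
  private concurrency: number;
  private runningCount = 0;
  private waiting: Array<() => void> = [];

  constructor(concurrency: number) {
    this.concurrency = concurrency;
  }

  get running(): number { return this.runningCount; }
  get pending(): number { return this.waiting.length; }

  // Queues a task and resolves with its result once it has run.
  push<T>(task: QueueTask<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = () => {
        // The slot is taken immediately; the task body runs on the next microtask.
        this.runningCount++;
        const done = () => {
          this.runningCount--;
          this.poll();
        };
        Promise.resolve().then(task).then(
          (value) => { done(); resolve(value); },
          (error) => { done(); reject(error); },
        );
      };
      this.waiting.push(start);
      this.poll();
    });
  }

  // Starts waiting tasks while there is a free slot.
  private poll(): void {
    while (this.runningCount < this.concurrency && this.waiting.length > 0) {
      const start = this.waiting.shift();
      if (start) start();
    }
  }
}

const queue = new SimpleTaskQueue(2);
for (const resourceId of ['res-1', 'res-2', 'res-3']) {
  void queue.push(async () => `processed ${resourceId}`);
}
// Two tasks hold a slot straight away; the third waits for one to finish.
console.log(queue.running, queue.pending); // logs: 2 1
```

In a worker controller, the task passed to push would be the callback that posts the request message to the worker.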
4. Technical Solution
4.1 Flow Diagram
4.2 Worker Controller Template
// BaseWorkerController.ts
import TaskQueue from '<PATH>/TaskQueue';
import Logger from '@joplin/utils/Logger';

// Declared abstract so that the compiler rejects direct instantiation,
// replacing the runtime new.target check.
export default abstract class BaseWorkerController {
  // protected rather than private, so that subclasses can use these members
  protected workerPath: string;
  protected workerControllerTaskQueue: TaskQueue;
  protected logger: ReturnType<typeof Logger.create>;

  public constructor(workerPath: string, taskQueueName: string, loggerName = 'JoplinWorker') {
    this.workerPath = workerPath;
    this.logger = Logger.create(loggerName);
    this.workerControllerTaskQueue = new TaskQueue(taskQueueName, this.logger);
  }

  // Queues a request message; each subclass defines the message fields.
  public abstract addMessageTaskQueue(...args: any[]): Promise<void>;

  // Stops the underlying worker.
  public abstract terminateWorker(): Promise<void>;

  // Builds the callback that the task queue runs for one message.
  protected abstract makeQueueAction(...args: any[]): () => Promise<void>;

  // Registers the handler for data sent back by the worker.
  protected abstract receiveWorkerData(): void;
}
4.3 Worker Controller Manager
// WorkerControllerManager.ts
// Joplin's settings model, imported under the name used in this spec.
import Settings from '@joplin/lib/models/Setting';

class WorkerControllerManager {
  public static instance: WorkerControllerManager;

  public constructor() {
    // Singleton: a second `new` returns the existing manager instead of recursing.
    if (WorkerControllerManager.instance) return WorkerControllerManager.instance;
    WorkerControllerManager.instance = this;
  }

  public async start() {
    const workerControllers: string[] = Settings.value('workerControllers');
    for (const currWorkerController of workerControllers) {
      if (Settings.value(`${currWorkerController}.enabled`)) {
        this.startJoplinWorker(currWorkerController);
      }
    }
  }

  private startJoplinWorker(workerName: string) {
    switch (workerName) {
      case '<YOUR_WEB_WORKER>': {
        // const example = new ExampleWorkerController();
        break;
      }
      default: {
        throw new Error(`Unknown worker: ${workerName}`);
      }
    }
  }
}
4.4 Example of a Custom Worker Controller
// TransformersWorkerController.ts
// Assumes BaseWorkerController is the default export of the file in section 4.2.
import BaseWorkerController from './BaseWorkerController';

class TransformersWorkerController extends BaseWorkerController {
  private name = 'TransformersWorkerController';
  private static workerControllerInstance: TransformersWorkerController | null = null;
  private transformersWorker: Worker;

  // Private: callers obtain the single controller through instance().
  private constructor(workerPath: string, taskQueueName: string, loggerName = 'JoplinWorker') {
    super(workerPath, taskQueueName, loggerName);
    this.workerControllerTaskQueue.setConcurrency(2);
    this.workerControllerTaskQueue.keepTaskResults = false;
    this.transformersWorker = new Worker(workerPath);
    this.receiveWorkerData();
  }

  // The arguments are only used the first time this is called.
  public static instance(workerPath: string, taskQueueName: string) {
    if (!this.workerControllerInstance) {
      this.workerControllerInstance = new TransformersWorkerController(workerPath, taskQueueName);
    }
    return this.workerControllerInstance;
  }

  public async addMessageTaskQueue(id: string, dataType: string, predictType: string, objectData: any) {
    await this.workerControllerTaskQueue.pushAsync(id, this.makeQueueAction(id, dataType, predictType, objectData));
  }

  // Returns the callback run by the task queue; it forwards the request to the worker.
  protected makeQueueAction(id: string, dataType: string, predictType: string, objectData: any) {
    return async () => {
      this.transformersWorker.postMessage({ id, dataType, predictType, objectData });
    };
  }

  protected receiveWorkerData() {
    // The Worker property is lowercase: onmessage, not onMessage.
    this.transformersWorker.onmessage = (event: MessageEvent) => {
      // More code implementations...
    };
  }

  public async terminateWorker() {
    this.transformersWorker.terminate();
  }
}
4.5 Sending requests to Worker Controllers
For the AI summarisation project, we can use the Joplin Data API together with the userData attribute, in which we can store metadata about notes or notebooks. For example, we can send a request to the worker controllers with predictType equal to 'summarisation', and after processing we can reset the value to null or an empty string.
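One possible reading of this request cycle is sketched below. It does not call the Joplin Data API: a plain object stands in for a note's userData, and the names NoteUserData, buildRequest and clearRequest, the 'predictType' key and the 'note' data type are all assumptions made for illustration. Only the message fields (id, dataType, predictType, objectData) come from section 4.4.

```typescript
// Shape of the message sent to a worker controller (fields as in section 4.4).
interface WorkerRequest {
  id: string;
  dataType: string;
  predictType: string;
  objectData: unknown;
}

// Hypothetical in-memory stand-in for a note's userData.
type NoteUserData = { [key: string]: string | null };

// Builds a request if the note's userData asks for processing, otherwise null.
function buildRequest(noteId: string, noteBody: string, userData: NoteUserData): WorkerRequest | null {
  const predictType = userData['predictType'];
  if (!predictType) return null; // null, undefined or '' means nothing to do
  return { id: noteId, dataType: 'note', predictType, objectData: noteBody };
}

// Returns a copy of the userData with the request flag reset after processing.
function clearRequest(userData: NoteUserData): NoteUserData {
  return { ...userData, predictType: null };
}

const userData: NoteUserData = { predictType: 'summarisation' };
const request = buildRequest('note-1', 'Some long note body', userData);
const afterProcessing = clearRequest(userData);
console.log(request, afterProcessing);
```

Resetting the flag to null means that a second pass over the same note produces no new request.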
5. Testing Plan
We can use the existing testing framework that Joplin uses, namely Jest. We can test the flow in the diagram in section 4.1.
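To exercise this flow without starting a real thread, one option is to hand the controller a fake worker. The sketch below assumes that a controller could accept an injected worker object, which the templates in section 4 do not yet support. FakeWorker and its respond callback are hypothetical names. Unlike a real Worker, the fake replies synchronously, which keeps the checks simple.

```typescript
type MessageHandler = (event: { data: unknown }) => void;

// Hypothetical test double exposing the parts of Worker used by a controller.
class FakeWorker {
  onmessage: MessageHandler | null = null;
  posted: unknown[] = [];
  terminated = false;
  private respond: (message: unknown) => unknown;

  constructor(respond: (message: unknown) => unknown) {
    this.respond = respond;
  }

  // Records the message and replies at once through onmessage.
  postMessage(message: unknown): void {
    if (this.terminated) return;
    this.posted.push(message);
    if (this.onmessage) this.onmessage({ data: this.respond(message) });
  }

  terminate(): void {
    this.terminated = true;
  }
}

const received: unknown[] = [];
const fake = new FakeWorker((message) => ({ echoed: message }));
fake.onmessage = (event) => { received.push(event.data); };
fake.postMessage({ id: 'note-1', predictType: 'summarisation' });
fake.terminate();
fake.postMessage({ id: 'note-2', predictType: 'summarisation' });
// received holds one reply; the message sent after terminate() was dropped.
```

In a Jest test, the same checks could be written with expect, for example expect(received).toHaveLength(1).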