I looked at how OCR with Tesseract works. Correct me if I am wrong:
At a high level, I think it initialises the web worker that Tesseract already provides, then waits for the user to upload a resource (.jpg, .pdf, etc.). Once a resource is uploaded, the worker puts the resource's ID into the task queue. That is how the asynchrony is achieved: OCR runs on a resource when it is pulled from the queue.
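To make my mental model concrete, here is a rough TypeScript sketch of the flow as I understand it. Every name in it (`OcrQueue`, `enqueue`, `drain`, the `recognize` callback) is made up for illustration and is not the actual implementation or Tesseract's API; the real recognition call is stubbed out as a function passed in from outside.

```typescript
// Hypothetical sketch: none of these names come from Tesseract or the app.
type Recognize = (resourceId: string) => Promise<string>;

class OcrQueue {
  private pending: string[] = [];
  private running = false;
  private recognize: Recognize;
  readonly results = new Map<string, string>();

  constructor(recognize: Recognize) {
    this.recognize = recognize;
  }

  // Called when a resource is uploaded: only its ID is queued, nothing runs yet.
  enqueue(resourceId: string): void {
    this.pending.push(resourceId);
  }

  size(): number {
    return this.pending.length;
  }

  // Pulls IDs one at a time and runs the (stubbed) OCR call on each.
  async drain(): Promise<void> {
    if (this.running) return; // another drain is already processing the queue
    this.running = true;
    try {
      while (this.pending.length > 0) {
        const id = this.pending.shift() as string;
        this.results.set(id, await this.recognize(id));
      }
    } finally {
      this.running = false;
    }
  }
}
```

In this sketch the upload handler would only call `enqueue(id)`, and something else (a timer or an idle callback, which is an assumption on my part) would call `drain()` later.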
I am planning to implement a general-purpose web worker that would work not only for Transformers.js but also for other future packages that cannot be loaded in the plugin.
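As a rough idea of what I mean by "general", the worker could route messages by task name, so a new package only needs a new handler. This is just a sketch under my own assumptions: the message shapes, `dispatch`, and the handler map are all hypothetical, and no Transformers.js calls are shown.

```typescript
// Hypothetical message envelope; field names are illustrative only.
type WorkerRequest = { id: number; task: string; payload: unknown };
type WorkerResponse =
  | { id: number; ok: true; result: unknown }
  | { id: number; ok: false; error: string };
type Handler = (payload: unknown) => Promise<unknown>;

// Looks up the handler registered for the requested task and runs it.
// Failures are returned as data so the caller can match them to the request id.
async function dispatch(
  handlers: Map<string, Handler>,
  req: WorkerRequest,
): Promise<WorkerResponse> {
  const handler = handlers.get(req.task);
  if (!handler) {
    return { id: req.id, ok: false, error: `Unknown task: ${req.task}` };
  }
  try {
    return { id: req.id, ok: true, result: await handler(req.payload) };
  } catch (e) {
    const message = e instanceof Error ? e.message : String(e);
    return { id: req.id, ok: false, error: message };
  }
}

// Inside a real worker this would be wired up roughly as:
//   self.onmessage = async (ev) => self.postMessage(await dispatch(handlers, ev.data));
```

Each package (Transformers.js or anything later) would register one handler in the map, and the plugin side would only need to know the task name and the payload format.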
- Is there another way we can send note content from the plugin into the web app?
- If there is no other way, what do you think would be the best approach?
- To whoever implemented the OCR/Tesseract integration: is my understanding of the overall system correct?