OCR Support workaround

Thanks, @CalebJohn, for clearing up my doubts. Based on that, I have put together a workaround for how OCR support should be implemented in Joplin.

I will discuss the cases with the user experience of OCR integration in Joplin in mind. Please share your views as well, as they will help me write a good proposal for the project.

About Tesseract

This is a pretty good library for OCR and supports more than 100 languages.

The time taken by the OCR process depends on the image:

=> Size of image

=> Quality of image

=> Font of image

=> No. of words in the image

So, basically the process may take from seconds to minutes depending on the image.

Now, based on our discussion, I have assumed a few things, and I plan to test the implementation of OCR support in Joplin accordingly, making sure it does not hamper the user experience.

Let us say I am writing a note and want to extract the text from an image and use it.

If we want the OCRed text to appear in the same text area in which we are writing the note, there are two cases:

Case 1: If the image is of good quality, with a good font size and few words, the OCR process will finish in seconds and the confidence level of the result will be good.

Case 2: If the image is of bad quality, with a small font size and a large number of words, the OCR process will take longer and the confidence level will be low, which means the OCRed text will contain lots of errors.

In Case 1, we can do it the way we discussed: upload the image and show “…” while it is being processed.

31c1c206bd2efc13059e498488f9560d.png

After processing, “…” is replaced with the OCRed text.

31c1c206bd2efc13059e498488f9560d.png
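A minimal sketch of this placeholder flow in TypeScript, to make the idea concrete. The helper names and the `[ocr:N]` suffix are hypothetical and not taken from Joplin's codebase; the suffix is only there so that several images can be pending in the same note at once.

```typescript
// Hypothetical helpers for illustration; not part of Joplin's API.

let nextPlaceholderId = 0;

// Build a unique marker so several images can be OCRed in the same note.
function makePlaceholder(): string {
  nextPlaceholderId += 1;
  return `…[ocr:${nextPlaceholderId}]`;
}

// Insert the marker at the cursor position; return the new body and the marker.
function insertPlaceholder(body: string, cursor: number): { body: string; marker: string } {
  const marker = makePlaceholder();
  return { body: body.slice(0, cursor) + marker + body.slice(cursor), marker };
}

// Swap the marker for the recognised text once OCR has finished.
// If the user deleted the marker in the meantime, the body is left unchanged.
function resolvePlaceholder(body: string, marker: string, ocrText: string): string {
  const index = body.indexOf(marker);
  if (index === -1) return body;
  return body.slice(0, index) + ocrText + body.slice(index + marker.length);
}
```

Looking the marker up again when the OCR finishes, instead of remembering a fixed offset, means the user can keep typing elsewhere in the note while the image is being processed.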

But this will not be helpful in Case 2, as it can take minutes to OCR the image and get the text, and we cannot be sure whether the OCRed text is correct.

If we look more closely at user behavior, it becomes clearer how OCR support should be implemented. I could be wrong, but this is what I found. We also want OCR support to work with multiple images. So what if we do it this way:

Check the video

This way we get a pretty good user experience, and it also improves the user's productivity when writing a note.

We simply have an icon that opens the OCR window. The user can add all the images there and copy-paste the OCRed text, and there is a dropdown to select the language. While the images are being OCRed, the user can keep writing the note and use the OCRed text once it is ready.
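A rough sketch of how the OCR window could work through several images, written in TypeScript. Everything here is an assumption for illustration: `Recognizer` stands in for whatever OCR backend is used (for example a Tesseract wrapper), and `runOcrQueue`, `OcrJob`, and `onUpdate` are hypothetical names, not existing Joplin code.

```typescript
// Hypothetical signature for the OCR backend; a real one would wrap Tesseract.
type Recognizer = (imagePath: string, lang: string) => Promise<{ text: string; confidence: number }>;

interface OcrJob {
  imagePath: string;
  status: 'pending' | 'done' | 'failed';
  text: string;
  confidence: number;
}

// Process the images one at a time and report progress after each one,
// so the window can show finished text while later images are still pending.
async function runOcrQueue(
  imagePaths: string[],
  lang: string,
  recognize: Recognizer,
  onUpdate: (jobs: OcrJob[]) => void = () => {},
): Promise<OcrJob[]> {
  const jobs = imagePaths.map((imagePath): OcrJob => ({
    imagePath,
    status: 'pending',
    text: '',
    confidence: 0,
  }));

  for (const job of jobs) {
    try {
      const result = await recognize(job.imagePath, lang);
      job.text = result.text;
      job.confidence = result.confidence;
      job.status = 'done';
    } catch {
      // A failed image should not stop the remaining ones.
      job.status = 'failed';
    }
    onUpdate(jobs);
  }
  return jobs;
}
```

The `lang` argument corresponds to the language dropdown, and `onUpdate` is where the window would refresh its list. This only keeps the editor responsive if the recognizer itself does its heavy work off the main thread, which is assumed here.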

Please do give me your suggestions


Why is this a workaround and not appended to the original discussion? (Sorry, I hate jumping between topics.)

Could you please tell me how I can append it?

A post was merged into an existing topic: GSoC Idea - OCR Support