File Uploader and OCR

@kellerjustin thanks for publishing your uploader! I'm currently giving it a try, and so far, it is very cool.

I have some questions, mainly regarding OCR - which is not part of your code, but maybe you have some hints for me :wink:

  • OCR on PDFs seems to work only on the first page. Is this intentional or a bug?
  • OCR appears to be more reliable with English text. I have installed tesseract-ocr-deu for German text recognition, but it does not seem to improve OCR when used with the file uploader. Do you know whether tesseract needs to be told the language before it runs OCR?
  • OCR of handwritten text is not very good. Are you, by any chance, aware of a way to get better handwriting recognition?

I would appreciate any ideas you have on my OCR questions :slight_smile:

Cheers,
Sebastian
