File Uploader and OCR

Hi Justin,

That did the trick!

@bitbacchus Thanks for your help as well. I suppose your problem should be solved with python3.

Thanks for the quick responses and the OneNote deliberation :wink:

@kellerjustin Maybe you want to update the first post in this topic, because I think you’ve updated the script quite a bit since then. E.g., PDFs should work now, and the following is no longer true: “Seems the API only supports image files, not PDFs.” Or is it? Anyway, great script.

Thanks! Good idea, but it appears I can’t edit it…? I have an edit icon for recent posts but not for anything from June and older…

Ok, that’s weird. I will have to look into it; I’m not a Discourse expert… You should always be able to edit your own posts. Hmmm.

Ok, try it now; maybe you have to log out and log in again.

Whatever you did restored the edit functionality. Thanks! I updated the original post.

Any chance of running this on a Windows machine? :) How?

Step 1: erase your Windows partition and install your Linux distro of choice... j/k :smile:

The GitHub link has directions for Windows - it's a little trickier but it does work. The main difference is the initial requirement that you manually install poppler and tesseract. Make sure both are on your PATH so that their commands (for example, tesseract) can be run from any command prompt.
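
A quick way to check the PATH requirement is a small Python sketch like the one below. The executable names are assumptions: `tesseract` is the Tesseract command, and `pdftoppm` is one of the poppler utilities; whether rest_uploader calls that particular poppler tool is not confirmed in this thread.

```python
import shutil

# Assumed executable names: "tesseract" is the Tesseract CLI; "pdftoppm" is one
# of the poppler utilities. Adjust the list if your setup relies on other tools.
REQUIRED_TOOLS = ["tesseract", "pdftoppm"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools are on PATH.")
```

If a tool is reported as missing, open a new command prompt after changing the environment variables, since already-open prompts keep the old PATH.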

Yes, it does work well under Windows.

Is there any manual available for the windows setup?

@iknow79 2 comments above it literally says: The github link has directions for Windows

I was able to use rest_uploader to get a server-based one-button scan to Joplin. Please let me know how I can improve:

Thanks, Steve. The workflow you describe is very similar to the one I’ve used over the years - it’s why I built rest_uploader. I have a cheap network scanner, which uploads a scanned doc via FTP to a monitored directory on a computer running rest_uploader and Joplin. I’m not using joplin-cli, though I totally get what you’re saying. Unfortunately, I’m not aware of a more streamlined way to accomplish it, but someone else in this thread may have a better idea!

I also started using rest_uploader and find it very useful for my workflow.

Two things I would love to see:

  1. Update already existing notes.
    At least for *.txt files, it would be nice if the uploader recognized that the file behind an existing note has changed (via checksum or whatever) and uploaded that (modified) file again.
  2. Have some way to feed a note into an existing (sub)notebook somewhere in the notebook structure. As I understand it, currently all notes go into the “document root”.
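
To make the checksum idea in point 1 concrete, here is a minimal sketch. It is not part of rest_uploader; the function names and the in-memory `known_checksums` dictionary are hypothetical, and a real version would need to persist the checksums between runs.

```python
import hashlib
from pathlib import Path


def file_checksum(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def changed_files(directory, known_checksums, pattern="*.txt"):
    """Return files whose content differs from the recorded checksum.

    known_checksums maps file path strings to the digest seen at the last
    upload. It is updated in place, so an immediate second call reports
    nothing new.
    """
    changed = []
    for path in sorted(Path(directory).glob(pattern)):
        digest = file_checksum(path)
        if known_checksums.get(str(path)) != digest:
            known_checksums[str(path)] = digest
            changed.append(path)
    return changed
```

New files and modified files are both reported, so the caller would still have to decide whether to create a note or update an existing one.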

Hi resi,
Thanks for the feedback!
#1 - I don’t think it’d be terribly difficult to implement, but my concern would be if you’re a save-early-save-often person working on a file in the directory: you’d spam your upload notebook with a bunch of slightly modified files. Could get real ugly real quick.
#2 - You can change the upload notebook by modifying the settings.py file in the rest_uploader installation directory, which you should be able to find in the site-packages directory of your Python installation. In the future, this would be better as a command line option.

I’ve modified it to use the folder that has the note most recently tagged with ‘here’.
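
The modification itself was not posted, so the following is only a rough sketch of how such a lookup could work. It assumes the notes carrying the ‘here’ tag have already been fetched as a list of dicts (for example from the Joplin Data API), and it uses `updated_time` as a stand-in for "most recently tagged", since the note fields do not record when a tag was applied. The function name is hypothetical.

```python
def folder_of_latest_tagged_note(notes):
    """Return the parent_id of the most recently updated note, or None.

    `notes` is a list of dicts with "parent_id" and "updated_time" keys,
    which are standard fields of a Joplin note.
    """
    if not notes:
        return None
    latest = max(notes, key=lambda note: note["updated_time"])
    return latest["parent_id"]
```

The returned id would then be used as the target notebook for the upload; falling back to a default notebook when the result is `None` is left to the caller.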

Hi kellerjustin,
thanks for your answer. Regarding #1, it's less a save-often thing: I have some notes in my local filesystem (created by other means, not with Joplin) and I want them reflected in Joplin as well, which is really nice because I then have them available on mobile too. These notes might change, not necessarily very often, but then of course I want the newest version available in Joplin. Maybe this could be configurable so it can be turned on or off.
#2 I see, I was not aware of this setting. At least notes would then go to my "inbox", which is much better than "root", but one step further would be to process the observed local directory recursively, so that subdirectories become sub-notebooks in the Joplin structure.
But maybe I'm asking for too much :slight_smile:
Nonetheless thanks for your work
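
For illustration, the recursive idea in #2 could start from a mapping like the one sketched below. This is not an existing rest_uploader feature; the function name is hypothetical, and creating the matching notebooks in Joplin is deliberately left out.

```python
from pathlib import Path


def notebook_paths(root):
    """Map each file under root to the notebook path its folder implies.

    Files directly in root map to an empty tuple (the top-level notebook);
    a file in root/work/2020 maps to ("work", "2020").
    """
    root = Path(root)
    mapping = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            mapping[path] = path.parent.relative_to(root).parts
    return mapping
```

Each tuple lists the nested notebook titles from the outermost to the innermost, so an uploader could create any missing notebooks level by level before adding the note.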

Hi, somehow this does not work for me on either Ubuntu 19 or 18. Running rest_uploader seems to properly monitor and OCR the file, but then it basically stops. Joplin is up and running, the clipper service is up and running, and the key is properly entered. Somehow the procedure seems to stop without creating any entry in Joplin. What am I missing here?

python3 -m rest_uploader.cli /home/pete/Downloads
    Launching Application rest_uploader.cli.main
    Language: eng
    Monitoring directory /home/pete/Downloads for files
    created -- /home/pete/Downloads/Unbenannt 2.pdf
    {'id': 'a58afb99dac0488e83c6d668712f96de', 'title': 'Unbenannt 2', 'mime': 'application/pdf', 'filename': 'Unbenannt 2.pdf', 'created_time': 1578006467355, 'updated_time': 1578006467355, 'user_created_time': 1578006467355, 'user_updated_time': 1578006467355, 'file_extension': 'pdf', 'encryption_cipher_text': '', 'encryption_applied': 0, 'encryption_blob_encrypted': 0, 'size': 8014, 'type_': 4}
    <Response [200]>
    {"title":"Unbenannt 2","body":"Unbenannt 2.pdf uploaded from pete-VirtualBox\n[Unbenannt 2.pdf](:/a58afb99dac0488e83c6d668712f96de)\n<!---\n\n\n***PAGE 1 of 1*** \n\nTest 1234345rtty\n-->\n\n\n![44aeef3f02d4591b2aab77b543325913.png](:/e4e7ebcae27b4103a3904c9e5f49b418)\n\n","parent_id":"0","markup_language":1,"updated_time":1578006469089,"created_time":1578006468867,"source":"joplin-desktop","source_application":"net.cozic.joplin-desktop","id":"3345fc6bffaf45479b2636f24800d1fa","user_updated_time":1578006469089,"user_created_time":1578006468867,"type_":1}
    {'title': 'Unbenannt 2', 'body': 'Unbenannt 2.pdf uploaded from pete-VirtualBox\n[Unbenannt 2.pdf](:/a58afb99dac0488e83c6d668712f96de)\n<!---\n\n\n***PAGE 1 of 1*** \n\nTest 1234345rtty\n-->\n\n\n![44aeef3f02d4591b2aab77b543325913.png](:/e4e7ebcae27b4103a3904c9e5f49b418)\n\n', 'parent_id': '0', 'markup_language': 1, 'updated_time': 1578006469089, 'created_time': 1578006468867, 'source': 'joplin-desktop', 'source_application': 'net.cozic.joplin-desktop', 'id': '3345fc6bffaf45479b2636f24800d1fa', 'user_updated_time': 1578006469089, 'user_created_time': 1578006468867, 'type_': 1}

I don’t see an error message in the output. The default behavior is for newly created notes to go into a folder called “inbox”, and in the absence of an inbox folder, the results can be unpredictable. Is it possible the note was created in a folder where you don’t expect it to be?
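
To check whether an “inbox” notebook exists, something like the following sketch could be run while Joplin is open. It assumes the Web Clipper service is on its default port 41184 and that `TOKEN` is replaced with your own API token; the helper names are hypothetical and not part of rest_uploader.

```python
import json
import urllib.request

# Assumptions: the Web Clipper service listens on its default port 41184 and
# TOKEN is the API token shown in Joplin's Web Clipper options.
BASE_URL = "http://localhost:41184"
TOKEN = "your-token-here"


def find_folder_id(folders, title="inbox"):
    """Return the id of the first folder with the given title, else None."""
    for folder in folders:
        if folder.get("title") == title:
            return folder["id"]
    return None


def fetch_folders():
    """Fetch notebooks from the Joplin Data API (requires Joplin running)."""
    url = f"{BASE_URL}/folders?token={TOKEN}"
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    # Newer Joplin versions wrap results in {"items": [...]} and paginate;
    # older ones return a plain list. Only the first page is read here.
    return data["items"] if isinstance(data, dict) else data
```

Calling `find_folder_id(fetch_folders())` returns `None` when no notebook titled “inbox” is found, which would match the `parent_id` of `0` in the output above.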

Great, thanks! That was actually the issue: I should have created the inbox folder. Without it, the notes end up nowhere…

Very cool, this was the missing piece for me to finally get rid of Evernote, so thanks for your great work!