So the HTML export from Nimbus is actually a zip export with HTML inside. Each zip has the images, CSS, and other files as separate files in an "assets" folder. I don't know how to make Joplin ignore those directories when importing. Even when they only contain images, Joplin shows them. Maybe someone from Joplin development could chime in to say how to hide those folders. I think you can delete the empty folders from the Joplin display and the images still seem to stay there.
For #1 there, I have no idea. There should be no PDFs involved unless you had an attachment or something. What exactly did you mean by "PDF error on all - file source?" Do you mean that when you run the script, each note prints out an error? I actually don't see an error. What you pasted there is just markdown content. Maybe you used to have a PDF attached to the note. If so, maybe it didn't carry over.
Sure, the script is there if someone has a bunch of more basic notes they want to transfer over. It seems like the worst problem is the tables being garbled. Since something like 70% of your notes apparently weren't converted well enough, I guess you use a lot of tables? If someone wants to mess with the command line options in the subprocess.run("pandoc" .... line, feel free. I couldn't find any options that converted both images and tables correctly. I probably won't make many major updates to the script if you're OK with just moving forward with Joplin. Sorry it didn't work out too well.
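For anyone who does want to experiment with that pandoc call, here is a minimal sketch of how the command could be assembled. The helper names `build_pandoc_command` and `convert` and all paths are hypothetical, and the flags shown (`--from`, `--to`, `--wrap`, `--extract-media`, `--output`) are standard pandoc options, not necessarily the ones the script uses. No combination here is known to fix both tables and images; it is only a starting point for trying different output formats.

```python
import subprocess


def build_pandoc_command(html_path, md_path, media_dir, target="gfm"):
    """Return the argument list for one HTML -> markdown conversion.

    Hypothetical helper: the actual script may use different options.
    """
    return [
        "pandoc",
        "--from=html",
        f"--to={target}",               # e.g. "gfm" or "markdown"; worth comparing for tables
        "--wrap=none",                  # do not re-wrap long lines
        f"--extract-media={media_dir}", # write embedded or linked media out as files
        f"--output={md_path}",
        str(html_path),
    ]


def convert(html_path, md_path, media_dir, target="gfm"):
    # A list without shell=True means paths with spaces need no extra quoting.
    cmd = build_pandoc_command(html_path, md_path, media_dir, target)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

Calling `convert("note.html", "note.md", "assets", target="markdown")` would then let you compare how each target format handles a note with tables.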
It is weird, I do not see anything in any of the assets folders.
The PDF links come up as "file source error" when I click on them, so they do not display at all. I am also confused by all the text that surrounds the links. I checked some of the original PDFs in Nimbus; they are not linked but embedded in the note, which is even more confusing.
You're correct, I did tend to use a lot of tables. Not sure why, as each item could have been its own note. Just another of my idiosyncrasies.
I have started the move to Joplin and very much plan to continue; I love the idea of local storage on Nextcloud. A few weeks back I also moved from web-based password tools to KeePassXC, which works great.
You managed to get me well on the way with 30%, which I still see as a win, so many thanks for that.
All the very best to you
Chris
Okay, thanks. Yeah, it seems like pandoc sometimes tries to embed images in base64 encoding, or that's how some of the original notes were. I haven't seen the PDF thing, so I don't know what is going on there. Do you mean the assets folders have nothing in them when viewed in Joplin, or when viewed in your file explorer? Nimbus might just put an empty assets folder in their zip, which then gets extracted, but I have code to delete directories with no files in them. If you just mean empty in Joplin (or even if it's empty in both places), you can right-click and delete those directories. It doesn't hurt anything.
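The cleanup of empty directories mentioned above could look roughly like this. It is a sketch only: the function name `remove_empty_dirs` is made up for illustration and the script's own implementation may differ.

```python
import os


def remove_empty_dirs(root):
    """Delete directories under root that hold no files anywhere below them.

    Returns the list of removed paths. The root itself is never removed.
    """
    root = os.fspath(root)
    removed = []
    # topdown=False visits children before their parents, so a parent that
    # only held empty folders is itself empty by the time it is visited.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue
        # Re-check the directory on disk, since children may just have been removed.
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
            removed.append(dirpath)
    return removed
```

Running it on the extracted export folder before importing into Joplin should keep empty assets folders from showing up as notebooks.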
Hi,
I've tried running your script on macOS Monterey. I installed Pandoc with the recommended extensions via
brew install pandoc
brew install librsvg python homebrew/cask/basictex
The script hangs while creating the md output, with the following log:
Writing markdown to //Documents/backup/converted/export-2022-01-15_9-55-001/All Notes/My Notes/note.md
^CTraceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/Users/sjorsjanssen/.vscode/extensions/ms-python.python-2021.12.1559732655/pythonFiles/lib/python/debugpy/main.py", line 45, in <module>
cli.main()
File "//.vscode/extensions/ms-python.python-2021.12.1559732655/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
run()
File "//.vscode/extensions/ms-python.python-2021.12.1559732655/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 269, in run_path
return _run_module_code(code, init_globals, run_name,
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "//Documents/backup/convert.py", line 161, in <module>
write_note(html_note, markdown_destination)
File "//Documents/backup/convert.py", line 99, in write_note
pandoc_run = subprocess.run(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/subprocess.py", line 503, in run
stdout, stderr = process.communicate(input, timeout=timeout)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/subprocess.py", line 1149, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/subprocess.py", line 2000, in _communicate
ready = selector.select(timeout)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/selectors.py", line 416, in select
fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt
Any ideas what I'm doing wrong?
Hi,
I edited your code on line 103, changing
shell=True
into
check=True
then ran it, and everything worked out fine.
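A likely explanation for why that change helps (my reading, not confirmed in this thread): on macOS and Linux, when `subprocess.run` is given a list together with `shell=True`, only the first item is run as the command and the remaining items are passed to the shell itself. Pandoc would then start without an input file and wait on standard input, which would match the hang inside `communicate` in the traceback. The sketch below uses the Python interpreter as a stand-in for pandoc so it runs anywhere; the file names are placeholders.

```python
import subprocess
import sys

# Stand-in for the pandoc call: a child process that prints how many
# command-line arguments it received. Hypothetical example, not the script's code.
cmd = [
    sys.executable,
    "-c",
    "import sys; print(len(sys.argv) - 1)",
    "in.html",
    "out.md",
]

# List form without shell=True: every item reaches the child as its own argument.
# check=True raises CalledProcessError if the child exits with a non-zero status.
result = subprocess.run(cmd, check=True, capture_output=True, text=True)
arg_count = int(result.stdout.strip())
print(arg_count)
```

With `check=True`, a failed conversion also stops the script with an error instead of passing silently.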
Glad it was useful for someone