Mybase to joplin - sqlite too large!

I've been looking into migration from mybase 7/8 to Joplin. I just like Joplin's web clipper better, and it's overall more modern look & feel.

Mybase's 8 export to directory tree has been good enough to get my data out, after some additional folder renaming and tweaking. My usage has been mostly of saved web pages, in snippets or as a whole.

But here's the unexpected caveat. So far have managed to import the contents of 3 mybase db's (other 11 to go). That's about 450 sub-notebook/web entries into Joplin so far. The size of these 3 nyf files is just ~0.5GB, but in Joplin, database.sqlite has bloated to ~10GB! That's unsustainable, especially if also synchronizing, and will make Joplin sluggish at some point, and feeble with such a large single point of failure.

Have already tried the VACUUM command, but database.sqlite shrinked with almost nothing.

Skimming now at the db structure:

  • The deleted_items table has 8560 entries. Why are those still there after vacuum-ing? How to get rid of them?
  • The notes_fts_segment has 153978 entries. What are those and how to shrink them?
  • The other tables look normal given the number of entries (up to ~800). Some tables still load rather slow (DB Browser)

Is there something I can do to shrink sqlite size in a substantial way, or simply put Joplin isn't suitable for my kind of usage?

Thanks.

It's only a problem if the app can't handle it and most likely it should be fine. You'll only have actual problems if you have thousands of tags because the app is currently slow to display them.

As for SQLite, it supports up to 280 TB of data so you still have some margin.

Laurent, technically, I understand SQLite may handle a lot, but as an ordinary Joplin user it is inherently impractical and wasteful to me to keep sync of 10GB of storage for a mere 450 of web pages and snippets. That's a 20-fold increase from the other app's storage needs, for just keeping the same actual content. I'm now looking into the costs of hundred(s) of GB of cloud storage for things that are actually taking less than 3 GB in my local disk. That isn't an overhead I'm comfortable with no further reasoning. Hope in the future the team my find better ways of storing web pages.

EDIT: the size of the actual content before importing it into Joplin is 1.1GB. Not sure where the other 9GB come from. Hence the question for hints to optimize.

notes_fts is full-text search index. It won't sync so its size should not be a problem.

deleted_items is used for sync, vacuuming won't do anything to them. It's a bit strange though to have anything there assuming you've started from a clean state and only imported notes.

1 Like

Thanks, Roman. I started clean, but made many delete/import attempts until getting the import right.

Runned sqlite3_analyzer database.sqlite and got some interesting results:

*** Table ITEM_CHANGES and all its indices *******************************************
Percentage of total database......................  33.8%    
Number of entries................................. 4396      
Bytes of storage consumed......................... 3387113472
*** Table NOTES and all its indices *******************************************
Percentage of total database......................  11.7%    
Number of entries................................. 884       
Bytes of storage consumed......................... 1168904192
*** Table NOTES_FTS_SEGMENTS **************************************************
Percentage of total database......................   9.0%    
Number of entries................................. 153978    
Bytes of storage consumed......................... 899518464
*** Table NOTES_NORMALIZED and all its indices ********************************
Percentage of total database......................  11.7%    
Number of entries................................. 5304      
Bytes of storage consumed......................... 1168953344
*** Table REVISIONS and all its indices ***************************************
Percentage of total database......................  33.8%    
Number of entries................................. 5726      
Bytes of storage consumed......................... 3387277312

At glance, the bulk of the storage seems related to things that shouldn't concern my kind of usage (i.e. saving web pages). The question is now, how to slim ITEM_CHANGES, REVISIONS (and perhaps also NOTES_FTS_SEGMENTS) way down?

You can export all your notes to JEX, then wipe your data, and import back.

Still I'm surprised you have any note revisions and item changes. Have you used Joplin before importing your notes?

1 Like

No, never, but made many import > delete all entries > import again attempts, while trying to get the import right. Also got Joplin unresponsive a couple of times and reinstalled twice. Let me try the JEX export and start clean to see what may improve. Thanks!

Yeah, that will be the culprit. I'm pretty sure Joplin installs with note history enabled by default so all of those imports will be remembered in the note revisions.

2 Likes

Thanks Daeraxa and roman_r_m, both suggestions were useful!

Exported my stuff to JEX and the archive clocked up at 1.1GB, which is about the exact size of the stuff exported from mybase. Removed Joplin and C:\Users\USER\.config\joplin-desktop\*.* that includes database.sqlite; reinstalled fresh and imported the JEX archive back. The import finished with database.sqlite at about 1.8GB, which is quite an improvement, but it continued to grow afterwards and it's now at 2.8GB. Hope it stays there, I need no more indexing.

Daeraxa: note history = enabled seems indeed like the culprit why my db bulked up so much. I've disabled it for now. Thanks for the tip!

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.