Scripts to remove orphaned resources

Please note that these scripts will become obsolete in the future, when the resource bugs are fixed!

A word of caution: I use the scripts myself and haven’t experienced any issues. However, use them at your own risk! It’s always a good idea to create a backup first. There’s also the --dry-run or -n option to see what happens without actually doing anything.

Use at your own risk!

Why are there 2 scripts to remove resources?

The first script, jnrmor, removes orphaned resources from the database and the metadata files for those resources from the sync target. Due to an error in the Joplin API, the actual resources are not deleted on the sync target.
That's where the script jnclnst comes to the rescue.

The default locations for the config files are the directory containing the script and your home directory. The option --help shows these locations.
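The "first file found is used" lookup could be sketched roughly as follows. This is not taken from the scripts: the function name find_config and the file name jnrmor.conf are assumptions made up for illustration, and the real scripts may search differently.

```shell
#!/usr/bin/env bash
# Sketch only: find_config and the file name in the usage example are
# hypothetical, not taken from jnrmor or jnclnst.

# Print the path of the first existing config file and return 0,
# or return 1 if none of the given directories contains it.
# Usage: find_config NAME DIR...
find_config() {
    local name=$1 dir
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Usage example: search the script's directory first, then the home directory.
# script_dir=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# config=$(find_config "jnrmor.conf" "$script_dir" "$HOME") || echo "no config file found" >&2
```

Passing -c CONFIGFILE would simply skip this search and use the given file.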

jnrmor - remove orphaned resources in Joplin

usage: jnrmor [-c CONFIGFILE] [-f] [-q|--quiet] [-n|--dry-run] [-d|--debug] [-V|--version] [-h] [--help]

       -c CONFIGFILE
           use CONFIGFILE, instead of searching the default locations
           The first file found is used.

       -f
           run without confirmation

       -q, --quiet
           do not print informational messages
           (errors will be shown)

       -n, --dry-run
           only show orphaned resources (do not actually delete them)
           implies -f

       -d, --debug
           print debug information

       -V, --version
           version information

       -h
           usage information

       --help
           this help

If you would rather use a Perl script, head over here. You can use the script listnotes.pl with the option --weed.

jnclnst - clean sync target (remove orphaned resources from sync target)

This script is no longer needed. A fix was added to Joplin a while back. It’s only here for reference and for people who still use an old Joplin version.
It doesn’t hurt to run the script; it just won’t find anything to remove anymore.

usage: jnclnst [-c CONFIGFILE] [-f] [-q|--quiet] [-n|--dry-run] [-d|--debug] [-V|--version] [-h] [--help]

       -c CONFIGFILE
           use CONFIGFILE, instead of searching the default locations
           The first file found is used.

       -f
           run without confirmation

       -q, --quiet
           do not print informational messages
           (errors will be shown)

       -n, --dry-run
           only show orphaned resources (do not actually delete them)
           implies -f

       -d, --debug
           print debug information

       -V, --version
           version information

       -h
           usage information

       --help
           this help

Repository


I needed to upgrade getopt and bash for macOS

This is a good resource for upgrading bash:

https://akrabat.com/upgrading-to-bash-4-on-macos/

Yeah, unfortunately macOS 10.14.x comes with bash 3.x, which does not support associative arrays. I’m actually using bash 5 on my machine. The getopt that ships with macOS is one of the worst implementations ever, and it is not compatible with any other getopt version out there.
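A script can check both requirements up front. The sketch below is not part of jnrmor or jnclnst; the function names are made up, and it assumes the scripts need bash 4 or later and an enhanced (util-linux/GNU) getopt, as the posts above suggest.

```shell
#!/usr/bin/env bash
# Sketch only: the function names are hypothetical, not part of the scripts.

# Associative arrays (declare -A) were added in bash 4.0.
has_assoc_arrays() {
    [ "${BASH_VERSINFO[0]:-0}" -ge 4 ]
}

# The enhanced (util-linux/GNU) getopt exits with status 4 for --test;
# the BSD getopt shipped with macOS does not.
has_enhanced_getopt() {
    local status=0
    getopt --test >/dev/null 2>&1 || status=$?
    [ "$status" -eq 4 ]
}

has_assoc_arrays    || echo "bash >= 4 is required (found ${BASH_VERSION:-unknown})" >&2
has_enhanced_getopt || echo "an enhanced (GNU) getopt is required" >&2
```

On macOS both checks are expected to fail with the stock tools and to pass once a newer bash and GNU getopt come first in PATH.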

Maybe I should add a remark to the README.

MacPorts

sudo port install bash
sudo port install getopt

Brew

brew install bash
brew install gnu-getopt

listnotes has a --weed option to remove unused resources.

Ah, I didn't know someone had already written a script that did that, otherwise I would not have written the first one. However, please note that due to a bug in the API, these resources are not removed from the sync target. That's why I wrote the second script jnclnst.

Interesting…

I needed to change “info-n” to “info_n” to run the script.

Any hints on what should go into the config file?

Sorry about that. Technically a dash is not valid, but it works on most systems. I forgot about that. I'll update the scripts.
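The thread does not say what kind of identifier info-n is. Assuming it was a function name, the difference looks like the sketch below: POSIX only allows letters, digits and underscores in names, while bash outside POSIX mode also accepts a dash. The function body here is a placeholder, not the real code.

```shell
#!/usr/bin/env bash
# Sketch only: info_n stands in for the renamed identifier. What the real
# function does is not shown in the thread; this body is a placeholder.

# Portable: the name uses only letters, digits and underscores.
info_n() {
    printf '%s' "$*"
}

# Not portable: bash accepts this definition outside POSIX mode,
# but stricter shells (for example dash) reject it as a bad function name.
# info-n() { printf '%s' "$*"; }
```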

I'm not sure what you mean. There are .template files that explain the necessary settings.

That's the hint I was looking for :).

I considered .xxx.conf to be something related to your private (build?) environment so I didn't download them...

Using your script I found out that I not only have several stale resources, but also a lot of stale notes (some with resources) and other .md files. In the cloud I have 286 .md files, while a JEX export contains only 75 .md files. Leaving aside the note versions (112 revision .md files that are not included in the JEX export), this leaves 99 stale files.

Interestingly, the sync_items table in the Joplin database contains 276 items.

Do I have a gross database mixup?

Hmm, does this mean you have md files on your sync target which have ids other than select id from notes? If so, this is weird, but I see a new script coming. :wink:
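One way to check this is to compare the file names on the sync target with the ids known to the database. The sketch below is only an illustration: stale_ids is a made-up helper, and it assumes that items on the sync target are stored as <id>.md and that the known ids have been dumped to a text file, one per line.

```shell
#!/usr/bin/env bash
# Sketch only: stale_ids is a hypothetical helper. It assumes items on the
# sync target are stored as <id>.md and that KNOWN_IDS holds one id per line,
# e.g. the output of: sqlite3 database.sqlite 'select id from notes'

# Print the ids that have a .md file in DIR but are missing from KNOWN_IDS.
# Usage: stale_ids DIR KNOWN_IDS
stale_ids() {
    local dir=$1 known=$2 f
    for f in "$dir"/*.md; do
        [ -e "$f" ] || continue      # skip the unexpanded glob if DIR has no .md files
        basename "$f" .md
    done | sort | comm -23 - <(sort "$known")
}
```

Since the sync target also holds folders, tags, revisions and so on, the id list would need to cover every item type, not just notes, before the output can be read as "stale".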

I can't really answer that, but I don't think so. For example, the sync_items table should have the same number of records as the Total: line under Sync status (synced items / total items) on the Synchronization status screen.

Let's see… In the following table:
type is the item type (1 = note, 2 = folder, etc.)
cloud means: $ grep 'type_:type$' *.md | wc -l
db means: select count(*) from table
jex means: the same grep on a JEX export.

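| type | cloud | db | jex | note |
|---|---|---|---|---|
| 1 (notes) | 50 | 50 | 48 | |
| 2 (folders) | 14 | 14 | 14 | |
| 4 (resources) | 19 | 11 | 11 | See 1 below |
| 5 (tags) | 9 | 9 | 1 | See 2 below |
| 6 (notes_tags) | 82 | 82 | 1 | See 3 below |
| 13 (revisions) | 112 | 110 | - | |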

Notes:

  1. This is the known issue of the stale resources
  2. Looks okay, but, consistent with the JEX export, the notebook only uses 1 (one) tag.
    The other tags in the database (and the cloud) are left-overs from the Welcome pages, e.g. attachment, search, importing.
  3. Again, the cloud matches the database, but 81 entries are stale.
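The cloud column above can be reproduced with a small loop over the item types. This is a sketch under assumptions: count_type is a made-up helper, and it assumes every item on the sync target is a .md file that contains a metadata line of the form "type_: N".

```shell
#!/usr/bin/env bash
# Sketch only: count_type is a hypothetical helper. It assumes each item
# on the sync target is a .md file containing a metadata line "type_: N".

# Print the number of .md files in DIR whose type_ line equals TYPE.
# Usage: count_type DIR TYPE
count_type() {
    local dir=$1 type=$2
    # -l lists each matching file once; -x only matches whole lines,
    # so type 1 does not also count type 13.
    grep -lx "type_: $type" "$dir"/*.md 2>/dev/null | wc -l | tr -d ' '
}

# Usage example: one line per item type used in the table.
# for t in 1 2 4 5 6 13; do
#     printf '%s\t%s\n' "$t" "$(count_type /path/to/sync/target "$t")"
# done
```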

So it seems there is a little bit more involved for a decent cleanup…

I’m not sure if I understood this correctly. So you are saying that you are using one tag, which is only set for one note?

In any case, there’s no way to delete tags in Joplin; you can only remove them from all notes, in which case they no longer show up in the UI. Afaik, ‘deleting’ tags was never implemented.

The database structure does not use foreign keys, so there’s no referential integrity (RI), which makes maintenance and coding a bit more complicated. It’s also more error-prone. But this might be a limitation of SQLite.

I’m currently not using any tags, because I can’t retrieve an intersection of tags in search. Therefore I have never looked into the inner workings of tags and their cleanup.

It was once, but since 2009 (yes, ten years ago!) sqlite allows foreign keys, and enforces referential integrity with PRAGMA foreign_keys=ON.
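A small demonstration with the sqlite3 command-line shell, as a sketch only. The tables demo_notes and demo_note_tags are invented for this example and are not Joplin's schema.

```shell
#!/usr/bin/env bash
# Sketch only: the tables demo_notes and demo_note_tags are hypothetical and
# are not Joplin's schema. Requires the sqlite3 command-line shell.

# Print how many child rows are left after the parent row is deleted.
fk_demo() {
    sqlite3 :memory: <<'SQL'
PRAGMA foreign_keys = ON;  -- usually off by default; set per connection
CREATE TABLE demo_notes (id TEXT PRIMARY KEY);
CREATE TABLE demo_note_tags (
    note_id TEXT REFERENCES demo_notes(id) ON DELETE CASCADE,
    tag     TEXT
);
INSERT INTO demo_notes VALUES ('n1');
INSERT INTO demo_note_tags VALUES ('n1', 'search');
DELETE FROM demo_notes WHERE id = 'n1';
SELECT count(*) FROM demo_note_tags;
SQL
}
```

With the pragma enabled, deleting the note also removes its tag association, so fk_demo prints 0. Without the pragma the association row would be left behind as a stale entry.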

Yes, I think SQLite supports foreign keys and cascade delete, but it’s not used in Joplin, mostly because of sync.