Asking for tips on merging two similar notebooks from different sources (that have diverged a bit)

Hi,

I am looking for the best way to merge two instances of the same notebook. They were identical until I had a synchronization problem (an old one) and the two instances diverged. For some reason, one instance included the title in the note body, so notes that are identical in meaning are no longer exactly identical. I was looking for a smart program that could merge those notes automatically, but I didn't find one and haven't been able to write one yet. If anyone has an idea, please share it with me. I have about 5000 notes to merge.
The latest strategy I thought of is to export both instances of the notebook as Markdown and compare the two MD directories with meld. That lets me compare notes quite fast (compared to what I could do directly in Joplin). I can see whether two notes with the same name are identical in meaning and choose which one to keep. To merge or delete duplicated notes in Joplin, I first search for a phrase from the note in Joplin in order to identify the notes. This usually gives me two notes, one from each instance of the notebook, and I can make the correct choice manually.
I initially thought I could handle 5000 notes in about 2-3 weeks, but it seems I would need maybe 5 months, so now I'm looking for a better strategy!
Maybe I could do some steps faster by using the CLI or that kind of thing, which I have never used. Another possibility is that I could simply live with those duplicates without too much trouble.
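
The export-and-compare step described above could be partly automated before opening meld. The following is only a minimal sketch, assuming both instances were exported as Markdown into two separate directories with matching file names; the function name `compare_exports` and the directory names are hypothetical. It sorts the files into identical, different, and present-on-one-side-only, so that only the "different" group needs a manual look.

```python
import filecmp
from pathlib import Path


def compare_exports(dir_a, dir_b):
    """Classify the Markdown files of two export directories by relative path."""
    root_a, root_b = Path(dir_a), Path(dir_b)
    files_a = {p.relative_to(root_a) for p in root_a.rglob("*.md")}
    files_b = {p.relative_to(root_b) for p in root_b.rglob("*.md")}

    report = {
        "identical": [],
        "different": [],
        "only_a": sorted(str(rel) for rel in files_a - files_b),
        "only_b": sorted(str(rel) for rel in files_b - files_a),
    }
    for rel in sorted(files_a & files_b):
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting file size and modification time.
        same = filecmp.cmp(root_a / rel, root_b / rel, shallow=False)
        report["identical" if same else "different"].append(str(rel))
    return report


if __name__ == "__main__":
    # Hypothetical directory names; replace them with your own export paths.
    if Path("export_a").is_dir() and Path("export_b").is_dir():
        result = compare_exports("export_a", "export_b")
        for group, names in result.items():
            print(f"{group}: {len(names)} file(s)")
```

Files reported as identical can be ignored, which should shrink the set that has to be reviewed by hand.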

I'm open to any suggestions that may help me solve this problem.

I use joplin-desktop-2.3.5 on archlinux.

I don't know if that would be any easier, but you could try this:

  • First create a Git repository (with git init)
  • Export all your notes to it in Markdown format
  • Commit the changes (git add -A && git commit -m 'init')
  • Then export the second set of notes to it

Then using something like Sublime Merge you would directly see everything that has changed, and you can choose to stage the changes you want to keep or discard the ones you don't.

Otherwise, couldn't you just keep the most recent version of each note? Isn't it the best version in most cases? Then next to it you can maybe keep a backup of both sets so if one day you realise that something's missing you have it there.

I will try the git merging approach (I have to learn some git first).

As for the other suggestion based on modification time, I unfortunately did some manipulations, like exporting as Markdown and importing again, which make me feel it is a somewhat risky approach. I think the modification date is no longer related to when the notes in those notebooks were actually edited.

In fact, many notes have exactly the same content except for the first two lines. The first line includes a date in a fixed format. The second line is empty. Maybe I could identify the notes/files with this characteristic, remove those lines, find the notes that are identical after this modification, and merge them. The problem is that I will lose the attachments of notes that have attachments, right? When exporting as Markdown, do we lose attachments?
To analyze the other notes I'm thinking of using Python because I have some skills in it. I think I am too weak in git, JavaScript and TypeScript. I would need quite a lot of time to learn enough to solve this problem.
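
The idea of removing the first two lines and then looking for identical notes could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the thread does not give the date format, so the `DATE_LINE` pattern (an ISO-style `YYYY-MM-DD` date) is a placeholder to adapt, and the function names are hypothetical. Notes are grouped by a hash of their normalized text, so any group with more than one file is a candidate duplicate.

```python
import hashlib
import re
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# ASSUMPTION: the real date format is not stated in the thread.
# Replace this pattern with the one that matches your notes.
DATE_LINE = re.compile(r"\d{4}-\d{2}-\d{2}")


def normalize(text: str) -> str:
    """Drop a leading date line followed by an empty line, if both are present."""
    lines = text.splitlines()
    if len(lines) >= 2 and DATE_LINE.search(lines[0]) and lines[1].strip() == "":
        lines = lines[2:]
    # Ignore trailing whitespace so that trivial differences do not matter.
    return "\n".join(line.rstrip() for line in lines).strip()


def group_by_content(*export_dirs: str) -> Dict[str, List[Path]]:
    """Map a content hash to every Markdown file that normalizes to that content."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for export_dir in export_dirs:
        for path in sorted(Path(export_dir).rglob("*.md")):
            text = path.read_text(encoding="utf-8", errors="replace")
            digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
            groups[digest].append(path)
    return dict(groups)


def duplicate_groups(groups: Dict[str, List[Path]]) -> List[List[Path]]:
    """Keep only the groups that contain more than one file."""
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `duplicate_groups(group_by_content("export_a", "export_b"))` with your own two export directories would list the files whose content matches once the date header is ignored. The script only reports candidates and does not delete anything.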

Am I the only one having this kind of problem?

The only languages I could use for "scripting" Joplin are JavaScript and TypeScript, right?

I found that I could separate out the notes with attachments by searching for something like notebook:Google_Keep_desktop resource:* and treat these notes manually so as not to lose the attachments. I found 339 notes with attachments in one notebook and 340 in the other. It's getting "fun"...
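
On the exported Markdown side, the same split between notes with and without attachments might be approximated with a small Python script. This is a sketch resting on an assumption worth checking against a few of your own exported files: that attachments appear as Markdown links whose target either contains `_resources/` or is a Joplin-style `:/` followed by a 32-character hexadecimal id. The names `ATTACHMENT_LINK`, `has_attachment` and `split_by_attachment` are hypothetical.

```python
import re
from pathlib import Path

# ASSUMPTION: attachment links look like ](../_resources/file.png)
# or ](:/0123456789abcdef0123456789abcdef). Verify this on your own export.
ATTACHMENT_LINK = re.compile(
    r"\]\((?:[^)\s]*_resources/[^)\s]+|:/[0-9a-f]{32})\)"
)


def has_attachment(text: str) -> bool:
    """Return True if the note text contains a link that looks like an attachment."""
    return bool(ATTACHMENT_LINK.search(text))


def split_by_attachment(export_dir):
    """Return (files with attachment links, files without) for one export directory."""
    with_attachment, without_attachment = [], []
    for path in sorted(Path(export_dir).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if has_attachment(text):
            with_attachment.append(path)
        else:
            without_attachment.append(path)
    return with_attachment, without_attachment
```

The files in the first list are the ones to handle manually; the second list is the set that is safer to deduplicate automatically.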