Synchronization overwrote my changes: no conflicts, no history

Operating system

Android

Joplin version

3.5.8

Sync target

Dropbox

Editor

Markdown Editor

What issue do you have?

I edited a note. It was an old version (not yet synchronized).

I went back to the note list (or another note) and then returned to the edited note.

My changes were gone :frowning:
The note had been updated to the latest version, synchronized from Dropbox.

I don't see my edits in the note history (in the desktop app).

I don't see a "Conflicts" notebook.

Is this how it's supposed to work? Known problem? Bug? Does it happen often?

I actually switched from OneNote because of synchronization problems. OneNote had frequent conflicts and synchronization was slow, but it didn't lose data :frowning:

I think it was likely caused by the issue detailed here: "All: Fixes #13611: Fix missing conflict scenario" by mrjo118 (Pull Request #13624, laurent22/joplin on GitHub).

The good news is that this will be fixed when Joplin 3.6 is released. There are pre-releases available if you want the fix now and don't mind using beta versions. Otherwise, a release is expected within the next month.

I hope so, but I'm not sure. The scenario described there is much more complex than what I did. It also says "unless the user notices the missing changes and checks for them in the note history", but my history has no record of my change.

By the way, is there any documentation of the conflict detection algorithm?

I ask because I don't understand why the pull request checks the update time.

No, there isn't any documentation for the conflict mechanism.

Although the detailed scenario sounds complicated, it is likely to happen at some point if you regularly switch between devices to edit the same note, unless you always check that everything is synced before you quit Joplin (or put it in the background or turn off your screen on mobile). It doesn't require Joplin to be open on both devices at the same time.

From the PR:

This PR resolves the issue, by setting the sync_time to the remote item updated_time when the sync downloads a remote change (in the delta step and in conflict resolution), similar to when the sync uploads changes, which sets sync_time to the local item updated_time. However, as the remote item updated_time will originate from the local time of other devices in most cases, the setting of the sync_time also caps the value at the current device time. This is to prevent potential sync issues, in cases where there is time drift between devices.
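
As a rough illustration of the capping described in that quote (this is not Joplin's actual code; the function and parameter names are made up for the sketch, and timestamps are assumed to be milliseconds since the epoch):

```python
def sync_time_after_download(remote_updated_time_ms: int, device_now_ms: int) -> int:
    """Hypothetical helper: the sync_time to store after downloading a remote change.

    Follows the PR description: use the remote item's updated_time, but never
    store a value later than the current time on this device.
    """
    return min(remote_updated_time_ms, device_now_ms)


# Remote timestamp is in the past relative to this device: stored unchanged.
print(sync_time_after_download(1_000, 2_000))
# Remote timestamp is "in the future" (the other device's clock runs ahead): capped.
print(sync_time_after_download(3_000, 2_000))
```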

I wonder how the timestamps are used. Are they the only way of verifying whether there's a conflict? I'd expect hashes of the previously synced document, the current version, and the remote version to be compared. Or is it an optimization to avoid downloading the whole file to get the hash?

Actually, Dropbox seems to offer a content hash without downloading the whole file: Content Hash - Developers - Dropbox

I did submit another PR to avoid using timestamps for determining conflicts, but that got rejected at the time, because it's a significant change and it conflicts with some other work which was planned, though I don't know when that is going to happen.
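
For reference, here is a minimal sketch of how that content hash is computed, as I understand the Dropbox description: split the file into 4 MB blocks, take the SHA-256 of each block, concatenate the raw digests, and take the SHA-256 of that. The function name is my own and this is not taken from Joplin or from a Dropbox SDK.

```python
import hashlib

# Block size used by the Dropbox content hash (4 MB).
BLOCK_SIZE = 4 * 1024 * 1024


def dropbox_content_hash(data: bytes) -> str:
    """Compute a Dropbox-style content hash for in-memory file contents."""
    # SHA-256 of each 4 MB block, concatenated as raw (binary) digests.
    block_digests = b"".join(
        hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(data), BLOCK_SIZE)
    )
    # SHA-256 of the concatenation, as a hex string.
    return hashlib.sha256(block_digests).hexdigest()


local_hash = dropbox_content_hash(b"my local note body")
remote_hash = dropbox_content_hash(b"the remote note body")
print(local_hash != remote_hash)  # different content gives different hashes
```

A client could then compare its locally computed hash with the hash reported in the file metadata to detect a remote change without downloading the file.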

To be honest though, I think local time drift is going to have a minimal impact on conflict detection not working correctly. If you're going to edit the same note on a different device immediately after changing it, then it's kind of logical that you should check that it has synced first. Otherwise you're just asking to get a conflict (and that's assuming conflict detection works correctly).

I think local time drift is going to have a minimal impact on conflict detection not working correctly

I think I agree. At least in my case the file on remote was edited a long time ago, as far as I remember, so it was not a problem with time drift.

But I think it depends on how you use the timestamp. I think it should work if you use it as "hash". But I think currently it's used to see which edit was done later? If that's true, I don't understand how it works or why it's done that way.

The PR comment says the upload step of sync relies on the comparison if (remoteContent.updated_time > local.sync_time).

The way I imagine it could be working is:

  1. Keep the following information:
    • remoteFile.updated_time - timestamp of the file on the server
    • localFile.updated_time - timestamp of the local file
    • lastSync_updated_time - timestamp of the file when it was last synced
  2. Define:
    • localFile.updated_time != lastSync_updated_time - the file was edited locally after the last sync (local edit)
    • remoteFile.updated_time != lastSync_updated_time - the file was edited remotely after the last sync (remote edit)
  3. Conflict detection:
    • if (localFile.updated_time == lastSync_updated_time && remoteFile.updated_time != lastSync_updated_time) - no conflict: no local edit, remote has changed - replace local with remote
    • if (remoteFile.updated_time == lastSync_updated_time && localFile.updated_time != lastSync_updated_time) - no conflict: local edit, remote has not changed - replace remote with local
    • if (remoteFile.updated_time != lastSync_updated_time && localFile.updated_time != lastSync_updated_time) - conflict: both local and remote edit

So there is a conflict only when both the remote and the local file have changed (the history has diverged). There's no need to check which edit was made later, IMO.
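
To make the proposal concrete, the rules above can be written as a small function. This is only a sketch of the scheme I'm describing, not how Joplin does it. "Remote has changed" means remoteFile.updated_time != lastSync_updated_time. The names are made up, and the "nothing to do" case is added for completeness. Timestamps are only compared for equality, never ordered.

```python
from enum import Enum


class SyncAction(Enum):
    NOTHING = "nothing to do"
    PULL = "replace local with remote"
    PUSH = "replace remote with local"
    CONFLICT = "conflict"


def decide(local_updated_time: int, remote_updated_time: int,
           last_sync_updated_time: int) -> SyncAction:
    """Classify a note using equality checks against the last synced timestamp."""
    local_edit = local_updated_time != last_sync_updated_time
    remote_edit = remote_updated_time != last_sync_updated_time

    if local_edit and remote_edit:
        return SyncAction.CONFLICT  # history has diverged
    if remote_edit:
        return SyncAction.PULL      # only the remote changed
    if local_edit:
        return SyncAction.PUSH      # only the local copy changed
    return SyncAction.NOTHING       # neither side changed


# My case: old local version edited, remote already changed by another device.
print(decide(local_updated_time=300, remote_updated_time=200,
             last_sync_updated_time=100).value)
```

Because the check never asks which timestamp is larger, clock drift between devices would not matter as long as each edit produces a timestamp different from the last synced one.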

I think it should work if you use it as "hash". But I think currently it's used to see which edit was done later?

I think the answer, in a sense, is that it uses the timestamp both as a hash and to see which edit was done later. If you take time drift out of the equation and assume every device has exactly the same local time, it effectively works like a hash comparison. The problem is that, prior to my change in Joplin 3.6, there was a logic error in that implementation.

The way I imagine it could be working

FYI, you do need to store some kind of actual hash to address the time drift issue, because the timestamps for files on the server won't be the same if you use file system sync with an external sync service, which means there is essentially no single source of truth for Joplin to read from.

Are the timestamps the actual timestamps provided by the filesystem, Dropbox, etc. for the actual file? Or are they metadata created by Joplin when the file is edited and associated with the file?

The timestamps are created from the device time. But as mentioned, you can't rely on the server time for file system sync anyway, and it's not practical to use different implementations for different targets.

Regardless, with the latest fixes the conflict detection should be reliable, except when there is time drift and you change the same note on different devices within the window of that drift. But I don't think that's likely to be a problem for most users in practice, because automatic time syncing in operating systems keeps the drift minimal.
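
To illustrate what "within the window of the time drift" means, here is a toy model. It uses only the comparison quoted from the PR (remoteContent.updated_time > local.sync_time); everything else (function names, the numbers, the simplification to a single check) is an assumption for the example and not Joplin's real sync logic.

```python
def stamped_time(real_time_ms: int, clock_offset_ms: int) -> int:
    """Timestamp a device writes, given the true time and its clock error."""
    return real_time_ms + clock_offset_ms


def remote_change_detected(remote_updated_time: int, local_sync_time: int) -> bool:
    """The comparison quoted from the PR comment."""
    return remote_updated_time > local_sync_time


# Device A (accurate clock) last synced the note at true time 1_000_000 ms.
sync_time_on_a = stamped_time(1_000_000, clock_offset_ms=0)

# Device B's clock runs 2 minutes behind.
drift_b = -120_000

# B edits 30 seconds later: inside the drift window, the edit looks older than A's sync.
edit_inside_window = stamped_time(1_030_000, drift_b)
# B edits 5 minutes later: outside the drift window, the edit is seen as newer.
edit_outside_window = stamped_time(1_300_000, drift_b)

print(remote_change_detected(edit_inside_window, sync_time_on_a))   # missed
print(remote_change_detected(edit_outside_window, sync_time_on_a))  # detected
```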