Request: Implementing MD5 Checksum Into The Joplin Build Process

After a user hit an error because their file was not fully downloaded by the Joplin_install_and_update.sh script, I took a look at it and noticed there was no way to check whether the download had completed correctly. MD5 checksums are a quick and easy way to verify a download without adding much to a project, and they are pretty standard across the board.

We release blockmap files which I believe contain checksums. Releasing an md5 seems easy but I honestly don’t know how to integrate that in CI.

I’m not familiar with blockmaps. If they do contain checksums, and there’s a way to verify those in the script above and possibly make them available on the Releases page on GitHub so users can verify their downloads, that would definitely work for this.

I just looked into the blockmaps a bit. Looks like they contain checksums for each individual file contained within the archive, so I don’t think this is exactly what we want. It makes more sense to calculate the md5sum of the AppImage file directly (and dmg/exe).

blockmap reference

Maybe this would be a good first issue?

I would love to work on it :smiley:

TBH I am not a fan of md5 checksums. sha-2 is ok, but gpg is best. But as mentioned in another thread, I’m not sure how to gpg sign packages in the CI pipeline.

I don’t mind adding this if it’s not too complicated. As for checksum, I agree that MD5 is not good for this as it’s not secure, so we can use whatever checksum is considered secure these days, maybe sha256?

Yes, this works. md5 and sha1 - no.

It seems to me that there are two different concepts being discussed here: checksums and signatures (or integrity and authenticity).

@bedwardly-down's initial comment (above) was about providing some kind of checksum file as an integrity check, so that a user can check the install package against a checksum just to ensure that the file downloaded correctly. All you have is an additional file that can be downloaded, with contents something like 2e78e3596d82df90f5b554172495955f6d807ae2 *Joplin-Setup-1.0.201.exe, which can be compared against what was downloaded. Nothing about this has any kind of authenticity element. It is an integrity check only (i.e. check the file is complete), a troubleshooting aid.
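
As a rough illustration of that integrity check, here is a minimal Python sketch that reads a line in the `<digest> *<filename>` form shown above and compares it with the file on disk. The function names, and the rule of picking the hash algorithm from the digest length (40 hex characters is the SHA-1 length, as in the example line), are assumptions made for this sketch and are not taken from any Joplin script.

```python
import hashlib
from pathlib import Path
from typing import Tuple

# Assumption for this sketch: the algorithm is inferred from the digest length.
ALGORITHM_BY_HEX_LENGTH = {32: "md5", 40: "sha1", 64: "sha256"}


def parse_checksum_line(line: str) -> Tuple[str, str]:
    """Split a '<hex digest> *<filename>' line into (digest, filename)."""
    digest, _, name = line.strip().partition(" ")
    name = name.strip()
    # A leading '*' marks binary mode in the md5sum/sha1sum file format.
    if name.startswith("*"):
        name = name[1:]
    return digest.lower(), name


def file_digest(path: Path, algorithm: str) -> str:
    """Hash the file in chunks so a large installer is never fully in memory."""
    hasher = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify(line: str, directory: Path) -> bool:
    """Return True if the file named in the line matches the recorded digest."""
    expected, name = parse_checksum_line(line)
    algorithm = ALGORITHM_BY_HEX_LENGTH[len(expected)]
    return file_digest(directory / name, algorithm) == expected
```

A mismatch here only says the bytes differ from what the checksum file describes; it says nothing about who produced either file.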

However, the other thread that @tessus refers to concerns something different, which is GPG signing install packages to ensure authenticity.

MD5 and SHA1 are certainly no longer good enough for cryptographic purposes, but they are more than adequate as a simple checksum for just helping to confirm a download came through intact. Even today MD5 and SHA1 checksums are used for integrity checks when creating forensic disk images.

If someone is going to come up with a malware-laden version of Joplin, they are not going to try to also engineer a modified version that somehow matches the MD5/SHA1 of the original. If they can compromise an install package and get it into the Joplin GitHub releases, it would be easier just to replace the MD5/SHA1 checksum file as well. It doesn’t matter whether you use MD5, SHA1, SHA256, SHA512 or whatever, because none of them provide any authenticity.

This is where signatures using something like GPG come in. These provide the same function (to confirm that a file is intact) but they also provide authenticity, as the private key is required to actually generate the signatures. To be done properly, the full-on air-gapped, multi-part GPG signing route requires some very serious effort.

What option is chosen depends on what you want / need to achieve. If the software is a prime target for compromise then GPG signing (along with proper implementation protocols) is a good option. However if you just want a user to be able to see if their telco’s unshielded three strands of cracked copper wire into their house has flipped a few bits in the download, then a simple MD5/SHA1 checksum will suffice.
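
To make the flipped-bit case concrete, here is a small Python sketch that corrupts a single bit in a byte string and compares digests before and after. The byte string is only a stand-in for a real download, and the helper name is made up for this example.

```python
import hashlib

# Stand-in for a downloaded file (assumption: any byte string will do here).
original = bytes(range(256)) * 16

# Simulate line noise by flipping one bit in one byte.
damaged = bytearray(original)
damaged[100] ^= 0b00000001
corrupted = bytes(damaged)


def digests(data: bytes) -> dict:
    """Return the hex digest of the data under each algorithm."""
    return {
        name: hashlib.new(name, data).hexdigest()
        for name in ("md5", "sha1", "sha256")
    }


before = digests(original)
after = digests(corrupted)
for name in before:
    print(name, "digest changed:", before[name] != after[name])
```

All three algorithms report a changed digest for this kind of accidental damage, which is why the choice between them matters little for a pure integrity check.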

I agree with everything you said.

However, collisions are possible with md5 and sha1 (I’m not talking about trying to match a compromised package’s hash to the original). And yes, I know that the chances are almost zero.

Personally I don’t care about integrity check files, since I always sign my code with gpg, which guarantees not only authenticity but also integrity (depending on the gpg options).

By the way, both the macOS and Windows versions are signed, so that would prevent the executables from being tampered with. In that case I don’t think a checksum is necessary for these.

It would be for the AppImage though, but then I wonder if it’s possible instead to sign an AppImage package?

For the AppImage, an MD5 Checksum should be more than enough. It’s common practice for Linux software and the only real reason to use a signature key (like gpg keyring) would be if Joplin gets officially distributed through the various different package managers, and I don’t really see that happening.

Also, @dpoulton, you’re quite right that these are two different topics. I don’t use checksums or keyrings all that often in my own personal use, so I was under the impression they were similar. I learned something new.

@laurent, can you please publish the SHA-256 checksum of the AppImage?
Then users can directly verify the integrity of their download.
And I will open a PR to update the install script.

Publishing it, why not, but it would need to be entirely automated. If you can figure out how to make that part of CI then I can merge.
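
One way the automated part might look, as a sketch only: a small Python step run after the build that writes a checksum file next to the AppImage. The `.sha256` suffix, the function name, and the idea of calling it as a post-build step are assumptions; the thread does not say how the eventual PR does it.

```python
import hashlib
from pathlib import Path


def write_sha256_file(artifact: Path) -> Path:
    """Write '<digest> *<filename>' next to the artifact and return that path."""
    hasher = hashlib.sha256()
    with artifact.open("rb") as handle:
        # Read in 1 MiB chunks so the whole package is never held in memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            hasher.update(chunk)
    # Hypothetical naming scheme: append '.sha256' to the artifact's file name.
    checksum_path = artifact.with_name(artifact.name + ".sha256")
    checksum_path.write_text(f"{hasher.hexdigest()} *{artifact.name}\n")
    return checksum_path
```

The line it writes uses the same `<digest> *<filename>` layout that `sha256sum -c` reads, so users could check a download with the standard tool.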

Okay. Will give it a shot.

Am I right about this: Travis CI is used for building and running tests but not for directly making the releases (because I did not find a deploy field in the .travis.yml file)?
If so, why is the deploy not also done by Travis?

Yes, it’s done by Travis for Linux/macOS and Appveyor for Windows, in both cases by Electron Builder: https://github.com/laurent22/joplin/blob/5143870d3b3132d15e6d8ecc83b284f1d36a4ce2/.travis.yml#L115

PR made for this: https://github.com/laurent22/joplin/pull/3410