Parameters for the new encryption

Current Implementation

According to the technical specifications and the code, the following steps are performed when encryption is enabled:

  1. The user enters a password, which is stored in the local SQLite database.
  2. A random 256-byte master key (or shorter) is generated.
  3. The master key is encrypted with AES-256-CCM. The key that encrypts the master key is derived from the password using PBKDF2 (source1, source2). You can find the encrypted master key in info.json on your sync target.
  4. Notes and resources are encrypted with AES-128-CCM. The data is not encrypted directly with the master key. Instead, the master key and a random salt are passed through PBKDF2 to generate the actual encryption key, in the same way that the key for encrypting the master key is derived from the password.
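
The two key derivations in steps 3 and 4 can be sketched roughly as follows. This is only an illustration using node:crypto rather than sjcl. The function names are hypothetical, the data-key iteration count is a placeholder (it is not stated in this post), and feeding the raw master key bytes into PBKDF2 is an assumption about the encoding.

```typescript
import { pbkdf2Sync, randomBytes } from 'node:crypto';

// Values mentioned in this post for the current scheme.
const MASTER_KEY_ITERATIONS = 10000; // iteration count for the master key
const SALT_BYTES = 8;                // 64-bit salt
// Placeholder only: the data-key iteration count is not given in the post.
const DATA_KEY_ITERATIONS = 1000;

// Step 3: derive the 256-bit key that encrypts the master key (AES-256-CCM).
function deriveMasterKeyWrappingKey(password: string, salt: Buffer): Buffer {
  return pbkdf2Sync(password, salt, MASTER_KEY_ITERATIONS, 32, 'sha256');
}

// Step 4: derive a 128-bit data key (AES-128-CCM) from the master key
// and a random salt, instead of using the master key directly.
function deriveDataKey(masterKey: Buffer, salt: Buffer): Buffer {
  return pbkdf2Sync(masterKey, salt, DATA_KEY_ITERATIONS, 16, 'sha256');
}

const masterKey = randomBytes(256); // step 2: random 256-byte master key
const wrappingKey = deriveMasterKeyWrappingKey('my password', randomBytes(SALT_BYTES));
const dataSalt = randomBytes(SALT_BYTES);
const dataKey = deriveDataKey(masterKey, dataSalt);
```

Because the data key depends on the salt, a new salt gives a new data key without changing the master key.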

Notes:

  1. The PBKDF2 implementation in sjcl is slow. However, sjcl caches the key derivation output for a given set of parameters (password, salt, iteration count), so reusing the same derivation parameters multiple times is faster than expected.
  2. The salt of the data encryption key changes when Joplin restarts. This gives some resistance to nonce reuse, because the derived data encryption key changes with the salt. However, we could also set a maximum size threshold that forces a salt reset, in order to increase security.
  3. The chunk size is 5k bytes because sjcl is slow. Since a native encryption library gives much higher encryption speed (base64 encoding/decoding not included), we can raise the chunk size to something like 16k or 64k.
  4. The size overhead should be around 33.3% due to the base64 encoding. However, when checking the ciphertext on the sync target, I found that the size overhead is more than 77.7%. It looks like the base64 encoding is performed twice (4/3 × 4/3 ≈ 1.78). I'm now looking into the cause.
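
The size threshold idea from note 2 could look something like the sketch below. The class name, the byte budget and the iteration count are all hypothetical, and the salt length, digest and key length are borrowed from the proposed parameters later in this post rather than from the current implementation.

```typescript
import { pbkdf2Sync, randomBytes } from 'node:crypto';

// Hypothetical helper: derives a data key from the master key and
// forces a new salt once a byte budget has been used up.
class RotatingDataKey {
  private readonly masterKey: Buffer;
  private readonly maxBytesPerSalt: number; // hypothetical threshold
  private readonly iterations: number;      // placeholder value
  private salt: Buffer = randomBytes(16);
  private key: Buffer | null = null;
  private bytesEncrypted = 0;

  constructor(masterKey: Buffer, maxBytesPerSalt: number, iterations: number) {
    this.masterKey = masterKey;
    this.maxBytesPerSalt = maxBytesPerSalt;
    this.iterations = iterations;
  }

  // Returns the salt and key to use for the next chunk of `chunkSize` bytes.
  next(chunkSize: number): { salt: Buffer; key: Buffer } {
    if (this.key === null || this.bytesEncrypted + chunkSize > this.maxBytesPerSalt) {
      // Budget exceeded (or first use): pick a fresh salt and re-derive.
      this.salt = randomBytes(16);
      this.key = pbkdf2Sync(this.masterKey, this.salt, this.iterations, 32, 'sha512');
      this.bytesEncrypted = 0;
    }
    this.bytesEncrypted += chunkSize;
    return { salt: this.salt, key: this.key };
  }
}

const keys = new RotatingDataKey(randomBytes(256), 128 * 1024, 1000);
const first = keys.next(64 * 1024);
const second = keys.next(64 * 1024); // still within the 128k budget
const third = keys.next(64 * 1024);  // budget exceeded, salt is reset
```

The derivation only runs when the salt changes, so the extra PBKDF2 cost is paid once per budget rather than once per chunk.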

Future Considerations

The key derivation function: PBKDF2

  • PBKDF2 is widely supported, so I chose it. It is also what sjcl uses, so the security of this part won't decrease. There are a few better key derivation functions, like scrypt and Argon2, but they are not available in both node:crypto and react-native-quick-crypto.
  • Increase the PBKDF2 iteration count for encrypting the master key from 10000 to 210000 (suggested by OWASP). The native implementation of PBKDF2 is around 200 times faster than sjcl, so with this iteration count we get both higher security and faster speed.
  • Change the salt length from 64 bits to 128 bits (or higher). 64 bits is the minimum requirement for PBKDF2, and we'd like to stay above that boundary. A 128-bit salt is also widely used now.
  • Change the digest algorithm from SHA-256 to SHA-512 for a longer derived key. The extra part could be used in the future.
  • scrypt might be a better choice once it's available in react-native-quick-crypto.
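
As a rough illustration of the parameters above with node:crypto (the function name and the way the output is split are my own assumptions, not existing Joplin code):

```typescript
import { pbkdf2Sync, randomBytes } from 'node:crypto';

const PBKDF2_ITERATIONS = 210000; // OWASP-suggested count mentioned above
const SALT_BYTES = 16;            // 128-bit salt
const DERIVED_BYTES = 64;         // full SHA-512 output length

// Hypothetical helper: derives 64 bytes, uses the first 32 as the AES-256
// key and keeps the remaining 32 bytes as the "extra part" for future use.
function deriveKeyMaterial(password: string, salt: Buffer): { aesKey: Buffer; spare: Buffer } {
  const material = pbkdf2Sync(password, salt, PBKDF2_ITERATIONS, DERIVED_BYTES, 'sha512');
  return { aesKey: material.subarray(0, 32), spare: material.subarray(32) };
}

const salt = randomBytes(SALT_BYTES);
const derived = deriveKeyMaterial('my password', salt);
```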

The cipher and the mode: AES-GCM

  • AES-GCM is widely supported, so I chose it. There are a few better ciphers/modes, but they are not available in node:crypto/react-native-quick-crypto.
  • Change the key size from 128 bits to 256 bits. We tried this before but reverted it due to performance issues. With the fast native implementation, we can now use it.
  • Change the IV size to 12 bytes (96 bits), which is the recommended size for AES-GCM. We could use a longer IV, but AES-GCM would then hash it (with GHASH) down to its internal counter block, so a longer IV brings no benefit. As for potential nonce (IV) collisions, we mitigate them by encrypting with a key derived from the master key and a random salt.
    • The default IV size in sjcl is 16 bytes, but the old CCM mode only uses the first 13 bytes. We use 12 bytes for GCM, so a collision is slightly more likely (though still very unlikely), but as noted above we will use a longer and frequently refreshed salt to mitigate this.
  • Change the authentication tag size from 64 bits to 128 bits (suggested in this paper).
  • As far as I know, the known weaknesses of AES-GCM only concern leaking the authentication key, which doesn't matter for Joplin.
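
A minimal sketch of these cipher parameters with node:crypto might look like this. The `Sealed` shape and the function names are hypothetical, and the real on-disk format is not decided here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

const IV_BYTES = 12;  // 96-bit IV, as recommended for GCM
const TAG_BYTES = 16; // 128-bit authentication tag

// Hypothetical container for the pieces needed to decrypt.
interface Sealed { iv: Buffer; ciphertext: Buffer; tag: Buffer }

function encryptGcm(key: Buffer, plaintext: Buffer): Sealed {
  const iv = randomBytes(IV_BYTES); // fresh random IV for every call
  const cipher = createCipheriv('aes-256-gcm', key, iv, { authTagLength: TAG_BYTES });
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

function decryptGcm(key: Buffer, sealed: Sealed): Buffer {
  const decipher = createDecipheriv('aes-256-gcm', key, sealed.iv, { authTagLength: TAG_BYTES });
  decipher.setAuthTag(sealed.tag);
  // final() throws if the tag does not match the key, IV and ciphertext.
  return Buffer.concat([decipher.update(sealed.ciphertext), decipher.final()]);
}

const key = randomBytes(32); // stand-in for a derived 256-bit data key
const sealed = encryptGcm(key, Buffer.from('secret note', 'utf8'));
const roundTrip = decryptGcm(key, sealed).toString('utf8');
```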

Please feel free to discuss the old/new encryption parameters.


AES-GCM and PBKDF2 both seem to be supported by the web crypto.subtle API and, as such, should also both be usable in the web port of Joplin mobile.

Edit: It looks like crypto.subtle is also supported natively in Node.js. Would it make sense to use this API for the CLI and desktop apps as well?


Is there any reason to use crypto.subtle rather than node:crypto? I have tested the latter and it works fine.

Unlike node:crypto, crypto.subtle should work in a web browser. As such, crypto.subtle might allow using the same encryption/decryption logic on desktop, CLI, and web.

Edit: Additional notes:

I'm not familiar with these various libraries, but one thing to note is that we don't have to use crypto.subtle for the native encryption - it's just that ideally we should check that whatever library configuration we use can be compatible with crypto.subtle.

It's the same situation we have now for RSA - we use two different libraries, one for mobile, one for desktop, but both can decrypt what the other library encrypts.

I would assume that as long as we use standard, widely used algorithms, there's a good chance they are supported by crypto.subtle too.
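
One way to check that kind of compatibility is to encrypt with one library and decrypt with the other. Below is a rough sketch that encrypts with node:crypto and decrypts with crypto.subtle (via Node's `webcrypto` export). The function name is hypothetical, the parameters are the ones proposed earlier in this thread, and it relies on Web Crypto expecting the GCM tag appended to the ciphertext.

```typescript
import { createCipheriv, pbkdf2Sync, randomBytes, webcrypto } from 'node:crypto';

// Hypothetical check: returns true if data encrypted with node:crypto
// can be decrypted through crypto.subtle using the same parameters.
async function checkWebCryptoInterop(password: string, message: string): Promise<boolean> {
  const salt = randomBytes(16);
  const iv = randomBytes(12);

  // Encrypt with node:crypto (PBKDF2-SHA512 + AES-256-GCM).
  const nodeKey = pbkdf2Sync(password, salt, 210000, 32, 'sha512');
  const cipher = createCipheriv('aes-256-gcm', nodeKey, iv, { authTagLength: 16 });
  const ciphertext = Buffer.concat([cipher.update(message, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();

  // Derive the same key with crypto.subtle.
  const subtle = webcrypto.subtle;
  const baseKey = await subtle.importKey(
    'raw', Buffer.from(password, 'utf8'), 'PBKDF2', false, ['deriveKey'],
  );
  const webKey = await subtle.deriveKey(
    { name: 'PBKDF2', hash: 'SHA-512', salt, iterations: 210000 },
    baseKey,
    { name: 'AES-GCM', length: 256 },
    false,
    ['decrypt'],
  );

  // Web Crypto takes ciphertext and tag as one buffer; decrypt() rejects
  // if authentication fails.
  const plain = await subtle.decrypt(
    { name: 'AES-GCM', iv, tagLength: 128 },
    webKey,
    Buffer.concat([ciphertext, tag]),
  );
  return Buffer.from(plain).toString('utf8') === message;
}
```

The same idea should apply to react-native-quick-crypto, but that would need to be tested on a device.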


I don't think it's a good idea to use crypto-browserify, because it's a pure JS implementation and could be very slow.

By the way, I wonder whether web support is on Joplin's roadmap. If so, I will run some compatibility tests for the selected parameters on node:crypto, react-native-quick-crypto and the Web Crypto API.

The source of the extra size overhead

The file content is actually base64 encoded twice during the encryption flow.

  1. When reading the file data, the result is a base64 encoded string. (source code)
  2. When encrypting, sjcl.json.encrypt() treats the plainText as a normal UTF-8 string rather than as base64-encoded data. After the encryption, the ciphertext is base64 encoded. (source code)

To remove the unnecessary size overhead, we could first convert the base64 string from step 1 to a bitArray and then encrypt it with sjcl.json.encrypt(). However, this introduces an extra step and slows down the encryption process.
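
The size effect can be reproduced without any encryption at all. The sketch below uses Node's Buffer instead of sjcl and ignores the IV, tag and JSON wrapper, so it only shows the base64 part of the overhead. The 30000-byte input is an arbitrary stand-in for a resource file.

```typescript
import { randomBytes } from 'node:crypto';

const raw = randomBytes(30000); // stand-in for the file content

// Step 1: reading the file yields a base64 string.
const once = raw.toString('base64');

// Step 2: the base64 string is treated as plain UTF-8 text and the
// result is base64 encoded again.
const twice = Buffer.from(once, 'utf8').toString('base64');

const overheadOnce = once.length / raw.length - 1;   // about 1/3
const overheadTwice = twice.length / raw.length - 1; // about 7/9

// Decoding the base64 string back to bytes before encrypting means the
// data is only base64 encoded once, at the cost of an extra decode step.
const decoded = Buffer.from(once, 'base64');
const overheadAfterDecode = decoded.toString('base64').length / raw.length - 1;
```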


I think so — the proof of concept is going well and seems to support most of what the mobile client does (though work is still needed to make it easier to use on a desktop computer).
