New version of Joplin contacting Google servers on startup

Hmm, e.g. I don't have the English (Canada) dictionary on my disk, b/c I never downloaded it (blocked the connection). And when I turn off spellcheck and restart Joplin, there's no connection request to download the dictionary.

I don't think it is ever unfair to ask serious questions, however unwelcome. Nor do I assume that there is any wrongdoing or wrong thinking on the devs' side. BUT there could be some implicit negligence when binding / using third-party libraries or server requests, in particular when this happens out of the blue in order to add new features like a dictionary lookup. The more "services" you use, the higher the chance that one of these free-of-charge dictionaries, or some other service, will charge you in unexpected ways.

High horses? I am not Bruce Schneier, and when I reread my three quoted lines, I cannot see any wrongdoing, ill intent or unfair questions.

So, back to square one: is it really worth having Joplin lately contact 7 different servers, for 19 different purposes, when it "seems" obvious that Joplin could happily live with one cloud storage alone and do a perfect job?

The dev does what he wants to do; it is his project and hobby horse. The board is allowed to raise concerns.

Otherwise, as I have said many times before: Joplin is a great application, and I do hope that it will eventually make money for the developers, like Skype managed to do after many years of sweating.

Back to the main subject, because I may have missed something. I just checked my logs a moment ago. Today the Joplin helper alone has contacted 7 sites, none of which I wanted or needed, among them, for example, cwl.cc.
Me, I turn this off using an outgoing firewall. But short of this, how can a layman user turn off the cwl.cc requests (or any other) within Joplin?

Dear ajay, I cannot write you a pm so I have to do it publicly. Sorry for that.

I get that you might be upset or frustrated, things get in the way and you might feel especially bad about those background network calls. I get it, stuff happens. But please take extra care of yourself before coming here.

Without a cool mind, engaging in fights with the founder and 90%+ code contributor of the project might just get you a warning. And that would not have anything to do with the questions you're raising.

What I want to say here is: please be mindful of the tone of your posts. I understand the questions might be serious for you. Other people are already tackling them, so there's no need for such strong persuasion.

I imagine your post might have ruffled some feathers, but in order to mitigate at least some damage, how about we remove this post for now and, once you feel well enough, you can try writing another one? So far, I have had to flag it as inappropriate.

We're all just people here. None of us are paid to be here or obligated by a contract to do a specific thing. We all have feelings. And if you can do me one favour, could you kindly respect that?

Just thought I'd chime in here as someone who casually enjoys reading the threads that are highlighted in the email summary. While I am not averse to limiting the data I send to Google, etc., that's not why I came to Joplin. I like the interface, I like the markdown, I like the syncing with my phone via Dropbox, I like that my note data lives on my device, and I like that it is free. The fact that it is open source and has a great developer community with good communication is a bonus. So these are the reasons I would and do recommend it to others. :slight_smile:

Was this a test of your macOS install or of the test Linux VM? Was the Joplin Electron instance already initialized with the .config/Joplin folder?

I did a bunch of additional testing. It seems the key to this issue is the presence of the dictionary file (/home/user/.config/Joplin/Dictionaries/en-US-9-0.bdic). If it exists, Joplin will not contact Google to grab it. If it does not, it will.

Testing:

  1. Remove the Joplin config folder (/home/user/.config/Joplin/).
    This is an important step because it will force Electron to initialize the client on next startup without any cookies or other saved configurations.
  2. Block redirector.gvt1.com
  3. Start Joplin:
    o observe that it pings redirector.gvt1.com.
    o observe that it creates a fresh /home/user/.config/Joplin/ folder.
    o observe that /home/user/.config/Joplin/Dictionaries is empty
  4. Close Joplin and open it again.
    o observe that it pings redirector.gvt1.com.
  5. Remove the block to redirector.gvt1.com.
  6. Start Joplin
    o observe that Joplin has contacted the Google server and downloaded the dictionary file en-US-9-0.bdic
  7. Start Joplin
    o observe that it does not contact redirector.gvt1.com because that dictionary file already exists.
  8. Delete the dictionary file en-US-9-0.bdic
  9. Start Joplin
    o observe that it tries to contact redirector.gvt1.com.
  10. Delete the dictionary file
  11. Manually copy the dictionary file en-US-9-0.bdic
  12. Start Joplin
    o Observe that it does not contact redirector.gvt1.com
  13. Start Joplin, turn spell checker off, Close Joplin
  14. Delete en-US-9-0.bdic
  15. Start Joplin
    o observe that it still tries to contact redirector.gvt1.com.
  16. Stop Joplin.
  17. Manually copy the dictionary file en-US-9-0.bdic
  18. Start Joplin
    o observe that it does not contact redirector.gvt1.com.

If the dictionary file does not exist, Joplin will try to go out and grab it. If it does exist Joplin will not contact the Google servers.
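The rule observed in the tests above can be written down as a small check. This is only a minimal Python sketch of that observation, not anything Joplin itself ships: the function names are made up for illustration, and the folder and file name are the Linux ones quoted in this thread, so they are assumptions on any other system or Joplin version.

```python
import os
from pathlib import Path

# Assumed defaults, taken from the Linux paths quoted in this thread.
DEFAULT_CONFIG_DIR = Path(os.path.expanduser("~/.config/Joplin"))
DICTIONARY_NAME = "en-US-9-0.bdic"


def dictionary_path(config_dir=DEFAULT_CONFIG_DIR, name=DICTIONARY_NAME) -> Path:
    """Return the location where the tests above found the dictionary file."""
    return Path(config_dir) / "Dictionaries" / name


def expects_download(config_dir=DEFAULT_CONFIG_DIR, name=DICTIONARY_NAME) -> bool:
    """Predict whether Joplin will try to reach redirector.gvt1.com on startup.

    Per the tests above, the request happens exactly when the dictionary
    file is missing, whether or not the spell checker is switched on.
    """
    return not dictionary_path(config_dir, name).is_file()
```

Calling `expects_download()` before starting Joplin gives a quick way to tell whether the next launch is likely to trigger the request, assuming the behaviour described above holds on your platform.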

So what does this mean?

  1. The Joplin spellchecker feature is turned on by default. However, at least on Linux, Joplin does not ship with the dictionary file needed for this feature. As long as that file does not exist, Joplin will continue to contact the Google server to grab it.
  2. From looking at the contents of the Joplin config folder, it appears that Electron (Joplin) does indeed set up transport security and persistent network connections to the Google servers likely including a unique client fingerprint ID which can be used for tracking. This data is indeed sent over port 443 (https) so it is encrypted and the user cannot easily know what data is being sent.

Possible developer solutions:

  1. Allow users to opt-in to the spell checker feature (turn the spell checker feature off by default)
  2. Ship a default dictionary?

User workarounds:

The only way I currently see to use Joplin without contacting Google servers during that first initialization is to:

  1. Block redirector.gvt1.com
  2. Start Joplin, stop Joplin
  3. Manually copy the dictionary file en-US-9-0.bdic from a previous installation

After this, Joplin seems happy and does not contact any Google servers. It may try to contact them again when a new dictionary is available; however, the firewall rule will prevent this.
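Step 3 of the workaround can be scripted. The following is a minimal Python sketch, assuming the dictionary belongs in the `Dictionaries` subfolder of the Joplin config folder as observed in the tests above. `install_dictionary` is a hypothetical helper name, and the backup location in the usage note is only an example.

```python
import shutil
from pathlib import Path


def install_dictionary(source, config_dir) -> Path:
    """Copy a saved .bdic file into <config_dir>/Dictionaries and return its new path."""
    source = Path(source)
    if not source.is_file():
        raise FileNotFoundError(f"No dictionary file at {source}")
    target_dir = Path(config_dir) / "Dictionaries"
    # Create the folder if Joplin has not created it yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    # copy2 keeps the file's timestamps along with its contents.
    shutil.copy2(source, target)
    return target
```

For example, with Joplin closed and the firewall rule still in place, `install_dictionary("/path/to/backup/en-US-9-0.bdic", "/home/user/.config/Joplin")` would put the file where the tests above expect it (the backup path here is a placeholder).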

This approach puts a heavy burden on the user to know exactly how to configure Joplin and requires the use of 3rd party software just to enjoy a key benefit of using Joplin that they had previously.

Personally I would like to see an opt-in approach to plugins. I use Joplin as a note-taking application. I am not doing word processing or drafting letters in Joplin so I have never needed a spell checker. I do like having the options for one and I do see how others would find this useful.

I do not use Joplin simply because it is free; I do care about my data and digital footprint. I use Joplin because it solves real-world problems as a secure and private open source note-taking application. I read the privacy policy (which is prominently displayed on third-party websites such as Notebooks - Privacy Guides) and I appreciate the vibrant and active developer and user community.

An opt-in approach to features that have privacy implications might be a nice compromise, as it gives each user a choice and allows them to make an informed decision. Previously, a user could choose to use Joplin and choose not to use Google services, and I would like to preserve that choice.

Thanks ioojoplin for your testing and Daeraxa for your link. I like to have a spell checker feature.

However, the documentation about the spellchecker is very short and clear. My first guess is that this is an Electron default, but I wonder:

Could the Joplin developers have been unaware that enabling the spellchecker uses Google services, and of the way to avoid this?

Is it possible to switch to open source spellcheckers with corresponding dictionaries?
From the look of it, Open Office etc. seem to use them as well.
nlp - Open source spell check - Stack Overflow

Don't know how much work it would be to integrate them, though.
Or how it would be prioritised in the grand scheme of things.

Open Office uses Hunspell, which is the same checker that Joplin uses :slight_smile: Google just provides the CDN that the dictionaries are pulled from. It's possible to use an alternative source, but it's not clear if anyone else actually hosts these dictionaries for the same use.

Ah I get it, thnx.
Hosting them on Joplin Cloud could lead to unforeseen bandwidth costs as well?

Hmmm, compared to the executable installer it's peanuts, maybe?
Don't know if dictionaries update frequently, though.

I was going to start a new thread to ask what the freegeoip.app connection is that I have noticed since Joplin 2.6.9 (prod, darwin), and here I am in a very active thread.
Little Snitch shows this list when opening Joplin:

With the blocked ones Joplin still works nicely.

freegeoip.app is the one that I find strange!
I searched the web again:

What is freegeoip.app, is a JSON-based IP lookup REST API that enables developers to retrieve data about an IP Address. How many requests can I make? Our IP geolocation API provide you with a generous request volume. We allow you to make 15 000 requests per hour via your apikey. Are the data up-to-date?

I don't like what is written there (the part I have put in bold), so it's good that I have blocked it from the beginning. But maybe I am wrong?

freegeoip is used if you have enabled saving location data with your notes (I think it's on by default). Disable it and it should go away.

Thanks, then it all seems quite normal and used purely for functionality. :wink:

I wonder where these Discourse requests come from though, unless it's from when you clicked on the forum link from the app.

No, they only pop up in Little Snitch on Joplin startup. I have just done a new test.

Maybe a note with a link to a hosted image?

I am also using LS and I can't see any requests to discourse.

@laurent, maybe we should list the domains in the privacy.md file. We already do that for the spellchecker.

Update: It looks like the privacy.md file is not used in the documentation. The README.md itself has a privacy section where half of the info is missing. Somehow there's a bit of a mixup, duplicated but inconsistent info.

I really think everyone thinks that they are the average user. In my opinion, having the app go somewhere looking for a dictionary file is only a privacy concern if you are somehow trying to hide from any network connections. This is a far cry from Google mining our emails to show us ads. IMO, anyone concerned with this level of "privacy" should be protecting themselves through more reliable means.

But by all means if the privacy statement can be tweaked so people don't get a false sense of security, then that's great.

Signed, an "average" user :grin:

If the results posted recently by ioojoplin are robust across platforms, and I actually understood those results, couldn't we avoid the problem by simply supplying an empty dictionary file with the default installation, and then if the user turns the spellchecker on, warn the user about privacy issues, then delete the empty dictionary, and allow electron to grab the real dictionary? Sorry if I missed something important that messes up my strategy for avoiding problems.
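To make the idea concrete, here is a rough Python sketch of the placeholder strategy. It is only an illustration with made-up function names, and it rests on an untested assumption: that Electron treats an empty file the same way it treated the real dictionary in ioojoplin's tests. Those tests only covered a real en-US-9-0.bdic file.

```python
from pathlib import Path

DICTIONARY_NAME = "en-US-9-0.bdic"  # file name quoted earlier in this thread


def install_placeholder(config_dir, name=DICTIONARY_NAME) -> Path:
    """Create an empty dictionary file, unless some file is already there."""
    target = Path(config_dir) / "Dictionaries" / name
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        target.touch()
    return target


def enable_spellchecker(config_dir, user_consents, name=DICTIONARY_NAME) -> bool:
    """Remove the empty placeholder once the user has accepted the privacy warning.

    Returns True if a placeholder was removed, meaning the next startup
    would be expected to fetch the real dictionary.
    """
    target = Path(config_dir) / "Dictionaries" / name
    if not user_consents:
        return False
    # Only delete an empty placeholder, never a real dictionary file.
    if target.is_file() and target.stat().st_size == 0:
        target.unlink()
        return True
    return False
```

The size check is there so that a real dictionary that is already installed is left alone.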

By the way, I really like Joplin. I have used it every day for a couple of years now on 3 or 4 devices, I care about the privacy issues, and I donate monthly. My thanks to all of you!
