Manage a huge (30GB) note collection - personal vs non personal data

Hello world,

I am looking for some advice as a new Joplin user migrating from Evernote with a huge database (30GB, with about 30,000 notes).

It's been less than a week, but I love Joplin so far. Thank you so much for this great tool!
To try it out, I'm currently using an E2EE sync through OneDrive, with PCs on Win11 and macOS, and phones/tablets on iOS and Android. I plan to buy Joplin Cloud soon.

However, I am afraid that my big collection of notes, and the workflow I am used to after 12 years with Evernote, will create problems.
From this poll on Joplin collection size, I see I'm on the higher end in terms of storage needs. From several posts, I see that the phone apps (iOS / Android) tend to lag with larger collections.

And that should be expected, since all Joplin notes are synced every time I open the app. On a mobile device, with its bandwidth / CPU / memory limitations, that can definitely create lag.

In addition, encryption seems quite costly in terms of local storage: from what I have seen, each resource file, when downloaded locally, is stored as both an encrypted and an unencrypted copy, and the encrypted copy is about 2.5 times larger than the original file.
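
As a rough back-of-the-envelope sketch of what that observation would imply if every attachment were downloaded to a device: the 2.5 ratio is only the figure observed above, not a documented one, and the helper name `estimate_local_storage_gb` is made up for illustration.

```shell
# Rough estimate of on-device storage if each attachment is kept twice:
# one plain copy plus one encrypted copy that is ~2.5x the original size.
# The 2.5 ratio is the observation from this post, not a documented figure.
# estimate_local_storage_gb is a hypothetical helper name.
estimate_local_storage_gb() {
    size_gb="$1"            # size of the original attachments, in GB
    ratio_x10="${2:-25}"    # encrypted/original ratio times 10 (25 = 2.5x)
    # plain copy (10/10) + encrypted copy (ratio_x10/10), integer arithmetic
    echo $(( size_gb * (10 + ratio_x10) / 10 ))
}

estimate_local_storage_gb 30   # prints 105
```

Under those assumptions, a 30GB collection made up mostly of attachments would need on the order of 105GB locally, which is more than many phones can spare.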

What is my use case?
For 10+ years I have used Evernote as an easy dump for everything I ever imagined I could need to keep. With virtually no limitations on storage, I would save all my personal files both on my OneDrive/Google Drive and as a note in Evernote (redundancy). I would also copy any "public / semi-public" document (news articles, research reports, wikis...) into Evernote, both to find them again more easily and because web pages tend to change without warning, so saving only a URL is useless.

In my use case, a big chunk of my note collection (probably two thirds to three quarters) consists of those pages and documents copied from the Internet. They do not really require encryption. They do not need to be downloaded to my phones every time: Evernote's feature of downloading them only when requested from a search was enough (although unfortunately quite buggy).
But the rest of my notes do need both encryption (they are personal, and sometimes really confidential) and syncing (I need to access some of them easily from any of my phones/tablets).

I understand Joplin is not designed for a similar large-scale "data dump" approach.
So how can I adapt my workflow?

Should I split my note collection in two, with all the "public / semi-public, non-critical" stuff on a separate Joplin profile synced with OneDrive without encryption, and all my personal stuff on Joplin Cloud with E2EE?
With this solution I would sync my phones only with my personal stuff, and the non-critical stuff would no longer be accessible by phone (except in case of emergency). The storage of my personal stuff would become redundant again (on Joplin Cloud and manually saved in my OneDrive).
But then I lose the ability to search my stuff in a single place (e.g. today, I tag the relevant web articles and tutorials and the notes on my work with the same project name, so I can find both in a single easy step).

It doesn't seem possible to segregate the notes in a similar fashion by notebook, does it? For example, having one notebook synced with the phones and not the other? Or maybe one notebook encrypted and not the other?

Thanks for your ideas and feedback

One option might be changing "attachment download behavior" from "Always" to "Automatic" before syncing a mobile device.

Doing this causes the device to download a resource only after you open a note it is attached to. (It should have no effect on already-downloaded resources.)

See also Feature suggestion: Mobile: Change default attachment download behavior from "Always" to "Automatic"


Thank you. I did take note of this and selected "Automatic" for the attachment download behavior.

However, my main concern is about the number of notes themselves, which seems to have an impact on the time needed to sync and to open / search / edit on the mobile apps.

The other, hopefully minor, concern is that the mobile app seems to keep automatically downloaded attachments indefinitely (and keeps two copies of each, the encrypted and the unencrypted).

Privacy

If your main concern is privacy regarding your phone, then consider this. I have everything encrypted, probably like you. LUKS on my Linux machines, my rooted phone is encrypted while running LineageOS, and I keep nothing on my phone aside from a calendar, and a few apps for creature comforts. This is because encryption only benefits you if your phone doesn't have the encrypted drive open. If you turn off your phone, and turn it back on, everything remains encrypted, but once you unlock the phone, nothing remains encrypted, unless you have another encrypted file/database (e.g. KeePass) stored on your phone.

My suggestion is to have nothing on your phone at all, or at least very minimal information, as it's the easiest to get lost or stolen.

Performance [Self-Hosted]

I self-host Joplin Sync Server on my server at home, and it performs well. Searching for notes is quick, and my resource folder is around 30GB with over 8,000 notes. So, if you get your own server, you can mitigate costs and improve performance. All you'd need is a used gaming PC with a decent CPU and RAM. Install a Linux distribution, then install docker and docker-compose, and then use the docker-compose.yml file found in the Joplin GitHub repo to spin up your Joplin Sync Server. Then you can use a simple Caddy web server and a domain name to make your own syncing service. It takes only a few moments if you have a good guide. With that method, you can use encrypted RAID disks, all of your information is stored at home under lock and key, and you can also run all of this on your local network only, instead of over the internet, if you want greater security.

You should avoid using paid services like OneDrive, Google Drive, and other stuff because, well, you get a lot more bang for your buck with a moderate amount of effort. Here's a bullet point guide on what you'd need to do:

  • Get a domain name from NameCheap or somewhere else. I like NameCheap because it works with DDClient
  • Get a used computer and new hard drives to use as a server
  • Install a Linux distro
  • Install Docker, Docker-Compose, DDclient, Caddy, UFW (uncomplicated firewall), TLDR (program to show popular commands for software)
  • Go to the Joplin GitHub Repo and locate the docker-compose.yml for the Joplin Server.
  • Make a directory on your Linux machine called /etc/joplin-server-docker/ and copy the contents of the repo's docker-compose.yml into a file named docker-compose.yml in that directory
  • Edit the passwords in the docker compose file, below is an example:
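version: '3'

services:
  postgres:
    image: postgres:15
    # Postgres listens on 5432 inside the container; it is published on host port 5430.
    expose:
      - 5432
    ports:
      - 5430:5432
    environment:
      # User, password and database must match the values used by the joplin service below.
      - POSTGRES_USER=joplinpostgres
      - POSTGRES_PASSWORD=ABCDEFGHIJKLMNOPQRSTUVQXYZ
      - POSTGRES_DB=postgres
    volumes:
      - /path/to/your/big/hard-drive/joplin/postgres:/var/lib/postgresql/data
    restart: unless-stopped

  joplin:
    image: joplin/server:latest
    volumes:
      - /path/to/your/big/hard-drive/joplin/data:/data
    depends_on:
      - postgres
    ports:
      - 22300:22300
    environment:
      # Public URL that clients will use to reach the server.
      - APP_BASE_URL=https://your-website.com/
      - APP_PORT=22300
      - DB_CLIENT=pg
      # Containers talk to each other over the compose network, on the internal port 5432.
      - POSTGRES_HOST=postgres
      - POSTGRES_USER=joplinpostgres
      - POSTGRES_PASSWORD=ABCDEFGHIJKLMNOPQRSTUVQXYZ
      - POSTGRES_DATABASE=postgres
      - POSTGRES_PORT=5432
    restart: unless-stopped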

  • Start UFW, then allow port 80, and 443 to be accessed. ufw allow 80 and ufw allow 443
  • For allowing local machines to access your server, follow this format:
    ufw allow from 192.168.0.0/24 to any port 22300 comment "software_name"
  • You can see your changes with ufw status numbered
  • Go to your router firewall and allow ports 80 and 443 to be accessed to your Linux server's IP (ip addr command will show you what your machine's local IP is).
  • Go to /etc/caddy/Caddyfile and paste the block below. It takes your local Joplin Server and exposes it to the internet under the domain name that you bought.
{
        admin off
}

your_website_name.com {
        reverse_proxy localhost:22300
}

  • Your residential IP address may change, so you can use DDClient as an IP address updater that tells NameCheap your new IP.
  • Every time you run the ddclient command, it refreshes your IP address with NameCheap. An example config from /etc/ddclient/ddclient.conf is below:
use=web, web=dynamicdns.park-your-domain.com/getip
protocol=namecheap
server=dynamicdns.park-your-domain.com
login=[your website name]
password=[long password autogenerated on NameCheap admin Dashboard]
*, @
  • After all those steps, you are on your way to being your own system administrator. There are other projects you can self-host to make your life cheaper and more private. You can see a large list on Awesome Selfhosted, found on GitHub.
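
For the "edit the passwords" step in the list above, here is a minimal shell sketch of one way to do it. It assumes the compose file still contains the placeholder password from the example, and the function names `gen_password` and `set_compose_password` are made up for illustration; they are not part of Joplin or Docker.

```shell
#!/bin/sh
# Sketch: generate a random password and substitute it for the placeholder
# used in the example compose file. Both function names are hypothetical.

# Print a random alphanumeric password of the requested length (default 32).
gen_password() {
    length="${1:-32}"
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length"
    echo
}

# Replace every occurrence of the placeholder password in a compose file.
# Usage: set_compose_password FILE NEW_PASSWORD
set_compose_password() {
    file="$1"
    new_password="$2"
    placeholder='ABCDEFGHIJKLMNOPQRSTUVQXYZ'
    # An alphanumeric password contains no characters that are special to sed.
    sed "s/${placeholder}/${new_password}/g" "$file" > "${file}.tmp" \
        && mv "${file}.tmp" "$file"
}
```

For example, `set_compose_password /etc/joplin-server-docker/docker-compose.yml "$(gen_password)"` would put the same new password in both services. After that, running `docker-compose up -d` from that directory starts the containers in the background.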

Maybe the best option is to separate your notes into two vaults: those that need to be synced and those that don't. Maybe losing the ability to search in a single place is acceptable. If you don't need those notes on your phone, there is also no need for a search function to find them there. But on your desktop computer, switching between two vaults does add some inconvenience.


Many thanks for the tutorial.
I'm using OneDrive temporarily for now. I plan to pay for Joplin Cloud, both because it will provide some redundancy and because I'll be happy to give some money to support that wonderful software.

However I love this part of your answer :

So, if you get your own server, you can mitigate costs and improve performance. All you'd need is
... (proceeds with a 3-page description of the work to be done)

Unfortunately, although I would enjoy spending hours doing what you described, I don't really have the time or the experience (in other words, my experience is limited, so it would take me much longer to implement it).
Also my previous experience in having a small (Raspberry-Pi-based) server in my home is that it requires more maintenance than I can afford. (eg that time the cleaning lady at my place unplugged the box and the server to vacuum the room while I was working for a client 6000 kilometers from there... And the damn server did not restart properly).