High disk read of joplin-server

jade12 · 19 September 2024 10:41

Operating system

Linux

Joplin version

2.14.2

What issue do you have?

I'm running joplin server in a docker. It constantly read file at 70~100MB/s, which can be confirmed by htop, iotop and docker stats.

Currently I only found this error

joplin-server  | 09:31:51 0|app  | 2024-09-19 09:31:43: [error] TaskService: Failed running task #12 (Process emails) KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full
. Are you missing a .transacting(trx) call?
joplin-server  | 09:31:51 0|app  |     at Object.queryBuilder (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:112:26)
joplin-server  | 09:31:51 0|app  |     at createQueryBuilder (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:320:26)
joplin-server  | 09:31:51 0|app  |     at TaskStateModel.knex [as db_] (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:101:12)
joplin-server  | 09:31:51 0|app  |     at TaskStateModel.<anonymous> (/home/joplin/packages/server/src/models/TaskStateModel.ts:16:15)
joplin-server  | 09:31:51 0|app  |     at Generator.next (<anonymous>)
joplin-server  | 09:31:51 0|app  |     at /home/joplin/packages/server/dist/models/TaskStateModel.js:8:71 {
joplin-server  | 09:31:51 0|app  |   sql: undefined,
joplin-server  | 09:31:51 0|app  |   bindings: undefined,
joplin-server  | 09:31:51 0|app  |   originalStack: 'KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?\n' +
joplin-server  | 09:31:51 0|app  |     '    at Client_SQLite3.acquireConnection (/home/joplin/packages/server/node_modules/knex/lib/client.js:332:26)\n' +
joplin-server  | 09:31:51 0|app  |     '    at runNextTicks (node:internal/process/task_queues:60:5)\n' +
joplin-server  | 09:31:51 0|app  |     '    at processTimers (node:internal/timers:509:9)\n' +
joplin-server  | 09:31:51 0|app  |     '    at Runner.ensureConnection (/home/joplin/packages/server/node_modules/knex/lib/execution/runner.js:305:28)\n' +
joplin-server  | 09:31:51 0|app  |     '    at Runner.run (/home/joplin/packages/server/node_modules/knex/lib/execution/runner.js:30:19)'
joplin-server  | 09:31:51 0|app  | }

And later I saw this consistently in the log: 2024-09-19 10:30:00: [warn] TaskService: Tried to start #12 (Process emails) but it was already running

Note: I saw a similar error/exception for #11 (Process shared items), but followup runs were successful.

I also verified that there are enough free memory, so this is probaby not about swap.

I'm still investigating, and I'll try v3 later.

Questions:

Is this a known issue?
What is the easist way of disabling task #12?
How can I check which file is being read? strace showed something about sqlite journals, but I think it's for #11

jade12 · 19 September 2024 10:51

About swap, I only verified the system-wide free memory. Perhaps it is possible that joplin server is using all memory within the docker limit? Not sure how I can check that.

jade12 · 19 September 2024 20:26

~~I investigated further. It seems to be a general problem of docker. I'm closing it for now.~~
Continuing invetigation.

I observed that task #11 is constantly failing:

joplin-server  | 08:31:16 0|app    | 2024-09-20 08:31:08: [error] TaskService: Failed running task #11 (Process shared items) KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is proba
bly full. Are you missing a .transacting(trx) call?
joplin-server  | 08:31:16 0|app    |     at Object.queryBuilder (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:112:26)
joplin-server  | 08:31:16 0|app    |     at createQueryBuilder (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:320:26)
joplin-server  | 08:31:16 0|app    |     at TaskStateModel.knex [as db_] (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:101:12)
joplin-server  | 08:31:16 0|app    |     at TaskStateModel.<anonymous> (/home/joplin/packages/server/src/models/TaskStateModel.ts:16:15)
joplin-server  | 08:31:16 0|app    |     at Generator.next (<anonymous>)
joplin-server  | 08:31:16 0|app    |     at /home/joplin/packages/server/dist/models/TaskStateModel.js:8:71 {
joplin-server  | 08:31:16 0|app    |   sql: undefined,
joplin-server  | 08:31:16 0|app    |   bindings: undefined,
joplin-server  | 08:31:16 0|app    |   originalStack: 'KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?\n' +
joplin-server  | 08:31:16 0|app    |     '    at Client_SQLite3.acquireConnection (/home/joplin/packages/server/node_modules/knex/lib/client.js:332:26)\n' +
joplin-server  | 08:31:16 0|app    |     '    at runNextTicks (node:internal/process/task_queues:60:5)\n' +
joplin-server  | 08:31:16 0|app    |     '    at processTimers (node:internal/timers:509:9)\n' +
joplin-server  | 08:31:16 0|app    |     '    at Runner.ensureConnection (/home/joplin/packages/server/node_modules/knex/lib/execution/runner.js:305:28)\n' +
joplin-server  | 08:31:16 0|app    |     '    at Runner.run (/home/joplin/packages/server/node_modules/knex/lib/execution/runner.js:30:19)'
joplin-server  | 08:31:16 0|app    | }

Meanwhlie, the failing task seems to be the same one, because new tasks cannot be scheduled:

joplin-server  | 08:38:07 0|app    | 2024-09-20 08:38:07: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running
joplin-server  | 08:38:18 0|app    | 2024-09-20 08:38:18: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running
joplin-server  | 08:38:28 0|app    | 2024-09-20 08:38:28: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running
joplin-server  | 08:38:38 0|app    | 2024-09-20 08:38:38: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running
joplin-server  | 08:38:48 0|app    | 2024-09-20 08:38:48: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running
joplin-server  | 08:38:58 0|app    | 2024-09-20 08:38:58: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running

jade12 · 20 September 2024 10:03

[Update] No that's not it. I remove the scheduled tasks and the disk IO is still high.
Now it still seems like it's a general issue of docker itself.

@laurent ~~I have a theory now.~~

A task maybe scheduled by setInterval, which accumulates if the callback is too slow.

To avoid overlapping tasks, a new run is canceled if the previous one is still running.

However, the task state is loaded from disk, and maybe worse, it seems that it's loaded twice?.

This disk loading part can be slow, e.g. when the system is under high disk IO. In this case we'll spawn a new disk loader (i.e. database connection) every few seconds.

[Speculated, Not fully verified] Later, even if there is no longer external disk IO, the pending database connections themselves may still produce high disk IO, blocking new connections & scheduled tasks.

This matches with my observations:

joplin server only has high disk read, not write
lsof and strace only shows files about sqlite
This high-disk-io issue only happenes when my other matenance scripts (backup, indexing) are running.

To reproduce the issue, I just start multiple processed that read the disk simutaneouly. (I use rclone md5sum /some/dir myself)

I plan to verify my theory by disabling tasks #11 and #12.

jade12 · 20 September 2024 10:16

The sqlite file itself seems very small. I wonder if it is possible/appropriate to keep it in memory.

jade12 · 20 September 2024 10:19

Another idea is to replace setInterval with setTimeout, where each run schedules the next one at the end.

system · 20 October 2024 10:19

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Joplin Server sync Knex connection timeout errors Support	3	282	5 May 2024
What is KnexTimeoutError:? Lounge	0	45	15 October 2024
Joplin Server Issues with Synology NAS: Postgres Database Connection Problem Support	5	1930	23 May 2022
Joplin server performance comparative Features	15	1870	20 August 2021
Synchronization stopped working on Joplin Server v3.0.1 Support	5	94	27 February 2025

High disk read of joplin-server

Operating system

Joplin version

What issue do you have?

Related topics