jade12
19 September 2024 10:41
1
Operating system
Linux
Joplin version
2.14.2
What issue do you have?
I'm running joplin server in a docker. It constantly read file at 70~100MB/s, which can be confirmed by htop
, iotop
and docker stats
.
Currently I only found this error
joplin-server | 09:31:51 0|app | 2024-09-19 09:31:43: [error] TaskService: Failed running task #12 (Process emails) KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full
. Are you missing a .transacting(trx) call?
joplin-server | 09:31:51 0|app | at Object.queryBuilder (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:112:26)
joplin-server | 09:31:51 0|app | at createQueryBuilder (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:320:26)
joplin-server | 09:31:51 0|app | at TaskStateModel.knex [as db_] (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:101:12)
joplin-server | 09:31:51 0|app | at TaskStateModel.<anonymous> (/home/joplin/packages/server/src/models/TaskStateModel.ts:16:15)
joplin-server | 09:31:51 0|app | at Generator.next (<anonymous>)
joplin-server | 09:31:51 0|app | at /home/joplin/packages/server/dist/models/TaskStateModel.js:8:71 {
joplin-server | 09:31:51 0|app | sql: undefined,
joplin-server | 09:31:51 0|app | bindings: undefined,
joplin-server | 09:31:51 0|app | originalStack: 'KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?\n' +
joplin-server | 09:31:51 0|app | ' at Client_SQLite3.acquireConnection (/home/joplin/packages/server/node_modules/knex/lib/client.js:332:26)\n' +
joplin-server | 09:31:51 0|app | ' at runNextTicks (node:internal/process/task_queues:60:5)\n' +
joplin-server | 09:31:51 0|app | ' at processTimers (node:internal/timers:509:9)\n' +
joplin-server | 09:31:51 0|app | ' at Runner.ensureConnection (/home/joplin/packages/server/node_modules/knex/lib/execution/runner.js:305:28)\n' +
joplin-server | 09:31:51 0|app | ' at Runner.run (/home/joplin/packages/server/node_modules/knex/lib/execution/runner.js:30:19)'
joplin-server | 09:31:51 0|app | }
And later I saw this consistently in the log: 2024-09-19 10:30:00: [warn] TaskService: Tried to start #12 (Process emails) but it was already running
Note: I saw a similar error/exception for #11 (Process shared items), but followup runs were successful.
I also verified that there are enough free memory, so this is probaby not about swap.
I'm still investigating, and I'll try v3 later.
Questions:
Is this a known issue?
What is the easist way of disabling task #12 ?
How can I check which file is being read? strace
showed something about sqlite journals, but I think it's for #11
jade12
19 September 2024 10:51
2
About swap, I only verified the system-wide free memory. Perhaps it is possible that joplin server is using all memory within the docker limit? Not sure how I can check that.
jade12
19 September 2024 20:26
3
I investigated further. It seems to be a general problem of docker. I'm closing it for now.
Continuing invetigation.
I observed that task #11 is constantly failing:
joplin-server | 08:31:16 0|app | 2024-09-20 08:31:08: [error] TaskService: Failed running task #11 (Process shared items) KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is proba
bly full. Are you missing a .transacting(trx) call?
joplin-server | 08:31:16 0|app | at Object.queryBuilder (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:112:26)
joplin-server | 08:31:16 0|app | at createQueryBuilder (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:320:26)
joplin-server | 08:31:16 0|app | at TaskStateModel.knex [as db_] (/home/joplin/packages/server/node_modules/knex/lib/knex-builder/make-knex.js:101:12)
joplin-server | 08:31:16 0|app | at TaskStateModel.<anonymous> (/home/joplin/packages/server/src/models/TaskStateModel.ts:16:15)
joplin-server | 08:31:16 0|app | at Generator.next (<anonymous>)
joplin-server | 08:31:16 0|app | at /home/joplin/packages/server/dist/models/TaskStateModel.js:8:71 {
joplin-server | 08:31:16 0|app | sql: undefined,
joplin-server | 08:31:16 0|app | bindings: undefined,
joplin-server | 08:31:16 0|app | originalStack: 'KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?\n' +
joplin-server | 08:31:16 0|app | ' at Client_SQLite3.acquireConnection (/home/joplin/packages/server/node_modules/knex/lib/client.js:332:26)\n' +
joplin-server | 08:31:16 0|app | ' at runNextTicks (node:internal/process/task_queues:60:5)\n' +
joplin-server | 08:31:16 0|app | ' at processTimers (node:internal/timers:509:9)\n' +
joplin-server | 08:31:16 0|app | ' at Runner.ensureConnection (/home/joplin/packages/server/node_modules/knex/lib/execution/runner.js:305:28)\n' +
joplin-server | 08:31:16 0|app | ' at Runner.run (/home/joplin/packages/server/node_modules/knex/lib/execution/runner.js:30:19)'
joplin-server | 08:31:16 0|app | }
Meanwhlie, the failing task seems to be the same one, because new tasks cannot be scheduled:
joplin-server | 08:38:07 0|app | 2024-09-20 08:38:07: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running
joplin-server | 08:38:18 0|app | 2024-09-20 08:38:18: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running
joplin-server | 08:38:28 0|app | 2024-09-20 08:38:28: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running
joplin-server | 08:38:38 0|app | 2024-09-20 08:38:38: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running
joplin-server | 08:38:48 0|app | 2024-09-20 08:38:48: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running
joplin-server | 08:38:58 0|app | 2024-09-20 08:38:58: [warn] TaskService: Tried to start #11 (Process shared items) but it was already running
jade12
20 September 2024 10:03
4
[Update] No that's not it. I remove the scheduled tasks and the disk IO is still high.
Now it still seems like it's a general issue of docker itself.
@laurent I have a theory now.
A task maybe scheduled by setInterval , which accumulates if the callback is too slow.
To avoid overlapping tasks, a new run is canceled if the previous one is still running .
However, the task state is loaded from disk , and maybe worse, it seems that it's loaded twice? .
This disk loading part can be slow, e.g. when the system is under high disk IO. In this case we'll spawn a new disk loader (i.e. database connection) every few seconds.
[Speculated, Not fully verified] Later, even if there is no longer external disk IO, the pending database connections themselves may still produce high disk IO, blocking new connections & scheduled tasks.
This matches with my observations:
joplin server only has high disk read, not write
lsof
and strace
only shows files about sqlite
This high-disk-io issue only happenes when my other matenance scripts (backup, indexing) are running.
To reproduce the issue, I just start multiple processed that read the disk simutaneouly. (I use rclone md5sum /some/dir
myself)
I plan to verify my theory by disabling tasks #11 and #12 .
jade12
20 September 2024 10:16
5
The sqlite file itself seems very small. I wonder if it is possible/appropriate to keep it in memory.
jade12
20 September 2024 10:19
6
Another idea is to replace setInterval
with setTimeout
, where each run schedules the next one at the end.