"Row too big to fit into CursorWindow" errors when syncing from Joplin app on Android

Actually. Hmm.

Since I can't introspect my sync target (Joplin Cloud) directly, I exported my notes in both the "Markdown + Front Matter" and "RAW - Joplin Export Directory" formats.

I then sorted by size.
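For anyone who wants to repeat the size check, here is a minimal sketch of one way to do it. The function name `largest_files` and the example path in the comment are hypothetical; any file manager or `ls`-style listing sorted by size gives the same information.

```python
from pathlib import Path


def largest_files(root, suffixes=(".md", ".html"), top=5):
    """Return the `top` largest matching files under `root` as (bytes, path) pairs."""
    files = [
        (p.stat().st_size, p)
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    ]
    # Largest first.
    return sorted(files, key=lambda pair: pair[0], reverse=True)[:top]


# Example usage (the path is a placeholder for your own export directory):
# for size, path in largest_files("/path/to/joplin-export"):
#     print(f"{size / 1_000_000:8.2f} MB  {path}")
```

Running it once per export directory makes it easy to compare the two listings side by side.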

In the MD directory, the largest .md or .html files were 3.2MB (md), 1.5MB (html) and 246KB (md).

In the RAW directory though the largest .md files were 3.2MB, 2.5MB, 1.5MB, and 246KB.

Why the discrepancy? The extra 2.5MB file holds the OCR rendering of a PDF. Here are the first lines of that file:

[redacted-filename].pdf

id: 6fdb241ef3b94e928b5638e7bdab9570
mime: application/pdf
filename: 
created_time: 2024-11-06T14:33:55.538Z
updated_time: 2024-11-06T16:32:31.035Z
user_created_time: 2024-11-06T14:33:55.538Z
user_updated_time: 2024-11-06T16:32:31.035Z
file_extension: pdf
encryption_cipher_text: 
encryption_applied: 0
encryption_blob_encrypted: 0
size: 7108096
is_shared: 0
share_id: 
master_key_id: 
user_data: 
blob_updated_time: 1730903635538
ocr_text: [and then all the OCR text]

Could the OCR text be a source of unexpected behavior? None of these markdown files is spectacularly large, but I can see internally referenced markdown blobs exploding in size if the text of PDFs is being saved this way. (And of course, web clippings can get big.) Is this markdown being stored in the database as a data blob? Can the database handle an arbitrarily large markdown file? Is the DB on a phone more fragile than the DB on the desktop? (I suspect it is.) Is the OCR text being synced?
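One way to test this hunch against the RAW export is to measure how many bytes each metadata field takes up and flag anything above Android's CursorWindow size. The sketch below rests on several assumptions: that the default CursorWindow is about 2 MB (my understanding of the Android default, not something confirmed for Joplin's build), that each field in the RAW file starts with a lowercase `key:` prefix as in the listing above, and that any line without such a prefix continues the previous field. The names `field_sizes` and `oversized_fields` are made up for illustration, and nothing here shows that Joplin actually keeps `ocr_text` in the row that triggers the error.

```python
import re

# Assumption: Android's default CursorWindow is about 2 MB.
CURSOR_WINDOW_BYTES = 2 * 1024 * 1024

# Assumption: fields look like "key: value", with lowercase/underscore keys.
FIELD_RE = re.compile(r"^([a-z_]+):(?: (.*))?$")


def field_sizes(raw_text):
    """Map each metadata key to the UTF-8 byte size of its value."""
    sizes = {}
    current = None
    for line in raw_text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            current = match.group(1)
            sizes[current] = len((match.group(2) or "").encode("utf-8"))
        elif current is not None:
            # Treat as a continuation of the previous field (+1 for the newline).
            sizes[current] += len(line.encode("utf-8")) + 1
    return sizes


def oversized_fields(raw_text, limit=CURSOR_WINDOW_BYTES):
    """Return the keys whose values are larger than `limit` bytes."""
    return [key for key, size in field_sizes(raw_text).items() if size > limit]
```

If `oversized_fields` reports `ocr_text` for the 2.5MB file, that would at least be consistent with a single row being too large for the phone to read in one go, though it would not prove it.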

I am in the process of (a) turning off OCR and (b) temporarily deleting these resources. I then need to wait a day (I think?) for the revision history to clear out, after which I will try again to see whether that resolves the issue.