PDF printout plugin

mortenhaahr · 23 April 2023 13:29

I am currently working on a plugin that can convert a PDF to an image printout and insert the images into a note.
I have a working setup but it is very much hacked together, as it calls a Python script I've written for everything related to PDF conversion. Naturally, I would like to re-write this as a native Joplin Plugin.

I am currently stuck with this as I cannot get a minimal example with PDF.js to work.

The minimal example would be something like this:

    const pdfjs = require("pdfjs-dist");

    // Inside an async function, e.g. the plugin's onStart handler:
    const doc = await pdfjs.getDocument('/<path-to-pdf>').promise;

The error I am getting is the following:

    ERROR in ./node_modules/pdfjs-dist/build/pdf.js 924:33
    Module parse failed: Unexpected token (924:33)
    You may need an appropriate loader to handle this file type, currently no loaders are configured to process this file. See https://webpack.js.org/concepts#loaders
    |   const httpHeaders = src.httpHeaders || null;
    |   const withCredentials = src.withCredentials === true;
    >   const password = src.password ?? null;
    |   const rangeTransport = src.range instanceof PDFDataRangeTransport ? src.range : null;
    |   const rangeChunkSize = Number.isInteger(src.rangeChunkSize) && src.rangeChunkSize > 0 ? src.rangeChunkSize : DEFAULT_RANGE_CHUNK_SIZE;
     @ ./src/index.ts 15:12-33

It seems that webpack cannot parse the nullish coalescing operator (??) used in pdf.js. From asking ChatGPT and searching online, it seems like I have two options:

1. Change the loader in webpack.config.js to use Babel and include the nullish coalescing plugin
2. Update the target version inside tsconfig.json

I've had success with neither and would very much appreciate some help porting my script.

laurent · 23 April 2023 20:49

Can't you find a version of pdf.js that's compiled as CommonJS? Or maybe build one yourself? Then you can require it as normal, without having to change the build system.
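
A rough sketch of this idea, under assumptions: the 2.x and 3.x releases of pdfjs-dist shipped a pre-transpiled "legacy" build at pdfjs-dist/legacy/build/pdf.js, which should avoid the newer syntax that trips up webpack. Whether that path exists depends on the installed version, so check node_modules first. The helper countPages is hypothetical and takes the pdf.js module as an argument so the import path is easy to swap.

```javascript
// Hypothetical helper: loads a PDF from raw bytes and returns its page count.
// `pdfjs` is whichever pdf.js build was required; `data` is a Uint8Array.
async function countPages(pdfjs, data) {
  // getDocument returns a loading task; its `promise` resolves to the document.
  const doc = await pdfjs.getDocument({ data }).promise;
  return doc.numPages;
}

// Usage inside a plugin (assumption: this legacy path exists in your version):
//   const pdfjs = require("pdfjs-dist/legacy/build/pdf.js");
//   const pages = await countPages(pdfjs, new Uint8Array(fileBuffer));
```

Rendering pages to images would additionally need a canvas to draw on, which is not covered here.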

mortenhaahr · 24 April 2023 05:44

That might be a better approach. I'll give it a shot and let you know of my findings.

Thanks!

SidMan · 6 February 2024 12:06

Did it work?

mortenhaahr · 6 February 2024 12:27

I am afraid not. It ended up being beyond my TypeScript skills so I unfortunately abandoned the attempt.

If anyone wants to pick it up and wants the Python script for inspiration, feel free to message me.

SidMan · 6 February 2024 14:50

Can you share your Python script?

Did you search for other solutions that can convert PDFs to images?

personalizedrefriger · 6 February 2024 16:37

This GitHub issue is related — hopefully it will soon be easier to convert PDFs to images from plugins:

mortenhaahr · 7 February 2024 08:25

The Python script can be found in the GitHub repository mortenhaahr/joplin-pdf-ocr-upload.
It currently also runs OCR on the imported PDF and adds the result as metadata to the images. I would recommend skipping this in an initial version of the plugin.

Furthermore, the Joplin plugin itself is in the GitHub repository mortenhaahr/JoplinPDFPrintout.
It works (tested on Linux) if the user installs it as a dev plugin and successfully installs the Python script.
I am glad to help more if needed.

mortenhaahr · 14 February 2024 09:44

I just learned of the Joplin OCR feature included in v2.14.6 so I guess that part is no longer necessary.
