GSoC 2026 Proposal Draft – Idea 4: Chat with your note collection using AI

Prachig · 31 March 2026 13:12

GSoC 2026 Proposal – Enhancing Editor Reliability, Build Stability, and Data Safety in Joplin

Title

GSoC 2026 Proposal Draft – Idea X: Enhancing Data Integrity, Editor Behavior, and User Safety in Joplin – Prachi

Links

Project Idea:
https://joplinapp.org/gsoc/ideas/

GitHub Profile:
https://github.com/p1961999

LinkedIn Profile:
https://www.linkedin.com/in/prachi-gupta-pg192106

1. Pull Requests & Relevant Work

Joplin Main Repository Contributions

| Contribution | Status | Links |
|---|---|---|
| Internal Image Drag-and-Drop Fix (https://github.com/laurent22/joplin/pull/14774, Resolves #13896) | Merged | https://github.com/laurent22/joplin/pull/14774 |
| Prevent index.ts rewrite on Windows (https://github.com/laurent22/joplin/pull/14945, Resolves #10933) | Closed | https://github.com/laurent22/joplin/pull/14945 |
| JEX Import Warning System (https://github.com/laurent22/joplin/pull/14964, Resolves #14791) | Closed | https://github.com/laurent22/joplin/issues/14791 |

Summary of Work

| Area | Description |
|---|---|
| Editor (TinyMCE) | Implemented correct internal drag-and-drop handling, DOM manipulation, and undo integration |
| Build System | Fixed cross-platform file rewriting issue using normalization and conditional writes |
| User Safety | Added JEX import warning system to prevent accidental data duplication |

Other Experience:

- React, Angular (Standalone Components, RxJS, UI architecture)
- JavaScript/TypeScript development
- Experience debugging cross-platform issues

2. Introduction

I am Prachi, a B.Tech graduate (2020) with 4+ years of professional experience across both frontend and backend development, specializing in JavaScript/TypeScript and modern web architectures. I have been actively contributing to the Joplin open-source project and have worked on real-world issues involving editor behavior, build system stability, and user data safety.

Contact Information

| Field | Details |
|---|---|
| Name | Prachi |
| Email | prachigupta19699@gmail.com |
| GitHub | https://github.com/p1961999 |
| LinkedIn | https://www.linkedin.com/in/prachi-gupta-pg192106 |
| Location | Delhi, India |
| Degree | B.Tech (Graduated 2020) – GGSIPU |

Programming Experience

| Area | Technologies |
|---|---|
| Frontend | Angular, ReactJS, JavaScript, TypeScript, HTML, CSS |
| Backend | Node.js, Express, REST APIs |
| Databases | MongoDB, MySQL |
| Tools & Practices | Git, Debugging, API Integration, Cross-platform fixes |

Professional Experience

| Company | Role | Timeline |
|---|---|---|
| KentCam Technology | Frontend Developer | Dec 2021 – Apr 2024 |
| TripXL Pvt Ltd | Software Engineer (Full Stack Developer) | May 2024 – Present |

Professional Experience – Detailed Work

KentCam Technology — Frontend Developer

Dec 2021 – Apr 2024

At KentCam Technology, I developed a Distributor Management System using React, focused on managing distributor operations and improving visibility into inventory and pricing.

Built a React-based dashboard to manage import/export data across distributors.
Implemented tracking for import volume and inventory flow.
Developed modules to manage and display wholesale pricing data.
Created reusable UI components and optimized state handling.
Integrated APIs for real-time updates and ensured efficient data flow.

This system improved operational transparency and helped stakeholders make data-driven decisions.

TripXL Pvt Ltd — Software Engineer (Full Stack Developer)

May 2024 – Present

At TripXL, I am working on a Flight Booking Customer Portal using React.

Built UI for searching and booking flights.
Developed components for flight listing, selection, and booking flow.
Integrated APIs for real-time pricing and availability.
Implemented validations and optimized user experience.
Worked on frontend-backend integration for smooth data flow.

Open Source Experience

I have been actively contributing to Joplin and have worked on multiple issues including editor improvements, build system fixes, and user safety features. Through these contributions, I have learned how to navigate a large production-level codebase, collaborate via pull requests, and follow open-source standards.

3. Project Summary

Alignment with Joplin GSoC Ideas

My contributions and proposed work align closely with multiple Joplin GSoC focus areas, particularly:

· Improving Editor Stability and UX – through my work on fixing internal drag-and-drop behavior in the Rich Text editor (TinyMCE).

· Improving Development Workflow and Build Reliability – by resolving cross-platform inconsistencies in generated files on Windows.

· Improving Data Safety and User Experience – by implementing safeguards around JEX import behavior to prevent accidental data duplication.

These areas are directly connected to Joplin’s goals of improving reliability, usability, and contributor experience. My proposal builds upon these existing contributions and extends them into a more structured and comprehensive improvement effort.

Problem Statement

Joplin, while being a powerful and widely used note-taking application, currently faces several issues that affect usability, reliability, and user trust:

Editor Behavior Issues: The Rich Text editor (TinyMCE) does not properly differentiate between internal and external drag-and-drop operations. This leads to incorrect handling of image elements, resulting in duplicated content, broken formatting, and unintended HTML insertions.
Cross-Platform Build Inconsistencies: On Windows systems, generated files such as index.ts are rewritten during build processes even when there are no logical changes. This creates unnecessary Git diffs and complicates development workflows.
User Data Safety Risks: Many users misunderstand the behavior of JEX imports, assuming that importing replaces existing notes. Instead, it duplicates data, which can lead to large-scale duplication and synchronization issues across devices.

Importance of the Problem

Joplin is widely used by individuals who rely on it for managing large volumes of personal, academic, and professional notes. As the number of notes grows, organizing them manually becomes increasingly difficult and time-consuming. Users often end up with poorly structured notebooks or inconsistent tagging, which reduces the effectiveness of search and retrieval. This directly impacts productivity, as users spend more time locating information rather than utilizing it.

Another major challenge is that Joplin currently depends heavily on manual categorization. While this approach offers flexibility, it does not scale well for users with hundreds or thousands of notes. Many users either avoid organizing their notes altogether or apply inconsistent tagging strategies, leading to cluttered and inefficient data structures. This highlights the need for an intelligent system that can assist users in organizing their data without taking away control.

Additionally, with the increasing adoption of AI-powered productivity tools, users expect smarter features such as automatic categorization, semantic grouping, and intelligent suggestions. Without such capabilities, Joplin risks falling behind modern user expectations. However, implementing AI features must be done carefully to ensure user trust, data privacy, and non-destructive behavior, which makes this problem both important and technically challenging.

From a maintainer’s perspective, improving note organization reduces user frustration, decreases support requests, and enhances the overall usability of the application. A well-designed solution can also serve as a foundation for future intelligent features such as semantic search, summarization, and recommendation systems.

Proposed Solution

The proposed solution is to design and implement an AI-powered conversational interface for Joplin, allowing users to interact with their note collection using natural language. The system will be built as a modular Joplin plugin and will leverage a Retrieval-Augmented Generation (RAG) pipeline to provide accurate, context-aware responses.

1. Note Ingestion Layer

Extract notes using Joplin Data API
Collect:
- Note content
- Titles
- Metadata (tags, notebooks)
Handle edge cases like empty or large notes
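
As a rough illustration of the ingestion step, the sketch below pages through notes and drops empty ones. It assumes the Data API's paginated response shape (`items` plus `has_more`); `fetchPage` is a hypothetical wrapper that, inside a plugin, could delegate to something like `joplin.data.get(['notes'], { fields, page })`. Tags are not fetched here and would need a separate call per note.

```typescript
// Fields requested for each note; the selection is an assumption for this sketch.
interface NoteRecord {
  id: string;
  title: string;
  body: string;
  parent_id: string; // notebook ID
}

// One page of a paginated Data API response.
interface NotePage {
  items: NoteRecord[];
  has_more: boolean;
}

// Hypothetical page loader, injected so the logic can be tested without Joplin.
type FetchPage = (page: number) => Promise<NotePage>;

async function fetchAllNotes(fetchPage: FetchPage): Promise<NoteRecord[]> {
  const notes: NoteRecord[] = [];
  let page = 1;
  while (true) {
    const result = await fetchPage(page);
    for (const note of result.items) {
      // Edge case: skip notes that have neither a title nor a body.
      if (note.title.trim() !== '' || note.body.trim() !== '') notes.push(note);
    }
    if (!result.has_more) break;
    page += 1;
  }
  return notes;
}
```

Injecting the page loader keeps the pagination and filtering logic independent of the plugin runtime, which should make it straightforward to unit test.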

2. Preprocessing & Chunking

Clean and normalize note content
Split long notes into semantic chunks
Preserve structure (headings, lists, sections)
Maintain mapping between chunks and original notes
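
One possible chunking approach, sketched under the assumption that notes are Markdown and that headings are a reasonable section boundary. The `Chunk` shape and the `maxChars` limit are illustrative choices, not settled design; each chunk keeps the ID of the note it came from.

```typescript
interface Chunk {
  noteId: string;  // back-reference to the original note
  index: number;   // position of the chunk within the note
  heading: string; // nearest preceding Markdown heading, '' if none
  text: string;
}

function chunkNote(noteId: string, body: string, maxChars = 1000): Chunk[] {
  const chunks: Chunk[] = [];
  let heading = '';
  let buffer: string[] = [];

  // Emit the buffered section, splitting it into fixed-size windows if too long.
  const flush = () => {
    const text = buffer.join('\n').trim();
    buffer = [];
    if (text === '') return;
    for (let i = 0; i < text.length; i += maxChars) {
      chunks.push({ noteId, index: chunks.length, heading, text: text.slice(i, i + maxChars) });
    }
  };

  for (const line of body.split(/\r?\n/)) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      flush(); // a new heading closes the previous section
      heading = (match[1] ?? '').trim();
    } else {
      buffer.push(line);
    }
  }
  flush();
  return chunks;
}
```

The fixed-size fallback cuts mid-sentence; a sentence-aware splitter would likely give better retrieval quality and is left open here.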

3. Embedding Generation

Convert note chunks into vector embeddings
Support:
- Local models (offline capability)
- API-based embedding providers
Cache embeddings to avoid recomputation
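
A minimal sketch of the caching idea: wrap any embedding provider (local or API-based) so that identical text is embedded only once. The `Embedder` type is a hypothetical abstraction, and the FNV-1a hash is used only to keep the example dependency-free; a real cache would probably use a stronger content hash and persistent storage.

```typescript
// Hypothetical provider-agnostic embedding function.
type Embedder = (text: string) => Promise<number[]>;

// FNV-1a 32-bit hash, used here only as a cheap cache key.
function hashText(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Returns an embedder that consults the cache before calling the provider.
function withCache(embed: Embedder, cache: Map<string, number[]> = new Map()): Embedder {
  return async (text: string) => {
    const key = hashText(text);
    const hit = cache.get(key);
    if (hit) return hit;
    const vector = await embed(text);
    cache.set(key, vector);
    return vector;
  };
}
```

Keying on content rather than note ID means an edited note is re-embedded automatically, while unchanged chunks are reused.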

4. Vector Storage & Retrieval

Store embeddings in a searchable vector index
Convert user query into embedding
Retrieve top-k relevant note chunks
Rank and filter results for relevance
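
The retrieval step can be illustrated with a brute-force cosine-similarity search. This is a sketch assuming an in-memory index; the `IndexedChunk` shape and `minScore` threshold are illustrative, and a large collection may need an approximate nearest-neighbour index instead.

```typescript
interface IndexedChunk {
  chunkId: string;
  vector: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // Zero vectors have no direction, so treat their similarity as 0.
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Score every chunk, drop weak matches, and keep the k best.
function topK(query: number[], index: IndexedChunk[], k: number, minScore = 0) {
  return index
    .map((c) => ({ chunkId: c.chunkId, score: cosine(query, c.vector) }))
    .filter((r) => r.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```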

5. RAG-Based Answer Generation

Combine:
- User query
- Retrieved note context
Generate answers using LLM
Ensure answers are:
- Context-aware
- Grounded in user notes
Provide source references for transparency
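
To make the grounding step concrete, here is one way the query and retrieved context could be combined into a prompt while tracking which notes were used. The prompt wording and the `RetrievedChunk` shape are assumptions for illustration, not a fixed format.

```typescript
interface RetrievedChunk {
  noteId: string;
  noteTitle: string;
  text: string;
}

function buildPrompt(question: string, chunks: RetrievedChunk[]): { prompt: string; sources: string[] } {
  // Number each excerpt so the model can cite it as [n].
  const context = chunks
    .map((c, i) => `[${i + 1}] ${c.noteTitle}\n${c.text}`)
    .join('\n\n');
  const prompt = [
    'Answer using only the numbered note excerpts below.',
    'Cite excerpts as [n]. If they do not contain the answer, say so.',
    '',
    context,
    '',
    `Question: ${question}`,
  ].join('\n');
  // Unique note IDs, in retrieval order, for the source citation panel.
  const sources = Array.from(new Set(chunks.map((c) => c.noteId)));
  return { prompt, sources };
}
```

Returning the source IDs alongside the prompt lets the UI show citations even if the model omits them in its answer.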

6. Conversational Chat Interface

Add AI Chat panel inside Joplin UI
Features:
- Ask questions about notes
- Follow-up queries
- Multi-turn conversation
Designed similarly to ChatGPT but scoped to user data

7. Source Citation Panel

Display notes used in generating answers
Allow users to:
- View relevant excerpts
- Open original notes

Improves trust and explainability

8. AI Categorization Enhancement (Extension)

Use embeddings for semantic clustering of notes
Apply clustering algorithms (e.g., K-Means)
Generate:
- Suggested tags
- Notebook groupings
Provide suggestions in a review-first UI
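
For reference, a compact K-Means sketch over embedding vectors. It assumes a deterministic initialisation (the first k points) purely to keep the example reproducible; a production version would more likely use k-means++ seeding and a convergence check rather than a fixed iteration count.

```typescript
// Returns, for each point, the index of the cluster it was assigned to.
function kMeans(points: number[][], k: number, iterations = 20): number[] {
  // Assumption for this sketch: seed centroids with the first k points.
  let centroids = points.slice(0, k).map((p) => p.slice());
  let labels: number[] = points.map(() => 0);

  const dist2 = (a: number[], b: number[]): number =>
    a.reduce((sum, x, i) => sum + (x - (b[i] ?? 0)) ** 2, 0);

  for (let it = 0; it < iterations; it++) {
    // Assignment step: attach each point to its nearest centroid.
    labels = points.map((p) => {
      let best = 0;
      centroids.forEach((c, j) => {
        if (dist2(p, c) < dist2(p, centroids[best]!)) best = j;
      });
      return best;
    });
    // Update step: move each centroid to the mean of its members.
    centroids = centroids.map((c, j) => {
      const members = points.filter((_, i) => labels[i] === j);
      if (members.length === 0) return c; // keep empty clusters in place
      return c.map((_, d) => members.reduce((s, m) => s + (m[d] ?? 0), 0) / members.length);
    });
  }
  return labels;
}
```

The resulting labels are only grouping candidates; tag or notebook names would still be generated separately and shown to the user for review.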

9. Performance & Scalability

Batch processing of notes
Efficient embedding caching
Optimized retrieval for large datasets
Designed to handle large knowledge bases

The system introduces a chat-based interaction layer over Joplin’s note collection, enabling users to retrieve and understand information more efficiently than traditional search. By using a RAG pipeline, the system ensures that answers are grounded in actual user notes rather than generic responses.

The inclusion of an AI categorization layer further enhances usability by organizing notes semantically, improving both retrieval accuracy and long-term knowledge management.

A strong emphasis is placed on user control, transparency, and non-destructive behavior, ensuring that AI features augment the user experience without compromising trust.

Expected Outcome

The expected outcome of this project is a robust, scalable, and user-friendly AI-powered system that enables users to interact with their Joplin notes through a conversational interface while improving overall note organization and discoverability.

1. Conversational Access to Notes

Users can ask questions about their notes in natural language
Receive context-aware answers generated from their own data
Support for follow-up queries (multi-turn conversations)

2. Improved Information Retrieval

Replace manual searching with semantic retrieval (RAG-based)
Faster access to relevant information across multiple notes
Ability to retrieve insights even without exact keyword matches

3. Source Transparency & Trust

Display source notes used in generating answers
Allow users to verify and navigate to original content
Ensure explainability and reliability of AI responses

4. Reduced Manual Effort

Minimize the need to manually search and organize notes
Automate discovery of relevant content
Save time for users with large knowledge bases

5. AI-Powered Categorization (Enhancement)

Automatically group notes using semantic similarity
Suggest:
- Tags
- Notebook structures
Improve long-term organization and discoverability

6. User Control & Data Safety

Non-destructive workflow
All AI-generated suggestions require user approval
No automatic modification of notes
Maintain user trust and data integrity

7. Scalable & Efficient System

Designed to handle large note collections
Optimized embedding, retrieval, and indexing pipeline
Efficient performance across different system environments

8. Cross-Platform Compatibility

Works seamlessly with Joplin’s existing architecture
Compatible across desktop environments (Windows, macOS, Linux)

9. Extensible Architecture

Modular plugin design
Enables future enhancements such as:
- Semantic search
- Note summarization
- Recommendation systems
- AI-assisted workflows

10. Maintainer-Friendly Design

Minimal impact on core Joplin codebase
Clean separation of concerns
Easy to maintain, extend, and integrate

This project transforms Joplin from a traditional note-taking application into an intelligent knowledge system. By introducing a conversational interface powered by a RAG pipeline, users can interact with their notes in a more natural and efficient way.

At the same time, the addition of AI-driven categorization enhances how notes are structured and discovered over time. The system ensures a balance between automation and user control by maintaining a review-first, non-destructive workflow.

Overall, the project significantly improves usability, productivity, and scalability while aligning with Joplin’s open-source principles and long-term vision for intelligent note management.

4. Technical Approach

System Workflow (High-Level)

Explanation: The system extracts notes using the Joplin Data API, converts them into embeddings, and groups them using clustering algorithms. Suggested categories are generated via summarization and presented to the user for approval before applying any changes, ensuring a non-destructive workflow.

Component Architecture

[Figure: embedding-phase diagram]

Implementation Details

4.1 Architecture Overview

Joplin is built using:
- Electron for the desktop application
- React for UI
- TinyMCE for Rich Text editing
- Node.js-based tooling for build processes

The application follows a modular architecture where UI, services, and utilities are clearly separated.

Relevance of My Previous Contributions

My prior contributions directly support this project:

· PR #14774 (Editor Drag-and-Drop): Deep understanding of TinyMCE, DOM handling, and note structure, useful for extracting clean note content for AI processing.

· PR #14945 (Build Stability): Experience with cross-platform consistency and tooling, important for handling AI dependencies and reproducible behavior.

· Issue #14791 (Import Warning): Focus on user safety, reflected here via a review-first, non-destructive AI workflow.

Validation Strategy (Early Phase)

· Validate embedding quality (semantic similarity accuracy)

· Benchmark clustering performance on real notes

· Evaluate memory and runtime constraints

4.2 Editor Behavior Improvements

The technical approach for the editor-related work is to make drag-and-drop handling more context-aware instead of relying entirely on the default TinyMCE behavior. At present, internal image movement inside the Rich Text editor can be interpreted like a generic HTML or file drop, which causes incorrect insertion of content. To address this, I will intercept the relevant drag-and-drop events, identify whether the dragged content is an image already present within the editor, and then handle that case separately.
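
The decision at the heart of this approach can be sketched as a small pure function. The `DragInfo` fields and the `classifyDrop` name are hypothetical; in practice they would be populated from the editor's `dragstart` and `drop` event handlers, which are not shown here.

```typescript
type DropKind = 'internal-image-move' | 'external-drop';

// Hypothetical summary of a drag, captured when the drag starts.
interface DragInfo {
  startedInsideEditor: boolean;  // drag began on a node within the editor body
  draggedTagName: string | null; // tag of the dragged node, null for external drags
}

function classifyDrop(info: DragInfo): DropKind {
  const isImage = (info.draggedTagName ?? '').toUpperCase() === 'IMG';
  // Only an existing in-editor image is moved as a node; everything else
  // falls through to the default insertion path.
  return info.startedInsideEditor && isImage ? 'internal-image-move' : 'external-drop';
}
```

Keeping the classification separate from the DOM manipulation should make the edge cases (nested elements, mixed formatting) easier to cover with unit tests.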

4.3 Feature Design & User Interaction Flow

The following table describes the proposed features, user interactions, and UI enhancements compared to the current Joplin interface. This helps visualize how the system improves usability while maintaining consistency with the existing layout.

4.3.1 AI Chat with Notes

Title

User Story Description (role, goal, motivation)

List of tasks needed to achieve the goal (User Journey)

Links to mocks / prototype

AI Chat Integration

As a Joplin user, I need an AI-powered chat interface so that I can ask questions about my notes and quickly retrieve insights without manually searching through multiple notes

1. Open a note in Joplin

![|210x106](file:///C:/Users/LENOVO/AppData/Local/Temp/msohtmlclip1/01/clip_image004.png)

2. Switch to AI Chat tab beside editor

3. Enter a question

4. System retrieves relevant note chunks

5. AI generates contextual answer

6. User can ask follow-up questions

4.3.2 Source Citation Panel

Title

User Story Description

List of tasks

Links to mocks

Source Transparency

As a user, I want to see which notes were used in the AI response so that I can verify and trust the answer

1. Ask question in chat.

![|196x127](file:///C:/Users/LENOVO/AppData/Local/Temp/msohtmlclip1/01/clip_image006.png)

2. View AI response chat

3. Open Sources panel

4. Click note reference

5. Navigate to original note

4.4 Build System Improvements

For the build tooling issue, the technical approach is to make generated file writing deterministic across operating systems. The root cause of the problem is not a logical content change, but inconsistent line endings and unconditional file writes. To solve this, I will update the utility responsible for inserting generated content into files so that it first normalizes line endings, then compares the normalized new content with the existing file content, and only writes the file if there is a real change.

This approach keeps generated files stable, avoids unnecessary rewrites on Windows, and prevents contributors from seeing false file modifications after running setup or build commands. The change is intentionally minimal and localized so that it improves reliability without affecting the broader build pipeline.
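
The normalize-compare-write sequence described above could look roughly like this. The `FileIo` interface is a hypothetical stand-in for the file system so the logic stays testable; it is not the actual Joplin utility.

```typescript
// Convert CRLF and lone CR line endings to LF.
const normalizeEol = (s: string): string => s.replace(/\r\n?/g, '\n');

// Hypothetical file-system abstraction; read returns null if the file is missing.
interface FileIo {
  read(path: string): string | null;
  write(path: string, content: string): void;
}

// Writes only when the normalized content differs. Returns true if a write happened.
function writeIfChanged(io: FileIo, path: string, next: string): boolean {
  const existing = io.read(path);
  const normalizedNext = normalizeEol(next);
  if (existing !== null && normalizeEol(existing) === normalizedNext) return false;
  io.write(path, normalizedNext);
  return true;
}
```

Because both sides are normalized before comparison, a file checked out with CRLF endings on Windows is not treated as changed when the generator emits LF.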

4.5 Import Warning System

For the import-related safety issue, the technical approach is centered on improving user guidance at the point where mistakes usually happen. Before the JEX import flow proceeds, the application will display a warning dialog explaining that importing is intended for adding or restoring notes and that syncing imported notes with an already-synced profile may create duplicates. This warning should appear before the file picker opens so that users understand the consequence before continuing.

To avoid making the experience repetitive, I will store the visibility state of this prompt in a private per-profile setting so that it is shown only the first time unless future product decisions require otherwise. In addition to the desktop prompt, I will add the same warning message to the relevant mobile configuration screen and update the welcome note documentation so that the guidance is visible both during action and in reference material. This ensures the fix is not only technical but also educational for users.
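
The show-once flow can be sketched as follows. The setting key, the `SettingsStore` interface, and both callbacks are hypothetical placeholders for the real setting and dialog code; the sketch also assumes the flag is stored only after the user acknowledges the warning, which is a design choice still open to discussion.

```typescript
// Hypothetical per-profile settings abstraction.
interface SettingsStore {
  get(key: string): boolean | undefined;
  set(key: string, value: boolean): void;
}

// Hypothetical key name, for illustration only.
const JEX_WARNING_SHOWN_KEY = 'importWarning.jexShown';

// Returns true if the import flow proceeded to the file picker.
async function importWithWarning(
  settings: SettingsStore,
  confirmWarning: () => Promise<boolean>,
  openFilePicker: () => Promise<void>,
): Promise<boolean> {
  if (!settings.get(JEX_WARNING_SHOWN_KEY)) {
    // The warning is shown before the file picker opens.
    const accepted = await confirmWarning();
    if (!accepted) return false;
    settings.set(JEX_WARNING_SHOWN_KEY, true);
  }
  await openFilePicker();
  return true;
}
```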

4.6 Comparative Analysis: Evolution Beyond Current LLM Baselines (Jarvis Case Study)

In my own joplin-plugin-ai-chat-on-notes plugin, I built a multi-provider abstraction layer, a pattern I am applying here.

Jarvis operates on the currently open note with no batch embedding, no persistent vector index, and no clustering. This proposal builds that missing layer: embed every note, reduce dimensions with UMAP, discover semantic groupings through clustering, and surface them as actionable tag and notebook suggestions.

4.7 Challenges

· Handling TinyMCE behavior safely

· Maintaining backward compatibility

· Avoiding intrusive UX

4.8 Testing Strategy

The testing strategy will combine manual verification, regression testing, and targeted validation of edge cases. For the editor work, I will test different drag-and-drop scenarios such as moving images between paragraphs, headings, and other formatted content, while ensuring that no duplicate nodes or unintended wrappers are introduced. I will also verify that undo and redo continue to work correctly after the manual handling logic is added.

For the build-related changes, I will validate behavior on Windows and compare results with non-Windows environments to ensure generated files are not rewritten unnecessarily. For the import warning system, I will test first-time visibility, persistence of the setting, and the correctness of the warning flow across desktop and mobile UI contexts.

1. Unit Testing

I will add unit tests for:

note preprocessing utilities
chunking logic
prompt-building logic
ranking / filtering helpers
categorization helpers

2. Integration Testing

I will test:

note ingestion from Joplin APIs
embedding generation flow
vector index creation and updates
retrieval pipeline from question to context selection
UI state transitions in the chat interface

3. End-to-End Testing

I will validate full user journeys such as:

ingesting notes
asking a question
retrieving relevant chunks
generating a grounded answer
continuing with a follow-up question
reviewing categorization suggestions

4. Retrieval Quality Testing

To test whether retrieval is working well, I will prepare controlled note sets and evaluate:

whether top retrieved chunks actually contain the answer
whether semantically related notes are surfaced for paraphrased questions
whether duplicate/noisy chunks are filtered out effectively

Possible metrics and checks:

top-k relevance checks
manual precision review on sample question sets
comparison of retrieval results for exact-match vs semantic queries
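
As one concrete form of the top-k relevance check, the sketch below computes a hit rate over a labeled question set: the fraction of questions whose known supporting note appears among the first k retrieved notes. The `EvalCase` shape is an assumption about how such a benchmark set might be stored.

```typescript
interface EvalCase {
  question: string;
  expectedNoteId: string;     // note known to contain the answer
  retrievedNoteIds: string[]; // retrieval output, best match first
}

// Fraction of cases where the expected note is within the top k results.
function hitRateAtK(cases: EvalCase[], k: number): number {
  if (cases.length === 0) return 0;
  const hits = cases.filter(
    (c) => c.retrievedNoteIds.slice(0, k).indexOf(c.expectedNoteId) !== -1,
  ).length;
  return hits / cases.length;
}
```

Running the same question set with exact-match and paraphrased wording and comparing the two hit rates would give a simple view of how much semantic retrieval adds.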

5. Answer Grounding Testing

For the generation step, I will test:

whether the answer is based on retrieved context
whether unsupported claims appear when context is insufficient
whether the UI can show source notes to support the answer

This can be validated by creating benchmark question sets where the expected supporting note is already known.

6. Categorization Testing

For the AI categorization layer, I will test:

cluster coherence on curated note groups
quality of suggested labels
usefulness of recommendations from a user perspective
stability of clustering as note count increases

Validation approach:

small manually labeled note collections
representative-note inspection per cluster
measuring whether grouped notes are actually topically related

7. Performance and Scalability Testing

I will test behavior on:

small note collections
medium collections
large clipped or long-form note collections

This includes monitoring:

ingestion time
embedding generation time
retrieval latency
response latency in chat
memory overhead for indexing and clustering

4.9 Documentation Plan

The documentation for this project will be developed in parallel with the implementation, following a structured and incremental approach aligned with the weekly development phases. During the initial weeks (Week 1–4), the focus will be on documenting the overall system design, including the AI pipeline, data flow, and architectural decisions. This will include diagrams such as the embedding pipeline, clustering workflow, and plugin integration within Joplin. Early documentation will help validate design decisions with mentors and ensure clarity before full-scale implementation begins.

During the core development phase (Week 5–8), documentation will expand to include detailed explanations of each module, such as note extraction, pre-processing, embedding generation, clustering logic, and LLM-based labelling. Each component will be documented with clear descriptions of responsibilities, data flow, and integration points within the Joplin plugin system. Inline code documentation will also be maintained consistently to ensure that the codebase remains readable and maintainable for future contributors.

In the later stages (Week 9–11), the focus will shift toward user-facing documentation and usability guidance. This will include instructions on how to use the AI categorization feature, how to interpret suggestions, and how to safely apply or reject changes. Special attention will be given to explaining the non-destructive workflow and user control mechanisms, ensuring that users clearly understand how their data is handled. Additionally, troubleshooting steps and known limitations will be documented to improve user confidence and reduce support overhead.

Finally, during the last phase (Week 12), all documentation will be consolidated, reviewed, and refined based on mentor feedback. This includes ensuring consistency across developer and user documentation, improving clarity, and aligning with Joplin’s official documentation standards. The final deliverable will include comprehensive developer documentation, user guides, and well-commented code, making the feature easy to maintain, extend, and adopt within the Joplin ecosystem.

5. Implementation Plan

The implementation will be carried out in a phased and iterative manner across the GSoC coding period. The goal is to reduce technical risk early, keep mentor feedback integrated throughout development, and ensure that each phase produces a stable and reviewable output. Since this project involves note processing, embeddings, clustering, labeling, and a user-facing review interface, the work will be divided into logical milestones so that core functionality is validated before moving to optimization and polish.

Weekly Timeline

Week 1–2 (Community Bonding & Design): - During the initial phase, I will focus on finalizing the technical design of the plugin in consultation with mentors and the community. This includes a deeper study of Joplin’s plugin architecture, note data APIs, React-based UI integration, and any constraints around storage, performance, and user interaction. I will review existing plugin patterns in Joplin and prepare the project structure for modular development.

In parallel, I will validate the design assumptions for the AI pipeline. This includes deciding the initial preprocessing strategy, comparing local embedding options versus API-based embedding generation, and defining a first dataset strategy for testing on sample notes. By the end of this phase, I aim to have a finalized architecture document, flow diagrams, module boundaries, and a clear technical roadmap approved by mentors.

Week 3 (Note Extraction and Data Preparation): - In this phase, I will begin implementation of the data ingestion layer. The primary task will be to fetch note content through Joplin’s Data API and prepare it for downstream processing. This includes selecting which note fields to use, handling metadata, and identifying edge cases such as empty notes, duplicate notes, notes with only titles, and large note bodies.

I will also implement pre-processing routines such as text cleaning, normalization, and chunk preparation where required. The objective of this week is to create a reliable input pipeline so that note data can be consistently transformed into a format suitable for embedding generation. At the end of this phase, I expect to have a working extraction pipeline that can process notes from a Joplin profile and output structured intermediate data for the embedding stage.

Week 4 (Embedding Integration and Initial Validation):- Once note extraction is stable, I will integrate the embedding generation layer. The focus here will be to convert note content into meaningful vector representations. I will begin with a modular design so that the system can support both local models and API-based providers where feasible. This week will also involve benchmarking different embedding approaches for quality, performance, and developer usability.

To reduce future risk, I will perform early validation on a controlled set of notes and compare whether semantically similar notes produce meaningful closeness in embedding space. This step is critical because the quality of clustering depends heavily on the quality of embeddings. By the end of the week, I expect to have a repeatable embedding pipeline with test outputs and performance observations documented.

Week 5–6 (Clustering Engine and Group Formation): This phase will focus on implementing the clustering layer that groups related notes based on their embeddings. I plan to start with K-Means clustering because it is relatively simple, interpretable, and suitable for initial experimentation. During this stage, I will test different values of K and analyze how cluster quality changes across note collections of different sizes and themes.

In addition to implementing the clustering logic, I will add validation utilities to inspect the coherence of each cluster. This may include selecting representative notes near centroids and examining whether groupings are practically meaningful. The expected result of this stage is a working grouping system that can cluster notes into candidate categories with reasonable semantic consistency.

Week 7 (Cluster Labeling and Suggestion Generation): - After cluster formation is working, the next step will be to generate human-readable suggestions from those clusters. This includes selecting representative notes from each group and using an LLM-based labeling strategy to generate suggested tag names or notebook names. The focus here is not only technical correctness but also usability: the generated category names should be understandable, concise, and relevant to the note groups they represent.

This stage will also include safeguards to avoid low-quality or overly generic labels. I will evaluate prompt structure, representative-note selection, and output formatting so that the suggestions remain useful and consistent. By the end of this week, the system should be able to produce suggested categories for clustered notes in a way that is ready to be surfaced in the UI.

Week 8-9 (User Review Panel and Interaction Flow): - Once categorization suggestions are generated successfully, I will implement the user-facing review panel inside the Joplin plugin interface. This panel will allow users to inspect generated clusters, review proposed labels, and selectively approve, reject, or modify the suggestions before anything is applied. This is one of the most important phases of the project because the system must remain non-destructive and user-controlled.

The UI will be designed to clearly show what changes are being suggested and how those changes will affect notes. I will focus on clarity, usability, and responsiveness. At the end of this phase, the plugin should support the complete interaction cycle: extract notes, process them, generate groups and labels, and present them to the user in a review-first workflow.

Week 10 (Applying Approved Changes and End-to-End Integration): -

In this phase, I will connect the review UI with the final execution layer that applies user-approved tags or notebook assignments through Joplin APIs. The emphasis will be on making this step safe, predictable, and easy to audit. I will also ensure that rejected suggestions are ignored cleanly and that partially approved batches can still be applied without breaking the workflow.

This week will also serve as the first full end-to-end integration milestone. By this point, all core components should work together as one complete system. I will test the end-to-end pipeline on realistic note collections and fix any issues related to state management, UI coordination, or data application logic.

Week 11 (Optimization, Edge Cases, and Stability Improvements): - With the core functionality complete, I will dedicate this phase to improving performance and robustness. This includes optimizing processing for larger note collections, reducing unnecessary recomputation, improving batching, and refining storage or caching where useful. I will also address important edge cases such as multilingual notes, very short notes, noisy content, or highly overlapping note topics.

This phase is also where I will strengthen the overall stability of the plugin by refining internal error handling, improving fallback behavior, and reducing confusing outputs. The objective is to ensure that the feature is not only functional but also reliable enough for practical use by Joplin users.

Week 12 (Testing, Documentation, and Final Submission): - The final phase will be dedicated to comprehensive testing, documentation, and final clean up. I will conduct end-to-end testing across multiple scenarios, verify the accuracy and usability of the generated suggestions, and ensure that the plugin behaves correctly across supported environments. Any remaining feedback from mentors will be incorporated during this phase.

I will also finalize both developer-facing and user-facing documentation, including architecture notes, setup instructions, usage guidance, limitations, and future extension points. The final deliverables for this week will include cleaned-up code, updated documentation, finalized test coverage where practical, and submission-ready pull requests.

The implementation will be carried out in a structured and iterative manner over the GSoC coding period, ensuring continuous feedback and stability.

6. Deliverables

At the end of the GSoC period, the following deliverables will be implemented as working, tested, and documented outputs. Required items represent the minimum successful outcome, while optional items may be completed based on available time and progress.

Editor Improvements

| Deliverable | Description | Type |
| --- | --- | --- |
| Internal drag-and-drop handling | Correct detection of internal image drag operations in TinyMCE editor | Required |
| Controlled DOM movement | Move only the intended image node instead of inserting HTML fragments | Required |
| Cleanup mechanism | Remove empty or redundant parent elements after drag operations | Required |
| Undo/Redo integration | Integrate with TinyMCE undo manager to preserve editing workflow | Required |
| Edge case handling | Support nested elements, multiple images, and mixed formatting scenarios | Required |

Build System Stability

| Deliverable | Description | Type |
| --- | --- | --- |
| Line ending normalization | Normalize line endings across platforms before file comparison | Required |
| Conditional file writing | Prevent rewriting files when content has not changed | Required |
| UTF-8 consistency | Ensure consistent encoding during file writes | Required |
| Cross-platform validation | Verify behavior across Windows, Linux, and macOS environments | Required |
| Developer workflow improvement | Eliminate unnecessary Git diffs caused by generated files | Required |
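The first two build-system deliverables (line ending normalization and conditional writes) could look roughly like the sketch below. The function names and the injected `read` and `write` callbacks are assumptions made for illustration; the sketch is not the code from the pull request. It also assumes that generated files are always written with LF endings.

```typescript
// Convert CRLF and lone CR to LF so Windows and Unix output compare equal.
function normalizeEol(text: string): string {
  return text.replace(/\r\n?/g, '\n');
}

// Returns true when the file was (re)written. `read` and `write` are injected
// so the sketch stays independent of any particular file-system module.
function writeIfChanged(
  path: string,
  generated: string,
  read: (path: string) => string | null, // null means the file does not exist
  write: (path: string, content: string) => void,
): boolean {
  const next = normalizeEol(generated);
  const current = read(path);
  if (current !== null && normalizeEol(current) === next) {
    return false; // only line endings differ (or nothing does): leave the file alone
  }
  write(path, next);
  return true;
}
```

In a Node.js build script, `read` and `write` could wrap `fs.readFileSync` and `fs.writeFileSync` with an explicit `'utf8'` encoding, which would also cover the UTF-8 consistency item.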

Import Warning System (User Safety)

| Deliverable | Description | Type |
| --- | --- | --- |
| Pre-import warning dialog | Display warning before JEX import explaining duplication risks | Required |
| Persistent setting | Show warning only once per profile using internal configuration | Required |
| Desktop integration | Integrate warning into File → Import workflow before file selection | Required |
| Mobile UI update | Add warning message to mobile configuration screen | Required |
| Documentation update | Update welcome note and help documentation for clarity | Required |
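The "show once per profile" behavior can be sketched as a small gate in front of the file picker. The setting key, the `SettingsStore` interface, and the `showWarning` callback below are hypothetical; the sketch also assumes that the flag is persisted only when the user confirms, so cancelling shows the warning again next time.

```typescript
// Hypothetical setting key; the real name would follow Joplin's settings conventions.
const WARNING_SHOWN_KEY = 'importWarning.jexShown';

interface SettingsStore {
  get(key: string): boolean | undefined;
  set(key: string, value: boolean): void;
}

// Returns true when the import may proceed to file selection.
// `showWarning` displays the dialog and returns true if the user confirms.
function confirmJexImport(settings: SettingsStore, showWarning: () => boolean): boolean {
  if (settings.get(WARNING_SHOWN_KEY)) {
    return true; // already acknowledged in this profile
  }
  const confirmed = showWarning();
  if (confirmed) {
    settings.set(WARNING_SHOWN_KEY, true);
  }
  return confirmed;
}
```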

AI Categorization System (Extended Contribution)

| Deliverable | Description | Type |
| --- | --- | --- |
| LLM-based labeling | Generate tag and notebook suggestions from clusters | Required |
| Clustering engine | Group notes using K-Means clustering algorithm | Required |
| Text preprocessing module | Clean, normalize, and prepare note content for embeddings | Required |
| Mobile UI update | Add warning message to mobile configuration screen | Required |
| Review panel UI | Allow users to review, accept, or reject suggestions | Required |
| Caching & optimization | Improve performance for large note collections | Required |
| Non-destructive workflow | Ensure no automatic changes without user approval | Required |
| Note extraction pipeline | Fetch notes via Joplin Data API with batching support | Required |
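For the clustering engine, a minimal K-Means sketch over embedding vectors might look like this. It is an illustration only: it uses squared Euclidean distance and seeds the centroids from the first `k` points so the result is deterministic, whereas a production version would more likely use k-means++ seeding or random restarts. All names are illustrative.

```typescript
type Vector = number[];

function squaredDistance(a: Vector, b: Vector): number {
  return a.reduce((sum, value, i) => sum + (value - (b[i] ?? 0)) ** 2, 0);
}

function nearestCentroid(point: Vector, centroids: Vector[]): number {
  let best = 0;
  let bestDistance = Infinity;
  centroids.forEach((centroid, index) => {
    const distance = squaredDistance(point, centroid);
    if (distance < bestDistance) {
      bestDistance = distance;
      best = index;
    }
  });
  return best;
}

// Returns one cluster index per input vector.
function kMeans(points: Vector[], k: number, maxIterations = 50): number[] {
  if (k < 1 || k > points.length) {
    throw new Error('k must be between 1 and points.length');
  }
  // Deterministic seeding: copy the first k points as initial centroids.
  let centroids = points.slice(0, k).map(p => [...p]);
  let labels: number[] = [];
  for (let iteration = 0; iteration < maxIterations; iteration++) {
    // Assignment step: each point joins its nearest centroid.
    const nextLabels = points.map(p => nearestCentroid(p, centroids));
    const unchanged = nextLabels.every((label, i) => label === labels[i]);
    labels = nextLabels;
    if (unchanged) break; // assignments are stable: converged
    // Update step: move each centroid to the mean of its members.
    centroids = centroids.map((old, c) => {
      const members = points.filter((_, i) => labels[i] === c);
      if (members.length === 0) return old; // keep the centroid of an empty cluster
      return old.map((_, dim) =>
        members.reduce((sum, m) => sum + (m[dim] ?? 0), 0) / members.length);
    });
  }
  return labels;
}
```

Each resulting cluster index would then be handed to the labeling step, which proposes a tag or notebook name for the notes in that group.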

Testing & Validation

| Deliverable | Description | Type |
| --- | --- | --- |
| Manual test coverage | Validate editor, build, and AI workflows across scenarios | Required |
| Regression testing | Ensure no existing functionality is broken | Required |
| Cross-platform testing | Validate behavior on Windows and non-Windows systems | Required |
| Edge case validation | Test extreme cases like large notes and nested content | Required |
| Automated tests | Add unit/integration tests where applicable | Required |

Documentation & Final Output

| Deliverable | Description | Type |
| --- | --- | --- |
| User documentation | Guide for new features and workflows | Required |
| Developer documentation | Architecture and implementation details | Required |
| Demo video | Demonstration of implemented features | Required |
| Inline code documentation | Maintain readability and future maintainability | Required |
| Final PR submissions | Clean, review-ready pull requests | Required |

7. Availability

I am fully available for the entire GSoC 2026 coding period and will treat this program as a full-time commitment. I do not have any competing employment, internship, or academic obligations during this period. My focus will be entirely on delivering high-quality contributions to Joplin while actively engaging with mentors and the community.

I strongly believe in maintaining transparent and consistent communication throughout the development cycle. If I encounter any technical blockers or uncertainties, I will raise them immediately on the Joplin forum or Discord to ensure timely resolution. Additionally, I will maintain regular progress updates so that mentors can continuously track development and provide feedback at every stage of the project.

To ensure structured collaboration, I will follow a disciplined communication and reporting approach. This includes weekly progress reports, early pull request submissions for incremental feedback, and regular mentor interactions. My goal is to keep the development process iterative, visible, and aligned with mentor expectations.

Availability Details

| Item | Details |
| --- | --- |
| Weekly availability | 30-32 hours per week during the coding period |
| Time zone | IST (UTC+5:30, India) |
| Work management approach | Break down tasks into milestones, follow structured planning, and ensure consistent weekly progress based on my 4+ years of industry experience |
| Experience handling workload | Experienced in managing full-time development workloads, deadlines, and iterative delivery in professional environments |

Communication & Collaboration

| Item | Details |
| --- | --- |
| Communication style | Weekly progress reports on Joplin forum, early draft PR submissions for feedback |
| Responsiveness | Same-day responses for async communication |
| Blocker handling | Blockers will be raised within 24 hours via forum/Discord |
| Development approach | Iterative development with continuous mentor feedback and improvements |

Open Source Motivation & Impact

| Item | Details |
| --- | --- |
| Motivation for open source | Contributing to real-world software used by a global community while learning from experienced developers |
| Prior experience | Active contributor to Joplin with experience in editor, build system, and user safety improvements |
| Long-term commitment | Intend to continue contributing to Joplin beyond GSoC |
| Community impact | Aim to contribute meaningful features that improve reliability and user experience |
| Diversity & inclusion goal | Motivated to encourage more girls to participate in open source, as I have observed lower representation in my professional experience |
| Personal initiative | Will actively motivate and guide other girls to explore open source and build confidence in contributing |

Why I am a Strong Candidate

I believe I am a strong candidate for this project due to a combination of relevant experience, proven contributions, and long-term commitment to Joplin.

Proven Contributions to Joplin: I have already contributed meaningful fixes to the Joplin codebase, including improvements in the Rich Text editor, build system stability, and user data safety. These contributions demonstrate my ability to understand complex issues, work within the existing architecture, and deliver practical solutions.
Strong Technical Background: With 4+ years of experience in JavaScript and Angular, I have developed a solid understanding of frontend architecture, event handling, and cross-platform challenges. This directly aligns with Joplin’s Electron and React-based architecture.
Deep Understanding of the Codebase: Through my contributions, I have gained familiarity with Joplin’s internal structure, including editor integration (TinyMCE), build tools, and UI workflows. This reduces ramp-up time and allows me to focus on delivering results early in the program.
Consistency and Dedication: I started contributing before GSoC and plan to continue contributing after the program. I am committed to maintaining the features I build and supporting future improvements.
Clear Communication and Iteration: I actively engage in discussions, respond to feedback, and refine my solutions. I understand the importance of collaboration in open source and will work closely with mentors throughout the project.
Full-Time Commitment During GSoC: I will dedicate 40–45 hours per week during the program, ensuring steady progress and timely completion of milestones.

Overall, I bring a balance of technical capability, practical contribution experience, and long-term commitment, which makes me well-suited to successfully complete this project and continue contributing to Joplin beyond GSoC.

