GSoC 2026 Proposal – Enhancing Editor Reliability, Build Stability, and Data Safety in Joplin
Title
GSoC 2026 Proposal Draft – Idea X: Enhancing Data Integrity, Editor Behavior, and User Safety in Joplin – Prachi
Links
Project Idea:
https://joplinapp.org/gsoc/ideas/
GitHub Profile:
LinkedIn Profile:
https://www.linkedin.com/in/prachi-gupta-pg192106
Joplin Main Repository Contributions
| Contribution | Status | Links |
|---|---|---|
| Internal Image Drag-and-Drop Fix (PR #14774, resolves #13896) | CLOSED | Desktop: Resolves #13896 : Dragging-and-dropping an image within a Joplin note t… by p1961999 · Pull Request #14774 · laurent22/joplin · GitHub |
| Prevent index.ts rewrite on Windows (PR #14945, resolves #10933) | OPEN | Desktop: Resolves #10933: Prevent generated index.ts files from being rewritten on Windows by p1961999 · Pull Request #14945 · laurent22/joplin · GitHub |
| JEX Import Warning System (PR #14964, resolves #14791) | CLOSED | Make it clear that importing a jex backup can duplicate your notes · Issue #14791 · laurent22/joplin · GitHub |
Summary of Work
| Area | Description |
|---|---|
| Editor (TinyMCE) | Implemented correct internal drag-and-drop handling, DOM manipulation, and undo integration |
| Build System | Fixed cross-platform file rewriting issue using normalization and conditional writes |
| User Safety | Added JEX import warning system to prevent accidental data duplication |
Other Experience:
- React, Angular (Standalone Components, RxJS, UI architecture)
- JavaScript/TypeScript development
- Experience debugging cross-platform issues
2. Introduction
I am Prachi, a B.Tech graduate (2020) with 4+ years of professional experience across both frontend and backend development, specializing in JavaScript/TypeScript and modern web architectures. I have been actively contributing to the Joplin open-source project and have worked on real-world issues involving editor behavior, build system stability, and user data safety.
Contact Information
| Field | Details |
|---|---|
| Name | Prachi |
| Email | prachigupta19699@gmail.com |
| GitHub | p1961999 (Prachi Gupta) · GitHub |
| LinkedIn | https://www.linkedin.com/in/prachi-gupta-pg192106 |
| Location | Delhi, India |
| Degree | B.Tech (Graduated 2020) – GGSIPU |
Programming Experience
| Area | Technologies |
|---|---|
| Frontend | Angular, ReactJS, JavaScript, TypeScript, HTML, CSS |
| Backend | Node.js, Express, REST APIs |
| Databases | MongoDB, MySQL |
| Tools & Practices | Git, Debugging, API Integration, Cross-platform fixes |
Professional Experience
| Company | Role | Timeline |
|---|---|---|
| KentCam Technology | Frontend Developer | Dec 2021 – Apr 2024 |
| TripXL Pvt Ltd | Software Engineer (Full Stack Developer) | May 2024 – Present |
Professional Experience – Detailed Work
- KentCam Technology — Frontend Developer
Dec 2021 – Apr 2024
At KentCam Technology, I developed a Distributor Management System using React, focused on managing distributor operations and improving visibility into inventory and pricing.
- Built a React-based dashboard to manage import/export data across distributors.
- Implemented tracking for import volume and inventory flow.
- Developed modules to manage and display wholesale pricing data.
- Created reusable UI components and optimized state handling.
- Integrated APIs for real-time updates and ensured efficient data flow.
This system improved operational transparency and helped stakeholders make data-driven decisions.
- TripXL Pvt Ltd — Software Engineer (Full Stack Developer)
May 2024 – Present
At TripXL, I am working on a Flight Booking Customer Portal using React.
- Built UI for searching and booking flights.
- Developed components for flight listing, selection, and booking flow.
- Integrated APIs for real-time pricing and availability.
- Implemented validations and optimized user experience.
- Worked on frontend-backend integration for smooth data flow.
Open Source Experience
I have been actively contributing to Joplin and have worked on multiple issues including editor improvements, build system fixes, and user safety features. Through these contributions, I have learned how to navigate a large production-level codebase, collaborate via pull requests, and follow open-source standards.
3. Project Summary
Alignment with Joplin GSoC Ideas
My contributions and proposed work align closely with multiple Joplin GSoC focus areas, particularly:
- Improving Editor Stability and UX – through my work on fixing internal drag-and-drop behavior in the Rich Text editor (TinyMCE).
- Improving Development Workflow and Build Reliability – by resolving cross-platform inconsistencies in generated files on Windows.
- Improving Data Safety and User Experience – by implementing safeguards around JEX import behavior to prevent accidental data duplication.
These areas are directly connected to Joplin’s goals of improving reliability, usability, and contributor experience. My proposal builds upon these existing contributions and extends them into a more structured and comprehensive improvement effort.
Problem Statement
Joplin, while being a powerful and widely used note-taking application, currently faces several issues that affect usability, reliability, and user trust:
- Editor Behavior Issues: The Rich Text editor (TinyMCE) does not properly differentiate between internal and external drag-and-drop operations. This leads to incorrect handling of image elements, resulting in duplicated content, broken formatting, and unintended HTML insertions.
- Cross-Platform Build Inconsistencies: On Windows systems, generated files such as index.ts are rewritten during build processes even when there are no logical changes. This creates unnecessary Git diffs and complicates development workflows.
- User Data Safety Risks: Many users misunderstand the behavior of JEX imports, assuming that importing replaces existing notes. Instead, it duplicates data, which can lead to large-scale duplication and synchronization issues across devices.
Importance of the Problem
Joplin is widely used by individuals who rely on it for managing large volumes of personal, academic, and professional notes. As the number of notes grows, organizing them manually becomes increasingly difficult and time-consuming. Users often end up with poorly structured notebooks or inconsistent tagging, which reduces the effectiveness of search and retrieval. This directly impacts productivity, as users spend more time locating information rather than utilizing it.
Another major challenge is that Joplin currently depends heavily on manual categorization. While this approach offers flexibility, it does not scale well for users with hundreds or thousands of notes. Many users either avoid organizing their notes altogether or apply inconsistent tagging strategies, leading to cluttered and inefficient data structures. This highlights the need for an intelligent system that can assist users in organizing their data without taking away control.
Additionally, with the increasing adoption of AI-powered productivity tools, users expect smarter features such as automatic categorization, semantic grouping, and intelligent suggestions. Without such capabilities, Joplin risks falling behind modern user expectations. However, implementing AI features must be done carefully to ensure user trust, data privacy, and non-destructive behavior, which makes this problem both important and technically challenging.
From a maintainer’s perspective, improving note organization reduces user frustration, decreases support requests, and enhances the overall usability of the application. A well-designed solution can also serve as a foundation for future intelligent features such as semantic search, summarization, and recommendation systems.
Proposed Solution
The proposed solution is to design and implement an AI-powered conversational interface for Joplin, allowing users to interact with their note collection using natural language. The system will be built as a modular Joplin plugin and will leverage a Retrieval-Augmented Generation (RAG) pipeline to provide accurate, context-aware responses.
1. Note Ingestion Layer
- Extract notes using Joplin Data API
- Collect:
- Note content
- Titles
- Metadata (tags, notebooks)
- Handle edge cases like empty or large notes
2. Preprocessing & Chunking
- Clean and normalize note content
- Split long notes into semantic chunks
- Preserve structure (headings, lists, sections)
- Maintain mapping between chunks and original notes
3. Embedding Generation
- Convert note chunks into vector embeddings
- Support:
- Local models (offline capability)
- API-based embedding providers
- Cache embeddings to avoid recomputation
4. Vector Storage & Retrieval
- Store embeddings in a searchable vector index
- Convert user query into embedding
- Retrieve top-k relevant note chunks
- Rank and filter results for relevance
5. RAG-Based Answer Generation
- Combine:
- User query
- Retrieved note context
- Generate answers using LLM
- Ensure answers are:
- Context-aware
- Grounded in user notes
- Provide source references for transparency
6. Conversational Chat Interface
- Add AI Chat panel inside Joplin UI
- Features:
- Ask questions about notes
- Follow-up queries
- Multi-turn conversation
- Designed similar to ChatGPT but scoped to user data
7. Source Citation Panel
- Display notes used in generating answers
- Allow users to:
- View relevant excerpts
- Open original notes
- Improves trust and explainability
8. AI Categorization Enhancement (Extension)
- Use embeddings for semantic clustering of notes
- Apply clustering algorithms (e.g., K-Means)
- Generate:
- Suggested tags
- Notebook groupings
- Provide suggestions in a review-first UI
9. Performance & Scalability
- Batch processing of notes
- Efficient embedding caching
- Optimized retrieval for large datasets
- Designed to handle large knowledge bases
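The chunking step (item 2 above) could look roughly like the following. This is a minimal sketch, assuming notes are Markdown and that splitting at headings with a character cap is an acceptable first strategy; `NoteInput`, `Chunk`, `chunkNote`, and the `maxChars` default are hypothetical names chosen for illustration, not part of Joplin's API.

```typescript
// Hypothetical shapes for illustration; not part of Joplin's API.
interface NoteInput { id: string; title: string; body: string }
interface Chunk { noteId: string; index: number; heading: string; text: string }

// Split a Markdown note at headings, then cap each section at maxChars.
// Every chunk keeps the id of the note it came from, so a retrieved chunk
// can always be mapped back to its original note.
function chunkNote(note: NoteInput, maxChars = 800): Chunk[] {
  const chunks: Chunk[] = [];
  let heading = note.title; // text before the first heading is filed under the title
  let buffer: string[] = [];

  const flush = () => {
    const text = buffer.join("\n").trim();
    buffer = [];
    // Oversized sections are cut into fixed-size windows; empty ones yield nothing.
    for (let start = 0; start < text.length; start += maxChars) {
      chunks.push({
        noteId: note.id,
        index: chunks.length,
        heading,
        text: text.slice(start, start + maxChars),
      });
    }
  };

  for (const line of note.body.split(/\r?\n/)) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      flush(); // close the previous section under its own heading
      heading = match[1].trim();
    } else {
      buffer.push(line);
    }
  }
  flush();
  return chunks;
}
```

An empty note produces no chunks, and a long section is split into several chunks that share one heading. A production version would likely split on sentence or paragraph boundaries rather than raw character offsets.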
The system introduces a chat-based interaction layer over Joplin’s note collection, enabling users to retrieve and understand information more efficiently than traditional search. By using a RAG pipeline, the system ensures that answers are grounded in actual user notes rather than generic responses.
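To make the retrieval step concrete, the sketch below ranks stored chunks against a query embedding using cosine similarity and returns the top-k matches. It assumes a brute-force in-memory index, which may be adequate for small collections but is not necessarily what the final plugin would use; `IndexedChunk`, `ScoredChunk`, `retrieveTopK`, and `minScore` are hypothetical names.

```typescript
// Hypothetical in-memory index entry; the embedding comes from whichever
// provider (local model or API) is configured.
interface IndexedChunk { noteId: string; text: string; embedding: number[] }
interface ScoredChunk { chunk: IndexedChunk; score: number }

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk, drop those below minScore, and keep the k best.
function retrieveTopK(
  query: number[],
  index: IndexedChunk[],
  k: number,
  minScore = 0,
): ScoredChunk[] {
  return index
    .map(chunk => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .filter(result => result.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The `noteId` carried on each result is what would let the source citation panel link an answer back to the notes it was drawn from.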
The inclusion of an AI categorization layer further enhances usability by organizing notes semantically, improving both retrieval accuracy and long-term knowledge management.
A strong emphasis is placed on user control, transparency, and non-destructive behavior, ensuring that AI features augment the user experience without compromising trust.
Expected Outcome
The expected outcome of this project is a robust, scalable, and user-friendly AI-powered system that enables users to interact with their Joplin notes through a conversational interface while improving overall note organization and discoverability.
1. Conversational Access to Notes
- Users can ask questions about their notes in natural language
- Receive context-aware answers generated from their own data
- Support for follow-up queries (multi-turn conversations)
2. Improved Information Retrieval
- Replace manual searching with semantic retrieval (RAG-based)
- Faster access to relevant information across multiple notes
- Ability to retrieve insights even without exact keyword matches
3. Source Transparency & Trust
- Display source notes used in generating answers
- Allow users to verify and navigate to original content
- Ensure explainability and reliability of AI responses
4. Reduced Manual Effort
- Minimize the need to manually search and organize notes
- Automate discovery of relevant content
- Save time for users with large knowledge bases
5. AI-Powered Categorization (Enhancement)
- Automatically group notes using semantic similarity
- Suggest:
- Tags
- Notebook structures
- Improve long-term organization and discoverability
6. User Control & Data Safety
- Non-destructive workflow
- All AI-generated suggestions require user approval
- No automatic modification of notes
- Maintain user trust and data integrity
7. Scalable & Efficient System
- Designed to handle large note collections
- Optimized embedding, retrieval, and indexing pipeline
- Efficient performance across different system environments
8. Cross-Platform Compatibility
- Works seamlessly with Joplin’s existing architecture
- Compatible across desktop environments (Windows, macOS, Linux)
9. Extensible Architecture
- Modular plugin design
- Enables future enhancements such as:
- Semantic search
- Note summarization
- Recommendation systems
- AI-assisted workflows
10. Maintainer-Friendly Design
- Minimal impact on core Joplin codebase
- Clean separation of concerns
- Easy to maintain, extend, and integrate
This project transforms Joplin from a traditional note-taking application into an intelligent knowledge system. By introducing a conversational interface powered by a RAG pipeline, users can interact with their notes in a more natural and efficient way.
At the same time, the addition of AI-driven categorization enhances how notes are structured and discovered over time. The system ensures a balance between automation and user control by maintaining a review-first, non-destructive workflow.
Overall, the project significantly improves usability, productivity, and scalability while aligning with Joplin’s open-source principles and long-term vision for intelligent note management.
4. Technical Approach
System Workflow (High-Level)
Explanation: The system extracts notes using the Joplin Data API, converts them into embeddings, and groups them using clustering algorithms. Suggested categories are generated via summarization and presented to the user for approval before applying any changes, ensuring a non-destructive workflow.
Implementation Details
4.1 Architecture Overview
Joplin is built using:
- Electron for the desktop application
- React for UI
- TinyMCE for Rich Text editing
- Node.js-based tooling for build processes
The application follows a modular architecture where UI, services, and utilities are clearly separated.
Relevance of My Previous Contributions
My prior contributions directly support this project:
- PR #14774 (Editor Drag-and-Drop): Deep understanding of TinyMCE, DOM handling, and note structure, which is useful for extracting clean note content for AI processing.
- PR #14945 (Build Stability): Experience with cross-platform consistency and tooling, which is important for handling AI dependencies and reproducible behavior.
- Issue #14791 (Import Warning): Focus on user safety, reflected here via a review-first, non-destructive AI workflow.
Validation Strategy
- Validate embedding quality (semantic similarity accuracy)
- Benchmark clustering performance on real notes
- Evaluate memory and runtime constraints
4.2 Editor Behavior Improvements
The technical approach for the editor-related work is to make drag-and-drop handling more context-aware instead of relying entirely on the default TinyMCE behavior. At present, internal image movement inside the Rich Text editor can be interpreted like a generic HTML or file drop, which causes incorrect insertion of content. To address this, I will intercept the relevant drag-and-drop events, identify whether the dragged content is an image already present within the editor, and then handle that case separately.
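The decision described above can be separated from the DOM code as a small classification function. This is only a sketch under stated assumptions: `DropContext`, its fields, and `classifyDrop` are hypothetical, and the real handler would populate these values from TinyMCE's drag events rather than receive them ready-made.

```typescript
// Minimal view of what a drop handler would inspect.
// Field names are illustrative only; they are not TinyMCE's API.
interface DropContext {
  draggedTagName: string | null;    // tag of the element where the drag started, if known
  dragStartedInsideEditor: boolean; // true when the drag began within the editor body
  fileCount: number;                // number of files carried by the drop payload
}

type DropKind = "internal-image-move" | "external-file" | "default";

function classifyDrop(ctx: DropContext): DropKind {
  // Checked first, in case an in-editor image drag also carries a file payload.
  if (ctx.dragStartedInsideEditor && ctx.draggedTagName?.toUpperCase() === "IMG") {
    return "internal-image-move";
  }
  if (ctx.fileCount > 0) return "external-file";
  // Anything else falls through to the editor's default handling.
  return "default";
}
```

Only the `internal-image-move` case would take the custom path that relocates the existing image node; the other two keep today's behavior.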
4.3 Feature Design & User Interaction Flow
The following table describes the proposed features, user interactions, and UI enhancements compared to the current Joplin interface. This helps visualize how the system improves usability while maintaining consistency with the existing layout.
4.3.1 AI Chat with Notes
| Title | User Story Description (role, goal, motivation) | List of tasks needed to achieve the goal (User Journey) | Links to mocks / prototype |
|---|---|---|---|
| AI Chat Integration | As a Joplin user, I need an AI-powered chat interface so that I can ask questions about my notes and quickly retrieve insights without manually searching through multiple notes | 1. Open a note in Joplin<br>2. Switch to AI Chat tab beside editor<br>3. Enter a question<br>4. System retrieves relevant note chunks<br>5. AI generates contextual answer<br>6. User can ask follow-up questions | (mockup image: ai_chat_figma) |
4.3.2 Source Citation Panel
| Title | User Story Description | List of tasks | Links to mocks |
|---|---|---|---|
| Source Transparency | As a user, I want to see which notes were used in the AI response so that I can verify and trust the answer | 1. Ask a question in chat<br>2. View the AI response in chat<br>3. Open Sources panel<br>4. Click note reference<br>5. Navigate to original note | (mockup image: ChatGPT Image Mar 31, 2026, 01_00_41 PM) |
4.4 Build System Improvements
For the build tooling issue, the technical approach is to make generated file writing deterministic across operating systems. The root cause of the problem is not a logical content change, but inconsistent line endings and unconditional file writes. To solve this, I will update the utility responsible for inserting generated content into files so that it first normalizes line endings, then compares the normalized new content with the existing file content, and only writes the file if there is a real change.
This approach keeps generated files stable, avoids unnecessary rewrites on Windows, and prevents contributors from seeing false file modifications after running setup or build commands. The change is intentionally minimal and localized so that it improves reliability without affecting the broader build pipeline.
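The normalize-compare-write sequence can be sketched as follows. The file access is passed in through a small interface so the logic stays testable; `FileIo`, `normalizeEol`, `needsWrite`, and `writeIfChanged` are hypothetical names, and writing LF-normalized output is an assumption rather than a description of the actual utility in the Joplin repository.

```typescript
// Hypothetical file access wrapper; read returns null when the file is missing.
interface FileIo {
  read(path: string): string | null;
  write(path: string, content: string): void;
}

// Convert CRLF and lone CR line endings to LF.
function normalizeEol(text: string): string {
  return text.replace(/\r\n?/g, "\n");
}

// A write is needed only when the file is missing or differs after normalization.
function needsWrite(existing: string | null, generated: string): boolean {
  if (existing === null) return true;
  return normalizeEol(existing) !== normalizeEol(generated);
}

// Returns true when the file was written, false when it was left untouched.
function writeIfChanged(io: FileIo, path: string, generated: string): boolean {
  if (!needsWrite(io.read(path), generated)) return false;
  io.write(path, normalizeEol(generated));
  return true;
}
```

With this shape, a file that differs from the generated content only by CRLF versus LF is never rewritten, which is the case that produced the false Git diffs on Windows.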
4.5 Import Warning System
For the import-related safety issue, the technical approach is centered on improving user guidance at the point where mistakes usually happen. Before the JEX import flow proceeds, the application will display a warning dialog explaining that importing is intended for adding or restoring notes and that syncing imported notes with an already-synced profile may create duplicates. This warning should appear before the file picker opens so that users understand the consequence before continuing.
To avoid making the experience repetitive, I will store the visibility state of this prompt in a private per-profile setting so that it is shown only the first time unless future product decisions require otherwise. In addition to the desktop prompt, I will add the same warning message to the relevant mobile configuration screen and update the welcome note documentation so that the guidance is visible both during action and in reference material. This ensures the fix is not only technical but also educational for users.
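A minimal sketch of the show-once gating is given below. It assumes a simple boolean setting store and synchronous callbacks for the dialog and file picker; in the real application both would be asynchronous. `SettingStore`, `WARNING_SHOWN_KEY`, and `runImportWithWarning` are hypothetical names and do not correspond to Joplin's actual settings API. Recording the flag only after the user confirms is also an assumption about the desired behavior.

```typescript
// Hypothetical per-profile boolean setting store.
interface SettingStore {
  get(key: string): boolean;
  set(key: string, value: boolean): void;
}

// Hypothetical setting key, used for illustration only.
const WARNING_SHOWN_KEY = "importJexWarningShown";

// Shows the warning before the file picker on first use.
// Returns false when the user cancels at the warning, true otherwise.
function runImportWithWarning(
  settings: SettingStore,
  confirmWarning: () => boolean, // returns true when the user chooses to continue
  openFilePicker: () => void,
): boolean {
  if (!settings.get(WARNING_SHOWN_KEY)) {
    if (!confirmWarning()) return false; // cancelled: the picker never opens
    settings.set(WARNING_SHOWN_KEY, true); // remember only after acknowledgement
  }
  openFilePicker();
  return true;
}
```

Because the flag is set only on confirmation, a user who cancels will see the warning again on the next attempt.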
4.6 Challenges
• Handling TinyMCE behavior safely
• Maintaining backward compatibility
• Avoiding intrusive UX
4.7 Testing Strategy
The testing strategy will combine manual verification, regression testing, and targeted validation of edge cases. For the editor work, I will test different drag-and-drop scenarios such as moving images between paragraphs, headings, and other formatted content, while ensuring that no duplicate nodes or unintended wrappers are introduced. I will also verify that undo and redo continue to work correctly after the manual handling logic is added.
For the build-related changes, I will validate behavior on Windows and compare results with non-Windows environments to ensure generated files are not rewritten unnecessarily. For the import warning system, I will test first-time visibility, persistence of the setting, and the correctness of the warning flow across desktop and mobile UI contexts.
- Unit Testing
I will add unit tests for:
• note preprocessing utilities
• chunking logic
• prompt-building logic
• ranking / filtering helpers
• categorization helpers
- Integration Testing
I will test:
• note ingestion from Joplin APIs
• embedding generation flow
• vector index creation and updates
• retrieval pipeline from question to context selection
• UI state transitions in the chat interface
- End-to-End Testing
I will validate full user journeys such as:
• ingesting notes
• asking a question
• retrieving relevant chunks
• generating a grounded answer
• continuing with a follow-up question
• reviewing categorization suggestions
- Retrieval Quality Testing
To test whether retrieval is working well, I will prepare controlled note sets and evaluate:
• whether top retrieved chunks actually contain the answer
• whether semantically related notes are surfaced for paraphrased questions
• whether duplicate/noisy chunks are filtered out effectively
Possible metrics and checks:
• top-k relevance checks
• manual precision review on sample question sets
• comparison of retrieval results for exact-match vs semantic queries
- Answer Grounding Testing
For the generation step, I will test:
• whether the answer is based on retrieved context
• whether unsupported claims appear when context is insufficient
• whether the UI can show source notes to support the answer
This can be validated by creating benchmark question sets where the expected supporting note is already known.
- Categorization Testing
For the AI categorization layer, I will test:
• cluster coherence on curated note groups
• quality of suggested labels
• usefulness of recommendations from a user perspective
• stability of clustering as note count increases
Validation approach:
• small manually labeled note collections
• representative-note inspection per cluster
• measuring whether grouped notes are actually topically related
- Performance and Scalability Testing
I will test behavior on:
• small note collections
• medium collections
• large clipped or long-form note collections
This includes monitoring:
• ingestion time
• embedding generation time
• retrieval latency
• response latency in chat
• memory overhead for indexing and clustering
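For the top-k relevance checks listed under Retrieval Quality Testing, one simple measure is the fraction of benchmark questions whose known supporting note appears among the first k retrieved results. The sketch below assumes each benchmark case records a single expected note; `BenchmarkCase` and `hitRateAtK` are hypothetical names.

```typescript
// One benchmark question with its known supporting note and the
// note ids returned by retrieval, best match first.
interface BenchmarkCase {
  question: string;
  expectedNoteId: string;
  retrievedNoteIds: string[];
}

// Fraction of cases whose expected note is within the first k results.
function hitRateAtK(cases: BenchmarkCase[], k: number): number {
  if (cases.length === 0) return 0;
  const hits = cases.filter(
    c => c.retrievedNoteIds.slice(0, k).indexOf(c.expectedNoteId) !== -1,
  ).length;
  return hits / cases.length;
}
```

Running the same question set as exact-match and paraphrased variants and comparing the two hit rates would give a rough signal of how much the semantic retrieval adds.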
4.9 Documentation Plan
The documentation for this project will be developed in parallel with the implementation, following a structured and incremental approach aligned with the weekly development phases. During the initial weeks (Week 1–4), the focus will be on documenting the overall system design, including the AI pipeline, data flow, and architectural decisions. This will include diagrams such as the embedding pipeline, clustering workflow, and plugin integration within Joplin. Early documentation will help validate design decisions with mentors and ensure clarity before full-scale implementation begins.
During the core development phase (Week 5–8), documentation will expand to include detailed explanations of each module, such as note extraction, pre-processing, embedding generation, clustering logic, and LLM-based labelling. Each component will be documented with clear descriptions of responsibilities, data flow, and integration points within the Joplin plugin system. Inline code documentation will also be maintained consistently to ensure that the codebase remains readable and maintainable for future contributors.
In the later stages (Week 9–11), the focus will shift toward user-facing documentation and usability guidance. This will include instructions on how to use the AI categorization feature, how to interpret suggestions, and how to safely apply or reject changes. Special attention will be given to explaining the non-destructive workflow and user control mechanisms, ensuring that users clearly understand how their data is handled. Additionally, troubleshooting steps and known limitations will be documented to improve user confidence and reduce support overhead.
Finally, during the last phase (Week 12), all documentation will be consolidated, reviewed, and refined based on mentor feedback. This includes ensuring consistency across developer and user documentation, improving clarity, and aligning with Joplin’s official documentation standards. The final deliverable will include comprehensive developer documentation, user guides, and well-commented code, making the feature easy to maintain, extend, and adopt within the Joplin ecosystem.
5. Implementation Plan
The implementation will be carried out in a phased and iterative manner across the GSoC coding period. The goal is to reduce technical risk early, keep mentor feedback integrated throughout development, and ensure that each phase produces a stable and reviewable output. Since this project involves note processing, embeddings, clustering, labeling, and a user-facing review interface, the work will be divided into logical milestones so that core functionality is validated before moving to optimization and polish.
Weekly Timeline
Week 1–2 (Community Bonding & Design): During the initial phase, I will focus on finalizing the technical design of the plugin in consultation with mentors and the community. This includes a deeper study of Joplin’s plugin architecture, note data APIs, React-based UI integration, and any constraints around storage, performance, and user interaction. I will review existing plugin patterns in Joplin and prepare the project structure for modular development.
In parallel, I will validate the design assumptions for the AI pipeline. This includes deciding the initial preprocessing strategy, comparing local embedding options versus API-based embedding generation, and defining a first dataset strategy for testing on sample notes. By the end of this phase, I aim to have a finalized architecture document, flow diagrams, module boundaries, and a clear technical roadmap approved by mentors.
Week 3 (Note Extraction and Data Preparation): In this phase, I will begin implementation of the data ingestion layer. The primary task will be to fetch note content through Joplin’s Data API and prepare it for downstream processing. This includes selecting which note fields to use, handling metadata, and identifying edge cases such as empty notes, duplicate notes, notes with only titles, and large note bodies.
I will also implement pre-processing routines such as text cleaning, normalization, and chunk preparation where required. The objective of this week is to create a reliable input pipeline so that note data can be consistently transformed into a format suitable for embedding generation. At the end of this phase, I expect to have a working extraction pipeline that can process notes from a Joplin profile and output structured intermediate data for the embedding stage.
Week 4 (Embedding Integration and Initial Validation): Once note extraction is stable, I will integrate the embedding generation layer. The focus here will be to convert note content into meaningful vector representations. I will begin with a modular design so that the system can support both local models and API-based providers where feasible. This week will also involve benchmarking different embedding approaches for quality, performance, and developer usability.
To reduce future risk, I will perform early validation on a controlled set of notes and compare whether semantically similar notes produce meaningful closeness in embedding space. This step is critical because the quality of clustering depends heavily on the quality of embeddings. By the end of the week, I expect to have a repeatable embedding pipeline with test outputs and performance observations documented.
Week 5–6 (Clustering Engine and Group Formation): This phase will focus on implementing the clustering layer that groups related notes based on their embeddings. I plan to start with K-Means clustering because it is relatively simple, interpretable, and suitable for initial experimentation. During this stage, I will test different values of K and analyze how cluster quality changes across note collections of different sizes and themes.
In addition to implementing the clustering logic, I will add validation utilities to inspect the coherence of each cluster. This may include selecting representative notes near centroids and examining whether groupings are practically meaningful. The expected result of this stage is a working grouping system that can cluster notes into candidate categories with reasonable semantic consistency.
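The clustering and representative-note selection could be prototyped along these lines. This is a sketch, not the planned implementation: it seeds the centroids with the first k points so that results are deterministic, whereas a practical version would more likely use random or k-means++ initialization, and all function names are hypothetical.

```typescript
type Vector = number[];

function squaredDistance(a: Vector, b: Vector): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return sum;
}

// Index of the centroid closest to the point (ties go to the lower index).
function nearestCentroid(point: Vector, centroids: Vector[]): number {
  let best = 0;
  for (let c = 1; c < centroids.length; c++) {
    if (squaredDistance(point, centroids[c]) < squaredDistance(point, centroids[best])) best = c;
  }
  return best;
}

function kMeans(
  points: Vector[],
  k: number,
  maxIterations = 50,
): { assignments: number[]; centroids: Vector[] } {
  // Simplification for this sketch: seed with the first k points.
  let centroids = points.slice(0, k).map(p => [...p]);
  let assignments = points.map(p => nearestCentroid(p, centroids));
  for (let iter = 0; iter < maxIterations; iter++) {
    // Update step: move each centroid to the mean of its members.
    centroids = centroids.map((old, c) => {
      const members = points.filter((_, i) => assignments[i] === c);
      if (members.length === 0) return old; // an empty cluster keeps its centroid
      return old.map((_, d) => members.reduce((s, m) => s + m[d], 0) / members.length);
    });
    // Assignment step: stop once no point changes cluster.
    const next = points.map(p => nearestCentroid(p, centroids));
    const changed = next.some((a, i) => a !== assignments[i]);
    assignments = next;
    if (!changed) break;
  }
  return { assignments, centroids };
}

// Representative note for a cluster: the member closest to its centroid.
// Returns -1 when the cluster has no members.
function representativeIndex(
  points: Vector[],
  assignments: number[],
  centroids: Vector[],
  cluster: number,
): number {
  let best = -1;
  for (let i = 0; i < points.length; i++) {
    if (assignments[i] !== cluster) continue;
    if (best === -1 ||
        squaredDistance(points[i], centroids[cluster]) < squaredDistance(points[best], centroids[cluster])) {
      best = i;
    }
  }
  return best;
}
```

Running `kMeans` for several values of K and inspecting the representative note of each cluster gives a quick manual check of whether the groupings are meaningful.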
Week 7 (Cluster Labeling and Suggestion Generation): After cluster formation is working, the next step will be to generate human-readable suggestions from those clusters. This includes selecting representative notes from each group and using an LLM-based labeling strategy to generate suggested tag names or notebook names. The focus here is not only technical correctness but also usability: the generated category names should be understandable, concise, and relevant to the note groups they represent.
This stage will also include safeguards to avoid low-quality or overly generic labels. I will evaluate prompt structure, representative-note selection, and output formatting so that the suggestions remain useful and consistent. By the end of this week, the system should be able to produce suggested categories for clustered notes in a way that is ready to be surfaced in the UI.
Week 8–9 (User Review Panel and Interaction Flow): Once categorization suggestions are generated successfully, I will implement the user-facing review panel inside the Joplin plugin interface. This panel will allow users to inspect generated clusters, review proposed labels, and selectively approve, reject, or modify the suggestions before anything is applied. This is one of the most important phases of the project because the system must remain non-destructive and user-controlled.
The UI will be designed to clearly show what changes are being suggested and how those changes will affect notes. I will focus on clarity, usability, and responsiveness. At the end of this phase, the plugin should support the complete interaction cycle: extract notes, process them, generate groups and labels, and present them to the user in a review-first workflow.
Week 10 (Applying Approved Changes and End-to-End Integration): In this phase, I will connect the review UI with the final execution layer that applies user-approved tags or notebook assignments through Joplin APIs. The emphasis will be on making this step safe, predictable, and easy to audit. I will also ensure that rejected suggestions are ignored cleanly and that partially approved batches can still be applied without breaking the workflow.
This week will also serve as the first full end-to-end integration milestone. By this point, all core components should work together as one complete system. I will test the end-to-end pipeline on realistic note collections and fix any issues related to state management, UI coordination, or data application logic.
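The execution layer for approved suggestions might be structured as below, so that rejected or undecided items are never touched and one failure does not block the rest of the batch. This is a sketch under assumptions: `TagSuggestion`, `ApplyResult`, and `applyApproved` are hypothetical, and `applyTag` stands in for whatever Joplin API call performs the change, which in practice would be asynchronous.

```typescript
type Decision = "approved" | "rejected" | "pending";

// Hypothetical suggestion record produced by the review panel.
interface TagSuggestion { noteId: string; tag: string; decision: Decision }

interface ApplyResult {
  applied: TagSuggestion[]; // approved and successfully applied
  skipped: TagSuggestion[]; // rejected or still pending; never applied
  failed: TagSuggestion[];  // approved, but the apply call threw
}

// applyTag is a placeholder for the real API call.
function applyApproved(
  suggestions: TagSuggestion[],
  applyTag: (noteId: string, tag: string) => void,
): ApplyResult {
  const result: ApplyResult = { applied: [], skipped: [], failed: [] };
  for (const suggestion of suggestions) {
    if (suggestion.decision !== "approved") {
      result.skipped.push(suggestion);
      continue;
    }
    try {
      applyTag(suggestion.noteId, suggestion.tag);
      result.applied.push(suggestion);
    } catch {
      // Record the failure and keep going so the rest of the batch still applies.
      result.failed.push(suggestion);
    }
  }
  return result;
}
```

Returning the three lists separately gives the UI enough information to show the user exactly what was changed, which supports the auditability goal.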
Week 11 (Optimization, Edge Cases, and Stability Improvements): With the core functionality complete, I will dedicate this phase to improving performance and robustness. This includes optimizing processing for larger note collections, reducing unnecessary recomputation, improving batching, and refining storage or caching where useful. I will also address important edge cases such as multilingual notes, very short notes, noisy content, or highly overlapping note topics.
This phase is also where I will strengthen the overall stability of the plugin by refining internal error handling, improving fallback behavior, and reducing confusing outputs. The objective is to ensure that the feature is not only functional but also reliable enough for practical use by Joplin users.
Week 12 (Testing, Documentation, and Final Submission): The final phase will be dedicated to comprehensive testing, documentation, and final cleanup. I will conduct end-to-end testing across multiple scenarios, verify the accuracy and usability of the generated suggestions, and ensure that the plugin behaves correctly across supported environments. Any remaining feedback from mentors will be incorporated during this phase.
I will also finalize both developer-facing and user-facing documentation, including architecture notes, setup instructions, usage guidance, limitations, and future extension points. The final deliverables for this week will include cleaned-up code, updated documentation, finalized test coverage where practical, and submission-ready pull requests.
The implementation will be carried out in a structured and iterative manner over the GSoC coding period, ensuring continuous feedback and stability.
6. Deliverables
At the end of the GSoC period, the following deliverables will be implemented as working, tested, and documented outputs. Every item listed below is marked Required, and together they represent the minimum successful outcome.
Editor Improvements
| Deliverable | Description | Type |
|---|---|---|
| Internal drag-and-drop handling | Correct detection of internal image drag operations in the TinyMCE editor | Required |
| Controlled DOM movement | Move only the intended image node instead of inserting HTML fragments | Required |
| Cleanup mechanism | Remove empty or redundant parent elements after drag operations | Required |
| Undo/Redo integration | Integrate with TinyMCE undo manager to preserve editing workflow | Required |
| Edge case handling | Support nested elements, multiple images, and mixed formatting scenarios | Required |
Build System Stability
| Deliverable | Description | Type |
|---|---|---|
| Line ending normalization | Normalize line endings across platforms before file comparison | Required |
| Conditional file writing | Prevent rewriting files when content has not changed | Required |
| UTF-8 consistency | Ensure consistent encoding during file writes | Required |
| Cross-platform validation | Verify behavior across Windows, Linux, and macOS environments | Required |
| Developer workflow improvement | Eliminate unnecessary Git diffs caused by generated files | Required |
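The first two rows of this table (line ending normalization and conditional file writing) can be illustrated with a short sketch. It is a simplified illustration under stated assumptions, not the code from the pull request: `FileIO`, `normalizeEol`, and `writeIfChanged` are hypothetical names, and file access is abstracted behind an interface so the comparison logic can be shown on its own. In a Node.js build script, `read` and `write` would typically wrap `fs.readFileSync` and `fs.writeFileSync` with the `'utf8'` encoding.

```typescript
// Hypothetical file access abstraction for this sketch.
interface FileIO {
  read(path: string): string | null; // null when the file does not exist
  write(path: string, content: string): void;
}

// Convert CRLF and lone CR line endings to LF.
function normalizeEol(text: string): string {
  return text.replace(/\r\n?/g, '\n');
}

// Writes the generated content only when it differs from what is on disk,
// ignoring line ending differences. Returns true when a write happened.
function writeIfChanged(io: FileIO, path: string, generated: string): boolean {
  const next = normalizeEol(generated);
  const current = io.read(path);
  if (current !== null && normalizeEol(current) === next) {
    return false; // same content, so leave the file untouched
  }
  io.write(path, next);
  return true;
}
```

Because an existing file with CRLF endings compares equal to freshly generated LF content, the file is not rewritten and no spurious Git diff appears. Writing LF when the content really does change is a choice made for this sketch.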
Import Warning System (User Safety)
| Deliverable | Description | Type |
|---|---|---|
| Pre-import warning dialog | Display warning before JEX import explaining duplication risks | Required |
| Persistent setting | Show warning only once per profile using internal configuration | Required |
| Desktop integration | Integrate warning into File → Import workflow before file selection | Required |
| Mobile UI update | Add warning message to mobile configuration screen | Required |
| Documentation update | Update welcome note and help documentation for clarity | Required |
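The "show once per profile" behavior of the warning can be sketched as a small gate in front of the import flow. The setting key, the `SettingStore` interface, and the `showDialog` callback below are all hypothetical placeholders for illustration and do not reflect Joplin's actual setting names or dialog API.

```typescript
// Hypothetical per-profile settings abstraction.
interface SettingStore {
  get(key: string): boolean;
  set(key: string, value: boolean): void;
}

// Placeholder key name, not an actual Joplin setting.
const JEX_WARNING_SHOWN_KEY = 'example.jexImportWarningShown';

// Returns true when the import should proceed.
// `showDialog` returns true if the user chooses to continue.
function confirmJexImport(settings: SettingStore, showDialog: () => boolean): boolean {
  if (settings.get(JEX_WARNING_SHOWN_KEY)) {
    return true; // already acknowledged in this profile
  }
  const proceed = showDialog();
  if (proceed) {
    settings.set(JEX_WARNING_SHOWN_KEY, true);
  }
  return proceed;
}
```

In this sketch the flag is stored only after the user chooses to continue, so cancelling means the warning appears again next time. Whether to record the flag on first display instead is a design choice to settle with mentors.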
AI Categorization System (Extended Contribution)
| Deliverable | Description | Type |
|---|---|---|
| Note extraction pipeline | Fetch notes via Joplin Data API with batching support | Required |
| Text preprocessing module | Clean, normalize, and prepare note content for embeddings | Required |
| Clustering engine | Group notes using K-Means clustering algorithm | Required |
| LLM-based labeling | Generate tag and notebook suggestions from clusters | Required |
| Review panel UI | Allow users to review, accept, or reject suggestions | Required |
| Non-destructive workflow | Ensure no automatic changes without user approval | Required |
| Caching & optimization | Improve performance for large note collections | Required |
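For the note extraction deliverable, batched fetching could follow the page-based pattern sketched below. The Joplin Data API returns paginated results with `items` and `has_more` fields. The helper `forEachBatch` and its parameters are hypothetical names, and the page fetcher is injected so the sketch does not depend on the plugin runtime.

```typescript
// Shape of one paginated response (items plus a has_more flag).
interface Page<T> {
  items: T[];
  has_more: boolean;
}

// Hypothetical helper: walks all pages and hands each batch to `onBatch`.
// Returns the total number of items processed.
async function forEachBatch<T>(
  fetchPage: (page: number) => Promise<Page<T>>,
  onBatch: (items: T[]) => Promise<void> | void,
): Promise<number> {
  let page = 1; // pages are 1-based
  let total = 0;
  for (;;) {
    const result = await fetchPage(page);
    if (result.items.length > 0) {
      await onBatch(result.items);
      total += result.items.length;
    }
    if (!result.has_more) break;
    page += 1;
  }
  return total;
}
```

Inside a plugin, `fetchPage` might be written as `(page) => joplin.data.get(['notes'], { fields: ['id', 'title', 'body'], page, limit: 100 })`. The field list and page size shown here are assumptions. Processing one batch at a time keeps memory use bounded for large note collections.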
Testing & Validation
| Deliverable | Description | Type |
|---|---|---|
| Manual test coverage | Validate editor, build, and AI workflows across scenarios | Required |
| Regression testing | Ensure no existing functionality is broken | Required |
| Cross-platform testing | Validate behavior on Windows and non-Windows systems | Required |
| Edge case validation | Test extreme cases like large notes and nested content | Required |
| Automated tests | Add unit/integration tests where applicable | Required |
Documentation & Final Output
| Deliverable | Description | Type |
|---|---|---|
| User documentation | Guide for new features and workflows | Required |
| Developer documentation | Architecture and implementation details | Required |
| Demo video | Demonstration of implemented features | Required |
| Inline code documentation | Maintain readability and future maintainability | Required |
| Final PR submissions | Clean, review-ready pull requests | Required |
7. Availability
I am fully available for the entire GSoC 2026 coding period and will treat this program as a full-time commitment. I do not have any competing employment, internship, or academic obligations during this period. My focus will be entirely on delivering high-quality contributions to Joplin while actively engaging with mentors and the community.
I strongly believe in maintaining transparent and consistent communication throughout the development cycle. If I encounter any technical blockers or uncertainties, I will raise them immediately on the Joplin forum or Discord to ensure timely resolution. Additionally, I will maintain regular progress updates so that mentors can continuously track development and provide feedback at every stage of the project.
To ensure structured collaboration, I will follow a disciplined communication and reporting approach. This includes weekly progress reports, early pull request submissions for incremental feedback, and regular mentor interactions. My goal is to keep the development process iterative, visible, and aligned with mentor expectations.
Availability Details
| Item | Details |
|---|---|
| Weekly availability | 30-32 hours per week during the coding period |
| Time zone | IST — UTC+5:30 (India) |
| Work management approach | Break down tasks into milestones, follow structured planning, and ensure consistent weekly progress based on my 4+ years of industry experience |
| Experience handling workload | Experienced in managing full-time development workloads, deadlines, and iterative delivery in professional environments |
Communication & Collaboration
| Item | Details |
|---|---|
| Communication style | Weekly progress reports on Joplin forum, early draft PR submissions for feedback |
| Responsiveness | Same-day responses for async communication |
| Blocker handling | Blockers will be raised within 24 hours via forum/Discord |
| Development approach | Iterative development with continuous mentor feedback and improvements |
Open Source Motivation & Impact
| Item | Details |
|---|---|
| Motivation for open source | Contributing to real-world software used by a global community while learning from experienced developers |
| Prior experience | Active contributor to Joplin with experience in editor, build system, and user safety improvements |
| Long-term commitment | Intend to continue contributing to Joplin beyond GSoC |
| Community impact | Aim to contribute meaningful features that improve reliability and user experience |
| Diversity & inclusion goal | Motivated to encourage more women to participate in open source, as I have observed lower representation of women in my professional experience |
| Personal initiative | Will actively motivate and guide other women to explore open source and build confidence in contributing |
Why I am a Strong Candidate
I believe I am a strong candidate for this project due to a combination of relevant experience, proven contributions, and long-term commitment to Joplin.
- Proven Contributions to Joplin: I have already contributed meaningful fixes to the Joplin codebase, including improvements in the Rich Text editor, build system stability, and user data safety. These contributions demonstrate my ability to understand complex issues, work within the existing architecture, and deliver practical solutions.
- Strong Technical Background: With 4+ years of experience in JavaScript/TypeScript, React, and Angular, I have developed a solid understanding of frontend architecture, event handling, and cross-platform challenges. This directly aligns with Joplin’s Electron and React-based architecture.
- Deep Understanding of the Codebase: Through my contributions, I have gained familiarity with Joplin’s internal structure, including editor integration (TinyMCE), build tools, and UI workflows. This reduces ramp-up time and allows me to focus on delivering results early in the program.
- Consistency and Dedication: I have already started contributing before GSoC and plan to continue contributing after the program. I am committed to maintaining the features I build and supporting future improvements.
- Clear Communication and Iteration: I actively engage in discussions, respond to feedback, and refine my solutions. I understand the importance of collaboration in open source and will work closely with mentors throughout the project.
- Full-Time Commitment During GSoC: I will treat GSoC as a full-time commitment for the duration of the program, ensuring steady progress and timely completion of milestones.
Overall, I bring a balance of technical capability, practical contribution experience, and long-term commitment, which makes me well-suited to successfully complete this project and continue contributing to Joplin beyond GSoC.