I've just released the first version of my Paragraph Extractor plugin. If you're like me, you have a lot of varied notes in Joplin covering a number of topics. This plugin lets you search across any selected notes for a particular topic word or hashtag. Every paragraph that contains it is copied from those notes into a single new note. The original notes are not modified in any way.
For example, let's say I'm doing research on planetary atmospheres. I can search for a specific word like 'neptune' or 'mesosphere', and any paragraph in any note I've selected that contains the word will be added to a single new note. As another example, I have a lot of work notes spanning statuses and projects. I could use a project name as the keyword and create a single new note with all the paragraphs that mention that project from status notes, goals notes, task notes, etc.
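The extraction described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `Note` shape and the `splitParagraphs` and `extractParagraphs` names are hypothetical, and it assumes paragraphs are separated by blank lines and matched case-insensitively.

```typescript
// Hypothetical note shape for illustration; not Joplin's real data model.
interface Note {
  title: string;
  body: string;
}

// Split a note body into paragraphs, assuming blank lines separate them.
function splitParagraphs(body: string): string[] {
  return body
    .split(/\r?\n\s*\r?\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
}

// Collect every paragraph that contains the keyword (case-insensitive).
// The source notes are only read, never modified.
function extractParagraphs(notes: Note[], keyword: string): string[] {
  const needle = keyword.toLowerCase();
  const matches: string[] = [];
  for (const note of notes) {
    for (const paragraph of splitParagraphs(note.body)) {
      if (paragraph.toLowerCase().indexOf(needle) !== -1) {
        matches.push(paragraph);
      }
    }
  }
  return matches;
}

const sampleNotes: Note[] = [
  { title: "Gas giants", body: "Neptune has strong winds.\n\nJupiter is larger." },
  { title: "Layers", body: "The mesosphere is cold.\n\nNeptune's mesosphere is hazy." },
];

// Body of the single new note: matching paragraphs joined by blank lines.
const newNoteBody = extractParagraphs(sampleNotes, "neptune").join("\n\n");
console.log(newNoteBody);
```

A hashtag such as `#neptune` would work the same way here, since it is just another substring to look for.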
Option to toggle adding keyword as tag to new note (unless it already exists)
Option to show dialog to enter keyword or hashtag and modify custom note title on right-click or keyboard shortcut by default (I didn't expect setting keyword or hashtag in plugin settings)
Option to use content blocks like the Note Overview plugin does to define, organize, and auto-update new note content
Search and extract by phrase - this does work now but I haven't tested it extensively
Search and extract from a specific notebook - great idea, I'll look into that
Option to toggle adding keyword as tag to new note (unless it already exists) - there is no toggle for that yet, so the tag is added in all cases, but I'll add one in the next version (it will pair well with extract by phrase, since phrases shouldn't be tags in most cases)
Option to show dialog to enter keyword or hashtag and modify custom note title on right-click or keyboard shortcut by default (I didn't expect setting keyword or hashtag in plugin settings) - that is something already in the backlog for the next version
Option to use content blocks like the Note Overview plugin does to define, organize, and auto-update new note content - I'm not familiar with that - I'll take a look
PS - I want to thank @JackGruber for the inspiration!
I think that Joplin deserves more paragraph-level information extraction, and this is a step in the right direction.
In fact, this is similar in some ways to something that I started working on recently. Let's see if it's still worth developing a slightly different variation on the same theme.
Paragraph Extractor has been updated to version 1.1.1
New additions:
A dialog box has been added for setting the keyword and tag to use for the paragraph extraction search. The keyword and tag are then saved as default values.
You can now select a notebook and use a toggle to extract paragraphs from all the notes within that notebook (right-clicking either a notebook or a note within it lets you select all of its notes).
Paragraph Extractor has been updated to version 1.1.2
New additions:
Note paragraph blocks can now be extracted to a note with the Joplin tag title that matched either the hashtag or the keyword - similar to Logseq linked block references in tag notes
The extraction dialog box was reworked to be clearer
My Paragraph Extractor plugin has been updated to v-1.1.5. There are quite a few deep features with this plugin, so I've created a quick 15 min tutorial video on how to use it:
Here are the changes since v-1.1.2
Added option to have backlinks to the parent note embedded at the end of each extracted paragraph
Added extraction of full page if the hashtag and keyword are at the end of the note
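As a rough idea of what an embedded backlink can look like, here is a minimal sketch. The `ExtractedParagraph` shape and the `withBacklink` name are hypothetical, and the exact text the plugin appends may differ. The sketch relies only on Joplin's internal link syntax, `[title](:/noteId)`.

```typescript
// Hypothetical shape for an extracted paragraph and its source note.
interface ExtractedParagraph {
  text: string;
  sourceTitle: string;
  sourceId: string; // id of the note the paragraph was extracted from
}

// Append a Markdown link back to the source note at the end of the paragraph.
// Joplin treats links of the form [title](:/noteId) as links to other notes.
function withBacklink(p: ExtractedParagraph): string {
  // Escape square brackets so a title cannot break the link syntax.
  const safeTitle = p.sourceTitle.replace(/([\[\]])/g, "\\$1");
  return p.text + " ([" + safeTitle + "](:/" + p.sourceId + "))";
}

console.log(
  withBacklink({
    text: "Neptune has strong winds.",
    sourceTitle: "Gas giants",
    sourceId: "0123456789abcdef0123456789abcdef",
  })
);
```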
I'm glad you asked that question! I was just working through how that might operate! Tell me more about what you'd like to see!
Here is my current thinking - I'll add some metadata to the extracted note about the sources of extraction so that if you select any notes with that metadata, they will refresh with the new or updated text. This could get a bit complex especially if the original source note changes significantly. (I'll look at automating that later with some type of refresh period).
Do you need the reverse - to update the original note if the extracted note changes?
Happy to see that we both have the same thoughts.
Basically, I want to avoid writing the same text twice.
That would be what? The paragraph number in the note, a character count, etc.?
Wouldn't it be safer to mark the text and label it? Allow giving it a specific name or use a GUID, e.g.
[extractor] ...text ... [/extractor=my first reused paragraph]
Could be convenient, but I'm not sure, as you don't know where else the text is used. It may not fit anymore in other places.
If you combine it with backlinks, you have a chance to know where the text is used.
Thanks for responding. The text would still be extracted / duplicated but could be refreshed if the sources change. I'm not sure this helps your glossary needs since my plugin extracts paragraphs and not selections of text. There wouldn't be any linking other than back to the originating notes.
I don't think I would wrap extracted text with identifiers - I would use markdown/html comments to hold metadata on the sources/paragraphs and settings, in the form <!-- extraction source data here --> at the end of the extracted note. I already have a format I'm testing (e.g., paragraphs have GUIDs, etc.). I'm not sure what is available for plugin-to-plugin communication for sharing data. It would be great if Joplin had paragraph / block IDs built in.
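To make the comment-as-metadata idea concrete, here is a minimal sketch of writing and reading such a comment. It is not the format being tested in the plugin: the `extraction-source:` marker, the `ExtractionSource` fields, and the JSON payload are all assumptions made for illustration.

```typescript
// Hypothetical metadata describing where each extracted paragraph came from.
interface ExtractionSource {
  paragraphId: string; // e.g. a GUID assigned at extraction time
  sourceNoteId: string;
}

// Hypothetical marker used to find the metadata comment again.
const MARKER = "extraction-source:";

// Serialize the sources into one HTML comment appended to the note body.
function appendMetadata(noteBody: string, sources: ExtractionSource[]): string {
  // Escape ">" so the payload can never contain "-->" and end the comment early.
  const json = JSON.stringify(sources).replace(/>/g, "\\u003e");
  return noteBody + "\n\n<!-- " + MARKER + " " + json + " -->";
}

// Read the metadata back; returns an empty list if none is found or it is malformed.
function readMetadata(noteBody: string): ExtractionSource[] {
  const opener = "<!-- " + MARKER;
  const start = noteBody.lastIndexOf(opener);
  if (start === -1) return [];
  const end = noteBody.indexOf("-->", start);
  if (end === -1) return [];
  const json = noteBody.slice(start + opener.length, end).trim();
  try {
    return JSON.parse(json) as ExtractionSource[];
  } catch (e) {
    return [];
  }
}

const tagged = appendMetadata("Neptune has strong winds.", [
  { paragraphId: "p-1", sourceNoteId: "abc123" },
]);
console.log(tagged);
console.log(readMetadata(tagged));
```

Keeping the data in one trailing comment leaves the visible paragraphs untouched, which matches the preference above for not wrapping the extracted text itself.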
My Paragraph Extractor plugin has been updated to v-1.2.1. I've added the ability to refresh existing extracted paragraph note blocks from their source notes. This allows you to go back to any source notes for a particular extracted note, make changes and refresh without having to create a new note.
I've created a quick tutorial here (if you haven't watched the plugin overview video in the comments above - be sure to do that first):
Unfortunately my company's IT is going nuts with restrictions. I cannot use joplin in my office at the moment (where I used it most)
So I'm not much of a tester anymore
No problem - I appreciated your comments. Right now, there isn't a good way to keep track of paragraphs. The main issue is what the source paragraph is vs. the extracted one. What if both change? I wish there were paragraph block identifiers - but, Joplin isn't Logseq or RemNote - so those would have to be tracked with a lot of metadata.
BTW, the next version is going to add a diff function to show what was added/deleted in the original note and modified in the extracted note if the paragraphs don't match. That way, anything added or deleted in either note will be visible (if the option is chosen).
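A comparison of this kind could look something like the sketch below. It is only an illustration under my own assumptions, not the planned implementation: it compares two versions of a paragraph word by word using a longest-common-subsequence table, and the `diffWords` and `formatDiff` names and the `{+ +}` / `[- -]` markers are hypothetical.

```typescript
type DiffOp = { kind: "same" | "added" | "deleted"; text: string };

// Word-level diff: "deleted" words exist only in oldText, "added" only in newText.
function diffWords(oldText: string, newText: string): DiffOp[] {
  const a = oldText.split(/\s+/).filter((w) => w.length > 0);
  const b = newText.split(/\s+/).filter((w) => w.length > 0);

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  const lcs: number[][] = [];
  for (let i = 0; i <= a.length; i++) {
    const row: number[] = [];
    for (let j = 0; j <= b.length; j++) row.push(0);
    lcs.push(row);
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both word lists, following the table to emit the operations in order.
  const ops: DiffOp[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      ops.push({ kind: "same", text: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      ops.push({ kind: "deleted", text: a[i] });
      i++;
    } else {
      ops.push({ kind: "added", text: b[j] });
      j++;
    }
  }
  while (i < a.length) ops.push({ kind: "deleted", text: a[i++] });
  while (j < b.length) ops.push({ kind: "added", text: b[j++] });
  return ops;
}

// Render the diff inline with simple markers.
function formatDiff(ops: DiffOp[]): string {
  return ops
    .map((op) =>
      op.kind === "added"
        ? "{+" + op.text + "+}"
        : op.kind === "deleted"
        ? "[-" + op.text + "-]"
        : op.text
    )
    .join(" ");
}

console.log(
  formatDiff(diffWords("Neptune has strong winds", "Neptune has very strong winds"))
);
```

Working at word level keeps small edits readable inside a paragraph. Comparing whole lines or whole paragraphs would be cheaper but would flag an entire paragraph for a one-word change.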