With this solution, you can dictate text directly from your iOS/watchOS/Android device and have it automatically added to your Joplin notebook.
It is ideal for users who need to quickly jot down notes while on the go or for those who prefer dictation over typing.
The transcription will be available in your favorite notebook in the desktop app once you're back at your PC.
Disclaimer
Don't expect to find an installation guide at the end of this overview.
Whether a full guide gets written depends on interest: likes on this post help me gauge the potential impact and decide whether to move forward.
Features
Hands-free dictation: Seamlessly dictate text directly to your Joplin notebooks.
Works with Windows + iOS/Android: Use your iPhone/Apple Watch/Android device as a microphone to add notes wherever you are.
Fast and efficient: Say goodbye to typing - just speak and see your text added to your Joplin notebook when you're back at your desktop.
Dashboard: Get an overview of your voice inbox in your favorite note.
Prerequisites
Joplin Desktop (Windows)
iOS/watchOS/Android device
OpenAI API token for speech recognition (very cheap, quick setup, guide included)
Components
OpenAI account with at least $5 of credit - used for speech recognition. In my experience it costs ~1 cent per dictation on average.
iPhone/Apple Watch/Android - we'll create a shortcut here that records your speech.
AirTable - a web service that stores our transcriptions, with a generous free tier (no payment card required), both in-transit and at-rest encryption, and up to 2 weeks of snapshots for data recovery.
Joplin Desktop app (Windows) - we'll configure it to listen to incoming requests using its Web Clipper API.
Note Overview plugin for Joplin (optional) - gives us a kind of dashboard where your voice inbox is displayed.
Python Sync Script - copied from the repo to somewhere on your PC. It reads all available transcriptions from AirTable and pushes them to a specific Joplin notebook (the Joplin API call is sketched just below).
Task Scheduler (Windows) - a native Windows tool that runs the synchronization script automatically when you log in to the PC, or on a configured schedule.
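To make the Joplin side concrete, here is a minimal sketch (not the author's actual script) of how a sync script can create a note through Joplin's Web Clipper API (the Data API). Port 41184 is the Clipper service's default; the token, notebook ID, and function name are placeholders.

```python
import requests

JOPLIN_API = "http://127.0.0.1:41184"         # default Web Clipper service port
JOPLIN_TOKEN = "YOUR_JOPLIN_CLIPPER_TOKEN"    # Tools -> Options -> Web Clipper
NOTEBOOK_ID = "YOUR_VOICE_INBOX_NOTEBOOK_ID"  # ID of the target notebook

def add_note_to_joplin(title: str, body: str) -> None:
    """Create a note in the configured notebook via Joplin's Data API."""
    resp = requests.post(
        f"{JOPLIN_API}/notes",
        params={"token": JOPLIN_TOKEN},
        json={"title": title, "body": body, "parent_id": NOTEBOOK_ID},
        timeout=10,
    )
    resp.raise_for_status()
```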
How It Works
On your phone, you tap a button or trigger the shortcut (automation) by voice
Your phone records the audio
Your phone asks the OpenAI Whisper model to transcribe the recording
Your phone sends the transcription to the AirTable database
When you log in to your PC (or on a schedule), the Python Sync Script asks AirTable for available notes
The script adds all collected notes to the configured Joplin Desktop notebook through the Web Clipper API
The script marks collected notes as "Processed" in the AirTable database so they are not processed the next time the script is triggered
The script keeps the total number of transcriptions in AirTable at around 900 records to stay within the free tier limits (a rough sketch of these AirTable calls follows this list)
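A rough sketch of the AirTable half of the sync (the table name, field names, and the "Status" flag are illustrative, not the author's actual schema):

```python
import requests

AIRTABLE_BASE = "YOUR_BASE_ID"
AIRTABLE_TABLE = "Transcriptions"   # illustrative table name
AIRTABLE_URL = f"https://api.airtable.com/v0/{AIRTABLE_BASE}/{AIRTABLE_TABLE}"
HEADERS = {"Authorization": "Bearer YOUR_AIRTABLE_TOKEN"}

def fetch_unprocessed() -> list[dict]:
    """Return records that have not been marked as Processed yet."""
    resp = requests.get(
        AIRTABLE_URL,
        headers=HEADERS,
        params={"filterByFormula": "NOT({Status} = 'Processed')"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("records", [])

def mark_processed(record_ids: list[str]) -> None:
    """Flag records as Processed so the next run skips them."""
    for i in range(0, len(record_ids), 10):   # AirTable updates at most 10 records per request
        batch = record_ids[i:i + 10]
        resp = requests.patch(
            AIRTABLE_URL,
            headers=HEADERS,
            json={"records": [{"id": rid, "fields": {"Status": "Processed"}} for rid in batch]},
            timeout=10,
        )
        resp.raise_for_status()
```

Each record's fields dictionary would carry the transcription text, which the script then passes to the Joplin call sketched in the Components section.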
Dashboard
Using the "Note Overview" plugin you can filter your voice notes inbox and display it in a table.
Example configuration:
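If you haven't used the plugin before, the dashboard is just a note containing a Note Overview block along these lines (the notebook name and fields are placeholders; the exact options are described in the plugin's documentation):

```
<!-- note-overview-plugin
search: notebook:"Voice Inbox"
fields: updated_time, title
sort: updated_time DESC
-->
```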
If you lock your PC each time you step away, it makes sense to run the synchronization on system login.
If you don't use a lock screen, it makes sense to run the synchronization on a schedule.
In either scenario, it helps to understand the estimated Sync/day to Notes/day ratio.
Because of the AirTable free tier's monthly limits, here's a rough estimate of how many voice notes per day you can make for a given number of daily logins (synchronizations):
| Sync/day | Notes/day |
|----------|-----------|
| 3 | 26 |
| 4 | 24 |
| 5 | 22 |
| 6 | 20 |
| 7 | 18 |
| 8 | 16 |
| 9 | 14 |
| 10 | 12 |
| 11 | 10 |
| 12 | 8 |
| 13 | 6 |
| 14 | 4 |
| 15 | 2 |
These limits are rough estimates assuming a heavy user who dictates every day of the given month. Each day off slightly increases the number of voice notes you can make before the end of the month.
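For the curious: the table follows a simple linear pattern, roughly Notes/day ≈ 32 − 2 × Sync/day. One hypothetical back-of-the-envelope model that reproduces it assumes the AirTable free tier allows about 1,000 API calls per month, that each dictation costs one call (the phone inserts one record) and each sync costs two calls (one list request plus one batch update), over a 31-day month. The real limits and call counts may differ.

```python
# Hypothetical budget model - reproduces the Sync/day vs Notes/day table above.
MONTHLY_CALLS = 1000   # assumed free-tier API-call allowance per month
DAYS = 31
CALLS_PER_NOTE = 1     # one AirTable insert per dictation (assumption)
CALLS_PER_SYNC = 2     # one list + one batch update per sync (assumption)

for syncs_per_day in range(3, 16):
    daily_budget = MONTHLY_CALLS / DAYS - CALLS_PER_SYNC * syncs_per_day
    notes_per_day = int(daily_budget // CALLS_PER_NOTE)
    print(f"{syncs_per_day} syncs/day -> ~{notes_per_day} notes/day")
```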
How To Set Up?
As mentioned in the disclaimer, preparing a setup guide that's easy for the average Joplin user to follow would take the author some time.
If you like the idea, please like the post so the author can gauge the potential impact, and bookmark this thread to stay updated in case the author decides to add steps to the How To Set Up? section.
Since you have planned this solution - which you will likely benefit from using yourself - why not start analyzing its potential impact and drafting a setup guide now? This will ensure it works well for you and expedite others' ability to try it.
Well, I already have a working setup that uses self-hosted, dockerized JSON storage instead of AirTable. It has some limitations and possible data loss, which I'm fine with, but I'm not sure others would be.
I was writing up the steps to replicate it when I realized things were getting too complicated for the average Joplin user.
So I ended up testing the AirTable API and realized it wouldn't be too much hassle to replace the self-hosted storage with AirTable. On the other hand, I would need to spend at least two days testing it and writing a guide for it.
So before going all in, I decided to post the design here to see a few things:
how many people are interested in it;
get an actual review of the design to identify its flaws;
perhaps there are existing solutions that I couldn't find on the forum;
perhaps somebody knows how OpenAI can be easily replaced with a free solution;
perhaps Joplin plans to release a native feature soon.
You don't have to give a full-fledged tutorial, but just some more details would help:
1.) What does your iOS shortcut look like? I only see Create Recording, but I don't see an option to save that to a file or how to send it to OpenAI.
2.) Would you mind sharing the Python script (possibly along with the iOS automation)?
Just some thoughts on how I might modify your idea:
1.) I think OpenAI is triggered by your iOS shortcut, right? I might actually just do the recording and upload it somewhere. The post-processing would be done by a script. That has the advantage that I can still record when I don't have internet (e.g., when I'm hiking and get an idea).
2.) Instead of the Web Clipper API on Windows, I'm considering using joplin-cli in a cron job on my Linux server. The advantage is that it's always on and, I think, simpler.
I think your point about hiking is valid. There should be a way to delay the OpenAI transcription by storing the file locally and sending it for transcription later, once a connection is back.
Also, I did not realize that Joplin Server doesn't have the Web Clipper API. Instead of Web Clipper API calls in the script, we could extend it to use the Joplin CLI depending on the Joplin deployment type.
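For the deferred-transcription idea, the later step could look roughly like this. This is only a sketch: the directory layout and file extension are made up, and it uses OpenAI's standard audio transcription endpoint with the whisper-1 model.

```python
import pathlib
import requests

OPENAI_KEY = "YOUR_OPENAI_API_KEY"
PENDING_DIR = pathlib.Path("recordings/pending")      # recordings saved while offline
DONE_DIR = pathlib.Path("recordings/transcribed")

def transcribe_pending() -> None:
    """Send locally stored recordings to Whisper once connectivity is back."""
    DONE_DIR.mkdir(parents=True, exist_ok=True)
    for audio in sorted(PENDING_DIR.glob("*.m4a")):
        with audio.open("rb") as f:
            resp = requests.post(
                "https://api.openai.com/v1/audio/transcriptions",
                headers={"Authorization": f"Bearer {OPENAI_KEY}"},
                data={"model": "whisper-1"},
                files={"file": (audio.name, f)},
                timeout=120,
            )
        resp.raise_for_status()
        text = resp.json()["text"]
        print(audio.name, "->", text[:80])             # hand off to AirTable upload, etc.
        audio.rename(DONE_DIR / audio.name)
```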
P.S. I'm not ready to share the Python script yet. I can give you an idea of what the iOS shortcut looks like plus which AirTable API endpoints are triggered in the Python script.
iOS Shortcut (note that the FIFO queue means the AirTable JSON storage in the discussed design):
While I generally prefer typing to speech for note-taking, I've often wanted speech input for short, quick things (e.g., "Parked in space A-23"). I don't think, though, I'd use one that has significant latency (e.g., needs connectivity), as such notes also tend to have a short useful lifetime. Understanding that it's probably a considerably larger task and would require more installed resources, I'd much rather have something that runs locally on my device.
I agree. A native mobile shortcut exposed by the Joplin app itself would be great, not only for long-term note-taking (with the intent of later dispatching notes to designated notebooks) but also for short-term scenarios like the one you described.
Unfortunately, I don't see any traction regarding mobile app shortcuts, so for now I'm suggesting at least a long-term solution where you sort your voice notes once you're at the PC.
I hadn't realized that such a use case really exists; I'll make it more obvious which use case this design targets.
Thank you for reviewing!