Sorry for the late reply and thanks for the links.
For me the desktop remains the work platform of choice (power and screen size), so I am not considering phone apps, aside from maybe note intake into Joplin.
So I have done limited development for the Linux desktop in the context of this thread:
Blurt (GitHub - QuantiusBenignus/blurt: Gnome shell extension for accurate speech to text input in Linux using whisper.cpp. Input text from speech into any window that has the keyboard focus.) is a simple GNOME Shell extension that evolved from the command-line utility NoteWhispers, which itself is built around the great whisper.cpp.
Whisper.cpp has become a standard tool in my Linux workflow, initially mostly for Joplin note taking, but now, thanks to this extension, in every application with an editable text field. I wanted to avoid simulating input events (frowned upon for good reasons), so one still has to use the middle mouse button to paste the transcribed text from the clipboard.
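For anyone curious, the flow described above (transcribe a recording with whisper.cpp, then hand the text to the selection that middle-click pastes from) can be sketched roughly like this in Python. This is not Blurt's actual code, just an illustration: the `whisper-cli` binary name, the file paths, the helper names, and the use of `xsel` for the primary selection are all assumptions made for the example.

```python
import subprocess


def whisper_command(wav_path, model_path, binary="whisper-cli"):
    """Build a whisper.cpp command line.

    The binary name is an assumption; older whisper.cpp builds call it `main`.
    -m selects the model file, -f the input WAV, -nt drops timestamps.
    """
    return [binary, "-m", model_path, "-f", wav_path, "-nt"]


def clean_transcript(raw):
    """Collapse newlines and repeated spaces into single spaces."""
    return " ".join(raw.split())


def transcribe_to_primary(wav_path, model_path):
    """Transcribe a WAV file and place the text in the X11 primary selection.

    Assumes `xsel` is installed; on Wayland a different tool would be needed.
    """
    result = subprocess.run(
        whisper_command(wav_path, model_path),
        capture_output=True, text=True, check=True,
    )
    text = clean_transcript(result.stdout)
    # The primary selection is what a middle-click paste reads from.
    subprocess.run(["xsel", "--primary", "--input"],
                   input=text, text=True, check=True)
    return text
```

Nothing here simulates key presses: the text only lands in the selection, and pasting stays a deliberate middle click.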
The base whisper model is used by default, with 30x-faster-than-realtime transcription (with CUDA GPU support), resulting in about 300 ms of transcription time for 10 s of speech on an average machine with a new(ish) CPU. If you use GNOME on Linux, you can give it a try on GitHub or at Blurt - GNOME Shell Extensions.
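As a quick sanity check on the numbers above, the transcription time is just the audio length divided by the speed-up factor. The function name below is made up for the example, and the 30x figure is the one quoted in this post, not a measurement on your hardware.

```python
def transcription_time(audio_seconds, speedup):
    """Estimated processing time for a clip at a given faster-than-realtime factor."""
    return audio_seconds / speedup


# 10 s of speech at 30x faster than realtime
estimate = transcription_time(10.0, 30.0)
print(f"{estimate * 1000:.0f} ms")  # prints "333 ms", i.e. roughly the 300 ms quoted
```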