Yeah, I was really impressed with the project when I encountered it last year when trying out a bunch of FLOSS Speech-To-Text options.
It was significantly better than the other FLOSS options I looked at--both in how easy it was to get going initially & in the quality of the speech-to-text results.
I tested it with a lightly modified version of this example script: https://github.com/alphacep/vosk-api/blob/master/python/exam...
What I found particularly interesting was that when you have the "partial" recognition output shown in real-time, you get to see how--at the end of a sentence--it may revise a word from earlier in the sentence in the final recognition output, based on (I guess) the additional context of the full sentence.
(I just did a quick test again, with the installs from my testing last year, using an internal laptop microphone, & the test script recognized a significant chunk of my speech--using a headset definitely improves things though. In the same environment, a test with `mic_vad_streaming` (from `DeepSpeech-examples-r0.9` with `deepspeech-0.9.0-models.pbmm`) failed to recognize any words at all.)
Vosk-api isn't an STT engine itself; it is built on the Kaldi speech recognition toolkit (https://github.com/kaldi-asr/kaldi) and nicely implements and packages an API for Kaldi chain/LF-MMI models.
Yes!
The project is called Larynx, and it is amazing: https://github.com/rhasspy/larynx/
I waxed lyrical about it recently in this thread about private alternatives to Alexa: https://news.ycombinator.com/item?id=29562526
I can only vouch for the quality/variety in English, but its documentation does note support for 50 voices across 9 languages, including all the first group of languages you mentioned, and also Russian. (I've "played" with all those languages to test them but can't really vouch for how a native speaker/listener might find it. :D )
It is miles ahead of any of the other Free/Open Source TTS solutions I've tried, including the ones you mentioned.
(It's still synthesized speech, but the output quality is so good and it's still extremely early days for the project.)
And there's a range of options in accent & gender--which are in general sorely lacking in other FLOSS TTS options. (In terms of licensing, some voices are licensed more freely than others but the majority are without significant restriction.)
I like Larynx so much that I've been working on an editor for it to assist in "auditioning" & recording speech in a narrative context, e.g. game/film pre-viz.
I just checked and apparently they are already aware: https://github.com/Chocobozzz/PeerTube/issues/3325#issuecomm... :)
(Tho I'll admit I have no idea what "bluffing" means in that context. :D )
Was just about to mention this repo to the OP but suspect I found it from your site in the first place: https://github.com/benob/recasepunc :D
Punctuation/capitalization will make a massive difference to practical use! Look forward to it.