The Mozilla DeepSpeech tests on LibriSpeech listed in your link were already out of date back in 2020[1], and Coqui.ai (the continuation of Mozilla DeepSpeech) isn't benchmarked at all.
[1] https://github.com/Picovoice/speech-to-text-benchmark/issues...
I will add https://github.com/coqui-ai/STT, which is a continuation of DeepSpeech. Also, I've been messing around with https://github.com/ideasman42/nerd-dictation, which works on a VOSK backend - accuracy is decent, especially with the bigger model.
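For anyone who wants to put a number on "decent": speech-to-text accuracy is usually compared with word error rate (WER), the word-level edit distance between a reference transcript and the engine's output, divided by the number of reference words. Below is a minimal sketch in plain Python. The function name and the sample sentences are made up for illustration, and this is not code from any of the projects linked above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: word-level edit distance divided by reference length."""
    # Naive normalization (lowercase + whitespace split); real benchmarks
    # typically also strip punctuation and normalize numbers.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one dropped word.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Lower is better, and the value can exceed 1.0 when the output contains many extra words. Scores are only comparable across engines if every transcript goes through the same text normalization.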