DeepSpeech 60x Smaller, 9x faster, and 2x accuracy

speech-to-text-benchmark

5 586 3.8 Python

speech to text benchmark framework

The Mozilla DeepSpeech tests on LibriSpeech listed in your link were already out of date back in 2020[1], and Coqui.ai (the continuation of Mozilla DeepSpeech) isn't benchmarked at all.
https://github.com/Picovoice/speech-to-text-benchmark/issues...
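
For context on what such a benchmark measures: speech-to-text accuracy on corpora like LibriSpeech is usually reported as word error rate (WER), the word-level edit distance between the reference transcript and the engine's output, divided by the number of reference words. The sketch below is a minimal, self-contained illustration of that metric. The function name `word_error_rate` and the simple lowercase/whitespace normalization are assumptions made for this example; this is not the code used by speech-to-text-benchmark, and real benchmarks typically apply more careful text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Illustrative only: normalization here is just lowercasing and
    whitespace splitting, which is an assumption of this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words gives a WER of 0.25.
    print(word_error_rate("turn the lights off", "turn the light off"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words, which is worth keeping in mind when comparing engines on short utterances.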

leopard

15 406 8.6 Python

On-device speech-to-text engine powered by deep learning
STT-examples

5 111 0.0 Python

🐸STT integration examples

I will add https://github.com/coqui-ai/STT, which is a continuation of DeepSpeech. Also, I've been messing around with https://github.com/ideasman42/nerd-dictation, which works on a VOSK backend - accuracy is decent, especially with the bigger model.

nerd-dictation

28 1,158 3.6 Python

Simple, hackable offline speech to text - using the VOSK-API.

vosk-api

59 7,057 6.6 Jupyter Notebook

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

Making a Podcast Transcription Server with Express.js (source code in comments)
2 projects | /r/javascript | 19 May 2022
VOSK Offline Speech Recognition API
1 project | news.ycombinator.com | 13 Apr 2024
Show HN: AI Dub Tool I Made to Watch Foreign Language Videos with My 7-Year-Old
1 project | news.ycombinator.com | 28 Feb 2024
Weird A.I. Yankovic, a cursed deep dive into the world of voice cloning
4 projects | news.ycombinator.com | 2 Oct 2023
Apollo dev posts backend code to Git to disprove Reddit’s claims of scrapping and inefficiency
4 projects | /r/webdev | 9 Jun 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
speech-to-text voice-recognition Deep Learning speech-recognition Asr
Post date: 9 Mar 2022
