DeepSpeech 60x Smaller, 9x faster, and 2x accuracy

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  1. speech-to-text-benchmark

    speech to text benchmark framework

    The Mozilla DeepSpeech tests on LibriSpeech listed in your link were out of date back in 2020[1], and Coqui.ai (the continuation of Mozilla DeepSpeech) isn't even benchmarked.

    [1] https://github.com/Picovoice/speech-to-text-benchmark/issues...

  3. leopard

    On-device speech-to-text engine powered by deep learning

  4. STT-examples

    🐸STT integration examples

    I will add https://github.com/coqui-ai/STT, which is a continuation of DeepSpeech. Also, I've been messing around with https://github.com/ideasman42/nerd-dictation, which works on a VOSK backend - accuracy is decent, especially with the bigger model.

  5. nerd-dictation

    Simple, hackable offline speech to text - using the VOSK-API.

  6. vosk-api

    Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

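
The comments above compare these engines by accuracy. One common way to put a number on speech-to-text accuracy is word error rate (WER): the word-level edit distance between a reference transcript and the engine's output, divided by the number of reference words. The thread does not say which metric each project reports, so treating WER as the measure here is an assumption, and the function name below is hypothetical. A minimal sketch:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up transcripts for illustration only, not benchmark data.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Lower is better, and the value can exceed 1.0 when the output contains many extra words. Real benchmarks usually also normalize punctuation and numerals before scoring, which this sketch does not attempt.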

