Similar projects and alternatives to DeepSpeech
Kaldi Speech Recognition Toolkit
kaldi-asr/kaldi is the official location of the Kaldi project.
NeMo: a toolkit for conversational AI
Mycroft Core, the Mycroft Artificial Intelligence platform.
On-device voice assistant platform powered by deep learning
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
A simple mobile app for rhasspy.
Dicio assistant app for Android
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
🧠 Leon is your open-source personal assistant.
🤖💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) (by mozilla)
Simple, hackable offline speech to text - using the VOSK-API.
State-of-the-art (ranked #1 Aug 2022) German Speech Recognition in 284 lines of C++. This is a 100% private 100% offline 100% free CLI tool.
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Secret Sauce AI: a coordinated community of tech minded AI enthusiasts
A prototype CLI in Python where a user can collect all of the recordings needed to produce a wakeword
Automated, end-to-end wakeword model maker using the Precise Wakeword Engine
A replayable arcade RTS where you control a vast(ish) army.
On-device wake word detection powered by deep learning
DeepSpeech reviews and mentions
Mozilla Launches Responsible AI Challenge
2 projects | news.ycombinator.com | 15 Mar 2023
Mozilla did release DeepSpeech and Firefox Translation (the latter of which they included in Firefox, to offer client-side webpage translations.)
They definitely have fewer resources than OpenAI, and they do not produce SOTA research (their publications have plummeted to 1/year anyway). So the only way for them to make progress is to seek government grants or make challenges like these.
This challenge is unlikely to be profitable for the winning team: the expected value of the winnings is likely around $1K once you account for the probability that another team ranks higher, but ML research projects are often more expensive (recently, Alpaca spent upwards of $600 on computation alone; and of course pretraining large models is much more expensive). So the main gain will be publicity.
2 projects | reddit.com/r/196 | 25 Sep 2022
Unfortunately, only Chrome supports the technology required to provide this feature (for now). Firefox is working to include it in the browser, but it is a complex feature that requires a lot of development. Mozilla (the company that developed Firefox) actually has a tool called DeepSpeech for speech-to-text dictation without using the Internet. I don't know if it will help you, but I've done what I could :'(
speech-to-text on Linux?
3 projects | reddit.com/r/linux | 23 Aug 2022
Show HN: State-of-the-Art German Speech Recognition in 284 lines of C++
5 projects | news.ycombinator.com | 10 Aug 2022
I wrote "284 lines of C++" to indicate that this is compact enough for people to actually read and understand the source code. Also, compiling my implementation is super easy and straightforward ... something which can't be said for Kaldi, Vosk, or DeepSpeech.
If you try to read the CTC beam search decoder from Mozilla's DeepSpeech, that alone is about 2000 LOC in multiple files.
If you try to read the pyctcdecode source that is used by HuggingFace, that's 1000+ LOC of Python.
But this implementation covers the entire client side, i.e. the whole "native_client" folder hierarchy in DeepSpeech, narrowed down to a mere 284 lines.
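For readers unfamiliar with the CTC beam search decoder mentioned above, here is a minimal sketch of the general technique in Python. It is not the code from DeepSpeech, pyctcdecode, or the 284-line project: it omits language-model scoring, works on plain probabilities rather than log-probabilities, and the function names (`ctc_greedy_decode`, `ctc_prefix_beam_search`) and the toy inputs are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def ctc_greedy_decode(probs, labels, blank=0):
    """Pick the best label per frame, collapse repeats, then drop blanks."""
    out, prev = [], None
    for frame in probs:
        idx = max(range(len(frame)), key=frame.__getitem__)
        if idx != prev and idx != blank:
            out.append(labels[idx])
        prev = idx
    return "".join(out)


def ctc_prefix_beam_search(probs, labels, beam_width=8, blank=0):
    """CTC prefix beam search without a language model (illustrative only).

    probs:  list of frames, each a list of per-label probabilities.
    labels: symbol for each column of a frame; labels[blank] is unused.
    """
    # Each prefix tracks two masses: paths ending in blank / in non-blank.
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        nxt = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            for idx, p in enumerate(frame):
                if idx == blank:
                    # A blank keeps the prefix unchanged.
                    nxt[prefix][0] += (p_b + p_nb) * p
                elif prefix and prefix[-1] == idx:
                    # Repeated symbol: it only extends the prefix if a
                    # blank separated the two; otherwise it collapses.
                    nxt[prefix + (idx,)][1] += p_b * p
                    nxt[prefix][1] += p_nb * p
                else:
                    nxt[prefix + (idx,)][1] += (p_b + p_nb) * p
        # Keep only the most probable prefixes.
        ranked = sorted(nxt.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {k: tuple(v) for k, v in ranked[:beam_width]}
    best = max(beams, key=lambda k: sum(beams[k]))
    return "".join(labels[i] for i in best)


if __name__ == "__main__":
    labels = ["-", "a", "b"]            # column 0 is the CTC blank
    frames = [[0.6, 0.4, 0.0], [0.6, 0.4, 0.0]]
    print(repr(ctc_greedy_decode(frames, labels)))
    print(repr(ctc_prefix_beam_search(frames, labels)))
```

On this toy input the two decoders disagree: the blank wins every frame individually, but the paths that collapse to "a" carry more total probability than the all-blank path, which is why decoders sum over paths per prefix instead of following the single best frame-by-frame choice. Production decoders add log-space arithmetic, pruning, and language-model scoring, which accounts for much of the extra code size.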
Ask HN: Any technical reasons Google Docs can't do voice typing in Firefox?
3 projects | news.ycombinator.com | 7 Aug 2022
IIRC every browser that supports the Web Speech API does so via cloud services. Mozilla, being the only major browser maker without its own cloud services and having slightly fewer phone-home features, didn't want to do that. Mozilla has been doing quite a bit of work in the area though (for example https://github.com/mozilla/DeepSpeech), hopefully to enable these features locally in the future.
Really cool text to speech system. (including Docker setup)
3 projects | reddit.com/r/selfhosted | 1 Jul 2022
See also deepspeech
What's the biggest missing piece of the puzzle in the self-hosted universe?
24 projects | reddit.com/r/selfhosted | 26 Mar 2022
Because there's surely enough software available, right (i.e. susi.ai, Mycroft, Kalliope, DeepSpeech, leon, Jasper, Vosk or Genie)?
Top Transcription APIs and Open Source Libraries in 2022
2 projects | dev.to | 7 Mar 2022
Built using the end-to-end model architecture pioneered by Baidu, DeepSpeech is a great open-source speech transcription option.
Make your own custom wakeword and other FOSS voice assistant solutions
16 projects | reddit.com/r/selfhosted | 24 Feb 2022
Ask HN: Private Alternatives to Alexa?
13 projects | news.ycombinator.com | 14 Dec 2021
This is more of a "DIY" approach but all the tools are there for FOSS and OSHWA solutions.
Mozilla has DeepSpeech and, while not as advanced as the stuff from Google or Amazon, my experimentation left me feeling pretty hopeful that it could reliably recognize at least keywords.
The Raspberry Pi is quite capable, though you'll probably need a dedicated microphone to reliably catch voice data. I know of ReSpeaker, but maybe some off-the-shelf USB conference microphones would work as well.
mozilla/DeepSpeech is an open source project licensed under the Mozilla Public License 2.0, which is an OSI-approved license.