Top 23 stt Open-Source Projects
-
vosk-api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
-
silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
-
STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
-
react-transcript-editor
A React component to make correcting automated transcriptions of audio and video easier and faster. By BBC News Labs. - Work in progress
-
TTS-Voice-Wizard
Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)
-
dsnote
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
-
vosk-browser
A speech recognition library running in the browser thanks to a WebAssembly build of Vosk
-
LangHelper
Striving to create a great Application with full functions of learning languages by ChatGPT, TTS, STT and other awesome AI models, supports talking, speaking assessment, memorizing words with contexts, Listening test, so on.
-
Voice Overlay
🗣 An overlay that gets your user's voice permission and input as text in a customizable UI
-
home-assistant-assist-desktop
Use Home Assistant Assist on the desktop. Compatible with Windows, MacOS, and Linux
-
DiscordEarsBot
A speech-to-text framework and bot for Discord. Take control of your Discord server using speech and voice commands. Can also be useful for hearing impaired and deaf people. (by inevolin)
-
werpy
🐍📦 Rapidly calculate and analyze the Word Error Rate (WER) with this powerful yet lightweight Python package.
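For readers unfamiliar with the Word Error Rate metric mentioned in the werpy entry, here is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. This is plain Python and does not use or reproduce werpy's API; the function name `word_error_rate` is a hypothetical one chosen for illustration, and real tools typically also normalize case and punctuation first, which this sketch skips.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: not werpy's implementation or API.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, the result can exceed 1.0 when the hypothesis is much longer than the reference.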
Project mention: Weird A.I. Yankovic, a cursed deep dive into the world of voice cloning | news.ycombinator.com | 2023-10-02
I doubt it's currently actually "the best open source text to speech", but the answer I came up with when throwing a couple of hours at the problem some months ago was "Silero" [0, 1].
Following the "standalone" guide [2], it was pretty trivial to make the model render my sample text in about 100 English "voices" (many of which were similar to each other, and in varying quality). Sampling those, I got about 10 that were pretty "good". And maybe 6 that were the "best ones" (pretty natural, not annoying to listen to).
IIRC the license was free for noncommercial use only. I'm not sure exactly "how open source" they are, but it was simple to install the dependencies and write the basic Python to try it out; I had to write a for loop to try all the voices like I wanted. I ended up using something else for the project for other reasons, but this could still be a fairly good backup option for some use cases IMO.
[0] https://github.com/snakers4/silero-models#text-to-speech
Project mention: Rest in Peas: The Unrecognized Death of Speech Recognition (2010) | news.ycombinator.com | 2023-05-04
What has happened since then? I know Common Voice has come and gone https://en.wikipedia.org/wiki/Common_Voice https://github.com/coqui-ai/STT
And I've seen some neural approaches too
No idea where to look for comparisons though.
Maybe that? https://github.com/VRCWizard/TTS-Voice-Wizard
Project mention: Speech Note: offline Linux app for note taking, reading and translating | news.ycombinator.com | 2023-08-30
Everything we make is accessible via APIs and integrating our Assist via APIs is already possible. Here is an example of an app someone made that runs on Windows, Mac and Linux: https://github.com/timmo001/home-assistant-assist-desktop
I have been looking everywhere and having a lot of difficulty finding a solution, so sorry if I am coming to the wrong place. I am trying to create a Discord bot that can transcribe conversations live. I chose Vosk because it's an offline tool, but I am unsure how to implement it in a live setting. I've seen it done in Python and disc.js, but I dunno... So, to cover all bases, here is what I have so far.
Project mention: Show HN: Alts – 100% free, local, offline voice assistant and speech recognition | news.ycombinator.com | 2024-01-07
Project mention: Speech To Element - embed speech to text into your website with ease | /r/github | 2023-08-25
A GitHub star is always appreciated: https://github.com/OvidijusParsiunas/speech-to-element
stt related posts
-
Weird A.I. Yankovic, a cursed deep dive into the world of voice cloning
-
Creating a live transcript bot using Vosk Ai
-
Rest in Peas: The Unrecognized Death of Speech Recognition (2010)
-
Show HN: ChatGPT and 3D Talking Models
-
Numen - FOSS voice control for handsfree computing
-
Hey can anyone else add the text to speech
-
Messing around with a TTS extension
Index
What are some of the best open-source stt projects? This list will help you:
# | Project | Stars
---|---|---
1 | vosk-api | 7,057
2 | silero-models | 4,569
3 | STT | 2,144
4 | cheetah | 555
5 | react-transcript-editor | 535
6 | TTS-Voice-Wizard | 520
7 | leopard | 408
8 | dsnote | 332
9 | vosk-browser | 326
10 | LangHelper | 308
11 | whisper.unity | 307
12 | vakyansh-models | 267
13 | Voice Overlay | 243
14 | whisper-obsidian-plugin | 185
15 | simple-obs-stt | 99
16 | home-assistant-assist-desktop | 81
17 | DiscordEarsBot | 64
18 | vosk-unity-asr | 52
19 | alts | 37
20 | Spity-Sense | 19
21 | ovos-stt-plugin-vosk | 14
22 | werpy | 9
23 | speech-to-element | 8