vosk-server
audapolis
Metric | vosk-server | audapolis |
---|---|---|
Mentions | 4 | 8 |
Stars | 843 | 638 |
Growth | 1.8% | 2.0% |
Activity | 5.5 | 6.7 |
Last commit | 30 days ago | 7 months ago |
Language | Python | TypeScript |
License | Apache License 2.0 | GNU Affero General Public License v3.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
vosk-server
- Self-hosted audio transcription?
- Open Source ASR with user-specific custom vocabularies?
  Through my research, the most promising real-time transcription options appear to be Vosk or Kaldi Gstreamer. I’ve set them both up & they appear to work well for general transcription, but I’m not sure how to handle the user-specific custom vocabularies.
- Voice2json: Offline speech and intent recognition on Linux
- Connecting vosk python model with react
audapolis
- Audapolis: An editor for spoken-word audio with automatic transcription
- MacWhisper: Transcribe audio files on your Mac
  Here's a multi-platform open source app that does the same thing but uses vosk instead of whisper.
  https://github.com/bugbakery/audapolis
- Will Kden ever have Ai
- Self-hosted audio transcription?
  Audapolis is also an interesting option: https://github.com/audapolis/audapolis
- [Looking for] Ai audio denoise & transcript
- Audapolis – Edit audio and video by selecting text
What are some alternatives?
vosk-api - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
whisper-diarization - Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
common-voice - Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
LLMStack - No-code platform to build LLM Agents, workflows and applications with your data
kaldi-gstreamer-server - Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
buzz - Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
TTS - 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
whisperer - On-demand prompt-aided voice-to-text with OpenAI's Whisper
julius - Open-Source Large Vocabulary Continuous Speech Recognition Engine
SpeechRecognition - Speech recognition module for Python, supporting several engines and APIs, online and offline.
vosk-android-demo - Offline speech recognition for Android with Vosk library.
whisper - Robust Speech Recognition via Large-Scale Weak Supervision