Listenr Alternatives
Similar projects and alternatives to listenr
-
mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
-
transcribe-critic
Multi-source transcript merging inspired by textual criticism: LLM adjudicates multiple Whisper, YouTube captions & external transcripts for higher quality. Includes speaker diarization and summarization.
-
splaa
SPLAA is an AI assistant framework that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversational and interactive experience. It uses LLMs available through Ollama and has capabilities for extending functionalities through a modular tool system.
-
june
Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit (by mezbaul-h)
-
SenseVoice
Multilingual speech understanding: ASR + emotion recognition + audio event detection. 50+ languages, 15x faster than Whisper, non-autoregressive.
listenr discussion
listenr reviews and mentions
-
Gemma 4 12B: A unified, encoder-free multimodal model
I use small models like Gemma to improve transcriptions from ASR models, among other micro-tasks. I actually built out a Whisper fine-tuning pipeline that uses only local (smaller) models, meaning no cloud or big-tech company is able to train on or sell my (private) data.
Repo is https://github.com/Rebreda/listenr - mainly geared toward Whisper fine-tuning, AMD hardware and local inference
-
My Journey to a reliable and enjoyable locally hosted voice assistant
I've been working on the flip side of this with ASR models, but the problem space is the same: conversational, real-world data is needed. Whisper often mistook words I actually said and hallucinated all the time when I used technical jargon. The solution is to fine-tune Whisper with my own data. Hardest part imo was getting the actual data, which in turn got me to build listenr (https://github.com/rebreda/listenr). It's an always-on VAD-based audio dataset builder. Could be used for building conversational/real-world voice datasets for TTS models too?
After getting it working, I got the motivation to actually build out the full fine-tuning pipeline. I wrote a little post about it all: https://quickthoughts.ca/posts/listenr-asr-training-data-pro...
Stats
Rebreda/listenr is an open source project licensed under the Mozilla Public License 2.0, which is an OSI-approved license.
The primary programming language of listenr is Python.