-
silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Silero[0] seems to have decent performance (although you will have to do some minimal coding). I believe there are better ones if you're willing to tinker a bit more.
[0]: https://github.com/snakers4/silero-models
-
vscode-ltex
LTeX: Grammar/spell checker for VS Code using LanguageTool with support for LaTeX, Markdown, and others
-
The output is intended more for captioning, so it's lots of short phrases with timestamps and no punctuation, but it'll give you a quick taste of what Vosk can do.
[1] https://github.com/o-oconnell/mp4grep
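If you want something more readable than caption-style fragments, one option is to stitch neighbouring phrases back together based on the pauses between them. The following is only a rough sketch: the `(start, end, text)` tuple shape, the `merge_segments` and `format_timestamp` helpers, and the one-second `max_gap` default are all assumptions made for illustration, not the actual output format of mp4grep or Vosk.

```python
from typing import List, Tuple

# Assumed segment shape: (start_seconds, end_seconds, text).
# This is a hypothetical format, not the real mp4grep/Vosk output.
Segment = Tuple[float, float, str]


def merge_segments(segments: List[Segment], max_gap: float = 1.0) -> List[Segment]:
    """Join consecutive phrases when the pause between them is at most max_gap seconds."""
    merged: List[Segment] = []
    for start, end, text in segments:
        if merged and start - merged[-1][1] <= max_gap:
            # Short pause: extend the previous segment instead of starting a new one.
            prev_start, _, prev_text = merged[-1]
            merged[-1] = (prev_start, end, prev_text + " " + text)
        else:
            merged.append((start, end, text))
    return merged


def format_timestamp(seconds: float) -> str:
    """Render whole seconds as MM:SS (fractions are truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    sample = [
        (0.0, 1.2, "hello there"),
        (1.5, 2.8, "this is a test"),
        (6.0, 7.0, "new topic"),
    ]
    for start, end, text in merge_segments(sample):
        print(f"[{format_timestamp(start)}-{format_timestamp(end)}] {text}")
```

Punctuation would still be missing after merging; this only groups phrases into longer lines, so a larger `max_gap` gives fewer, longer lines.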
-
The Mozilla DeepSpeech spin-off Coqui has an STT that is locally installable:
https://coqui.ai/
Related posts
-
What's the best text-to-speech free non-cloud software?
-
Ask HN: Are there any good open source Text-to-Speech tools?
-
OpenAI deems its voice cloning tool too risky for general release
-
Base TTS (Amazon): The largest text-to-speech model to-date
-
WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper