| | WhisperLive | PaddleSpeech |
|---|---|---|
| Mentions | 4 | 6 |
| Stars | 1,253 | 10,186 |
| Growth | 17.0% | 2.2% |
| Activity | 9.4 | 6.8 |
| Last commit | 8 days ago | 27 days ago |
| Language | Python | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
WhisperLive
- Show HN: WhisperFusion - Ultra-low latency conversations with an AI chatbot
Everything runs locally, we use:
- WhisperLive for the transcription - https://github.com/collabora/WhisperLive
- WhisperSpeech - An Open Source text-to-speech system built by inverting Whisper
Check out WhisperLive: https://github.com/collabora/WhisperLive
If you're grappling with the slow march from cool tech demos to real-world language model apps, you might wanna check out WhisperLive. It's this rad open-source project that's all about leveraging Whisper models for slick live transcription. Think real-time, on-the-fly translated captions for those global meetups. It's a neat example of practical, user-focused tech in action. Dive into the details on their GitHub page.
- Whisper: Nvidia RTX 4090 vs. M1 Pro with MLX
https://github.com/collabora/WhisperLive
There is another one that uses huggingface's implementation, but I haven't tried it since my spec doesn't support flash-att2.
- Triple Threat: The Power of Transcription, Summary, and Translation
Curious to see how this works? Check out our demo page - https://col.la/transcription to generate your own transcription, summary, and translation, or use our browser extension - https://github.com/collabora/WhisperLive to get live transcriptions.
PaddleSpeech
- Open Source Libraries
PaddlePaddle/PaddleSpeech
- I made Lisa-nee TTS (Imai Lisa)
- project
- is there addon that recognize speech (from video) into text?
I couldn't find any add-ons that did what you needed. I'm sorry. Maybe you could try using PaddleSpeech to see if it works for you, but it is not a Firefox add-on, it's a CLI tool.
- Mozilla Common Voice Adds 16 New Languages and 4,600 New Hours of Speech
Ah, damn. Didn't realise.
It also looks like Baidu are now developing their Deep Speech as open source? https://github.com/PaddlePaddle/DeepSpeech
- Server-Side Audio Transcription Software
What are some alternatives?
cog-whisper-diarization - Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote
TTS - 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
whisper-writer - 💬📝 A small dictation app using OpenAI's Whisper speech recognition model.
DeepSpeech - DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
obs-zoom-and-follow - Dynamic zoom and mouse tracking script for OBS Studio
NeMo - A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
gpt_chatbot - This chatbot lets you use your microphone to communicate with GPT-4. It uses the OpenAI text to speech to respond with a voice. It uses Pinecone to store long term information and retrieves it to create context. API keys for OpenAI and Pinecone required. Tested on Windows
TensorVox - Desktop application for neural speech synthesis written in C++
whisper_streaming - Whisper realtime streaming for long speech-to-text transcription and translation
common-voice-android - Repository of "CV Project" app. It's an unofficial app for Mozilla Common Voice, which permits you to contribute to this project via your device.
gpt-voice-conversation-chatbot - Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.
TTS - :robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)