Metric | WhisperLive | vocode-python
---|---|---
Mentions | 4 | 9
Stars | 1,253 | 2,330
Growth | 17.0% | 4.8%
Activity | 9.4 | 9.1
Last commit | 8 days ago | 9 days ago
Language | Python | Python
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
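The two derived numbers above can be illustrated with a small sketch. The site does not publish its exact formulas, so everything here is an assumption: growth is taken as the plain percentage change between two monthly star snapshots, and activity is taken as a percentile rank rescaled to 0-10. The function names and sample inputs are hypothetical and are not data from either project.

```python
from bisect import bisect_left


def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Percentage change in stars between two monthly snapshots."""
    if stars_last_month <= 0:
        raise ValueError("baseline star count must be positive")
    return (stars_now - stars_last_month) / stars_last_month * 100.0


def activity_score(raw: float, all_raw_scores: list[float]) -> float:
    """Rescale a raw score to 0-10 by percentile rank (assumed method).

    `raw` stands in for some commit-weighted measure; how recent commits
    are actually weighted is not specified by the source.
    """
    ranked = sorted(all_raw_scores)
    # Number of tracked projects scoring strictly below this one.
    below = bisect_left(ranked, raw)
    return round(10.0 * below / len(ranked), 1)


# Hypothetical inputs, chosen only to mirror the scale of the table.
growth = month_over_month_growth(1000, 1170)
score = activity_score(90, list(range(100)))
```

Under this assumed mapping, a project that outranks 90% of tracked projects scores 9.0, which matches the "top 10%" reading given in the legend.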
WhisperLive
- Show HN: WhisperFusion – Ultra-low latency conversations with an AI chatbot
Everything runs locally, we use:
- WhisperLive for the transcription - https://github.com/collabora/WhisperLive
- WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper
Check out WhisperLive: https://github.com/collabora/WhisperLive
If you're grappling with the slow march from cool tech demos to real-world language model apps, you might wanna check out WhisperLive. It's this rad open-source project that’s all about leveraging Whisper models for slick live transcription. Think real-time, on-the-fly translated captions for those global meetups. It's a neat example of practical, user-focused tech in action. Dive into the details on their GitHub page.
- Whisper: Nvidia RTX 4090 vs. M1 Pro with MLX
https://github.com/collabora/WhisperLive
There is another one that uses Hugging Face's implementation, but I haven't tried it since my hardware doesn't support FlashAttention-2.
- Triple Threat: The Power of Transcription, Summary, and Translation
Curious to see how this works? Check out our demo page - https://col.la/transcription to generate your own transcription, summary, and translation, or use our browser extension - https://github.com/collabora/WhisperLive to get live transcriptions.
vocode-python
- Launch HN: Retell AI (YC W24) – Conversational Speech API for Your LLM
- Ask HN: Who is hiring? (February 2024)
Vocode || Engineering (multiple roles) || SF/Remote || Full-time/Contract || https://vocode.dev
- Show HN: WhisperFusion – Ultra-low latency conversations with an AI chatbot
- April 2023
Vocode: an open source library for building LLM applications you can talk to (https://github.com/vocodedev/vocode-python)
- Serverless voice chat with Vicuna-13B
Coqui also looks interesting.
https://github.com/coqui-ai/TTS
Support for it was recently added to vocode:
https://github.com/vocodedev/vocode-python/pull/56
- Vocode is an open source library that makes it easy to build voice-based LLM apps
Direct link to the code: https://github.com/vocodedev/vocode-python
- Show HN: Vocode (YC W23) Is Back with an April Fools Special – PrankGPT
Hey everyone! We are so grateful for the warm reception from our launch this week.
We're back with PrankGPT (origin story of Vocode), rebuilt using our library https://github.com/vocodedev/vocode-python
Source code for the backend is public and available on Replit if you want to check it out.
- Show HN: YakGPT – A locally running, hands-free ChatGPT UI
Given that Vocode (realtime audio, llm, etc) came out a few days ago, could you speak to how yours compares to it?
https://github.com/vocodedev/vocode-python
- Gen Z GPT hotline demo
It's a demo for their new open source library integrating several AI tools: https://github.com/vocodedev/vocode-python
What are some alternatives?
cog-whisper-diarization - Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote
bark - 🔊 Text-Prompted Generative Audio Model
whisper-writer - 💬📝 A small dictation app using OpenAI's Whisper speech recognition model.
Flowise - Drag & drop UI to build your customized LLM flow
obs-zoom-and-follow - Dynamic zoom and mouse tracking script for OBS Studio
PentestGPT - A GPT-empowered penetration testing tool
gpt_chatbot - This chatbot lets you use your microphone to communicate with GPT-4. It uses the OpenAI text to speech to respond with a voice. It uses Pinecone to store long term information and retrieves it to create context. API keys for OpenAI and Pinecone required. Tested on Windows
ChatGPT-Next-Web - A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini app with one click.
whisper_streaming - Whisper realtime streaming for long speech-to-text transcription and translation
textSQL
gpt-voice-conversation-chatbot - Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.
prompt-engineering - ChatGPT Prompt Engineering for Developers - deeplearning.ai