WhisperFusion
WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.
Automatically take a screenshot and feed it to https://github.com/vikhyat/moondream or similar? Doable. But while very impressive, the results are a bit of a mixed bag (some hallucinations).
Oh this is neat! I was wondering how to get whisper to stream-transcribe well. I have a similar project using whisper + styletts with the same goal of minimal delay: https://github.com/lxe/llm-companion
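One common trick for low-delay stream transcription (used by several streaming-Whisper projects; this is an assumption for illustration, not a description of llm-companion's or WhisperLive's internals) is "local agreement": re-transcribe a growing audio buffer and only emit the words that two consecutive hypotheses agree on. A minimal sketch:

```python
def confirmed_prefix(prev_words, curr_words):
    """Return the longest common prefix of two consecutive transcription
    hypotheses. Only these words are emitted downstream; the tail of the
    newer hypothesis may still change as more audio arrives."""
    out = []
    for a, b in zip(prev_words, curr_words):
        if a != b:
            break
        out.append(a)
    return out

# Example: the last word is still unstable, so it is held back.
stable = confirmed_prefix(
    "i have a similar pro".split(),
    "i have a similar project".split(),
)
print(stable)  # → ['i', 'have', 'a', 'similar']
```

The trade-off is a small extra delay (roughly one re-transcription interval) in exchange for never having to retract words already sent to the LLM.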
Everything runs locally, we use:
- WhisperLive for the transcription - https://github.com/collabora/WhisperLive
- WhisperSpeech for the text-to-speech - https://github.com/collabora/WhisperSpeech
and an LLM (phi-2, Mistral, etc.) in between
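Low latency in a pipeline like this comes from overlapping the stages rather than waiting for the full LLM reply before speaking. A hedged sketch of one piece of that (the function name and the sentence-splitting heuristic are illustrative, not the actual WhisperFusion code): chunk the streamed LLM tokens at sentence boundaries so the TTS stage can start on the first sentence while the rest is still generating.

```python
def sentence_chunks(token_stream):
    """Group streamed LLM tokens into sentence-sized chunks so the TTS
    stage can start speaking before the full reply has been generated."""
    buf = []
    for tok in token_stream:
        buf.append(tok)
        if tok.rstrip().endswith((".", "!", "?")):
            yield "".join(buf)
            buf = []
    if buf:  # flush any trailing partial sentence
        yield "".join(buf)

tokens = ["Sure", ".", " Here", " is", " the", " answer", "."]
print(list(sentence_chunks(tokens)))  # → ['Sure.', ' Here is the answer.']
```

In practice each chunk would be pushed onto a queue consumed by the TTS thread, so transcription, generation, and speech all run concurrently.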
The code is all released already. You find it here: https://github.com/rwth-i6/returnn-experiments/tree/master/2...
This is TensorFlow-based. But I also have another PyTorch-based implementation, also public (inside our other repo, i6_experiments). It's currently not so easy to set up, but I'm working on a simpler pipeline in PyTorch.
We don't have the models online yet, but we can upload them later. I'm not sure how useful they are outside of research, though, as they are tuned for those specific research tasks (Librispeech, Tedlium) and probably don't perform too well on other data.
Related posts
- WhisperFusion: Ultra-low latency conversations with an AI chatbot
- WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper
- Microsoft releases Windows AI studio to run and fine tune models locally
- Whisper: Nvidia RTX 4090 vs. M1 Pro with MLX
- [D] What offline TTS Model is good enough for a realistic real-time task?