Whisper

Top 23 Whisper Open-Source Projects

  • quivr

    Your GenAI Second Brain 🧠 A personal productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using LangChain, GPT-3.5/4 Turbo, private LLMs, Anthropic, VertexAI, Ollama, and Groq, and share it with other users! A local & private alternative to OpenAI GPTs & ChatGPT, powered by retrieval-augmented generation.

  • Project mention: privateGPT VS quivr - a user suggested alternative | libhunt.com/r/privateGPT | 2024-01-12
  • whisper.cpp

    Port of OpenAI's Whisper model in C/C++

  • Project mention: Show HN: I created automatic subtitling app to boost short videos | news.ycombinator.com | 2024-04-09

    whisper.cpp [1] has a karaoke example that uses ffmpeg's drawtext filter to display rudimentary karaoke-like captions. It also supports diarisation. Perhaps it could be a starting point to create a better script that does what you need.

    --

    1: https://github.com/ggerganov/whisper.cpp/blob/master/README....
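
    The karaoke helper referenced above is a shell script in the whisper.cpp repo; as a rough illustration of the same idea, the sketch below burns a static caption file into a video with ffmpeg's drawtext filter from Python. The file names and styling values are placeholders, and real karaoke-style output would additionally need per-segment timing.

        # Sketch: overlay text from a file onto a video using ffmpeg's drawtext filter.
        # File names and styling values are placeholders, not taken from whisper.cpp.
        import subprocess

        def burn_captions(video_in: str, caption_file: str, video_out: str) -> None:
            """Overlay the text in caption_file onto video_in and write video_out."""
            drawtext = (
                f"drawtext=textfile={caption_file}"
                ":fontsize=28:fontcolor=white:borderw=2"
                ":x=(w-text_w)/2:y=h-80"  # centered horizontally, near the bottom
            )
            subprocess.run(
                ["ffmpeg", "-y", "-i", video_in, "-vf", drawtext, "-c:a", "copy", video_out],
                check=True,
            )

        burn_captions("input.mp4", "captions.txt", "captioned.mp4")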

  • PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

  • Project mention: Open Source Libraries | /r/AudioAI | 2023-10-02

    PaddlePaddle/PaddleSpeech
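
    As a rough sketch of how PaddleSpeech's Python ASR interface is typically used (the import path follows its documentation; the audio file name is a placeholder, and 16 kHz mono audio is expected):

        # Minimal PaddleSpeech ASR sketch; the Mandarin model is the default.
        from paddlespeech.cli.asr.infer import ASRExecutor

        asr = ASRExecutor()
        text = asr(audio_file="input_16k.wav")  # expects 16 kHz mono audio
        print(text)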

  • buzz

    Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

  • Project mention: Buzz: Transcribe and translate audio offline on your personal computer | news.ycombinator.com | 2024-03-21
  • whisperX

    WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

  • Project mention: Easy video transcription and subtitling with Whisper, FFmpeg, and Python | news.ycombinator.com | 2024-04-06

    It uses this, which does support diarization: https://github.com/m-bain/whisperX
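
    For context, a minimal whisperX sketch following its README pattern: transcribe in batches, then align the segments to get word-level timestamps. The model size, device, and file name here are illustrative choices.

        import whisperx

        device = "cuda"  # use "cpu" (with compute_type="int8") if no GPU is available
        audio = whisperx.load_audio("audio.wav")

        # Batched transcription with a CTranslate2-backed Whisper model.
        model = whisperx.load_model("large-v2", device, compute_type="float16")
        result = model.transcribe(audio, batch_size=16)

        # Forced alignment to obtain per-word timestamps.
        align_model, metadata = whisperx.load_align_model(
            language_code=result["language"], device=device
        )
        result = whisperx.align(result["segments"], align_model, metadata, audio, device)

        for segment in result["segments"]:
            print(segment["start"], segment["end"], segment["text"])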

  • faster-whisper

    Faster Whisper transcription with CTranslate2

  • Project mention: Using Groq to Build a Real-Time Language Translation App | dev.to | 2024-04-05

    For our real-time STT needs, we'll employ a fantastic library called faster-whisper.
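
    A minimal faster-whisper sketch (model size, compute type, and file name are illustrative; segments are produced lazily, so transcription runs as you iterate):

        from faster_whisper import WhisperModel

        model = WhisperModel("small", device="cpu", compute_type="int8")
        segments, info = model.transcribe("audio.wav", vad_filter=True)

        print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
        for segment in segments:  # generator: iterating drives the transcription
            print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")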

  • embark-framework

    Framework for serverless Decentralized Applications using Ethereum, IPFS and other platforms

  • cheetah

    Mac app for crushing remote tech interviews with AI

  • Project mention: Has anyone got into Big Tech through cheating ? | /r/leetcode | 2023-12-03

    Has anyone been able to get into big tech via AI or some other way without doing leetcode?

  • FunASR

    A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models. | A speech recognition toolkit with a rich set of high-performance open-source pretrained models; it supports speech recognition, voice activity detection, text post-processing, and more, and is ready for service deployment.

  • Project mention: FunASR: Fundamental End-to-End Speech Recognition Toolkit | news.ycombinator.com | 2024-01-13
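
    A heavily hedged sketch of FunASR's AutoModel interface (the model name and audio file are placeholders, and the exact API can differ between FunASR versions):

        from funasr import AutoModel

        model = AutoModel(model="paraformer-zh")      # Mandarin Paraformer checkpoint
        result = model.generate(input="audio.wav")    # returns a list of result dicts
        print(result[0]["text"])
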
  • distil-whisper

    Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

  • Project mention: FLaNK Stack 05 Feb 2024 | dev.to | 2024-02-05
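
    A sketch of running a distil-whisper checkpoint through the Hugging Face transformers ASR pipeline; the checkpoint name is the published distil-large-v2 model, while the chunk length and file name are illustrative choices:

        from transformers import pipeline

        asr = pipeline(
            "automatic-speech-recognition",
            model="distil-whisper/distil-large-v2",
            chunk_length_s=15,  # chunked long-form transcription
        )
        print(asr("audio.wav")["text"])
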
  • openai

    OpenAI .NET SDK - ChatGPT, Whisper, GPT-3, GPT-4, Azure OpenAI and DALL-E

  • chatgpt-telegram-bot

    🤖 A Telegram bot that integrates with OpenAI's official ChatGPT APIs to provide answers, written in Python (by n3d1117)

  • Project mention: Are you selfhosting a ChatGPT alternative? | /r/selfhosted | 2023-06-09
  • inference

    Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

  • Project mention: GreptimeAI + Xinference - Efficient Deployment and Monitoring of Your LLM Applications | dev.to | 2024-01-24

    Xorbits Inference (Xinference) is an open-source platform to streamline the operation and integration of a wide array of AI models. With Xinference, you’re empowered to run inference using any open-source LLMs, embedding models, and multimodal models either in the cloud or on your own premises, and create robust AI-driven applications. It provides a RESTful API compatible with OpenAI API, Python SDK, CLI, and WebUI. Furthermore, it integrates third-party developer tools like LangChain, LlamaIndex, and Dify, facilitating model integration and development.
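
    Because Xinference exposes an OpenAI-compatible REST API, a standard OpenAI client can simply be pointed at the local endpoint. The URL, port, and model UID below are assumptions for illustration; adjust them to match your deployment.

        from openai import OpenAI

        # Assumes a local Xinference server and an already-launched model.
        client = OpenAI(base_url="http://localhost:9997/v1", api_key="not-used")
        response = client.chat.completions.create(
            model="my-llm",  # the model UID you launched in Xinference
            messages=[{"role": "user", "content": "Summarize what Whisper does."}],
        )
        print(response.choices[0].message.content)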

  • ruby-openai

    OpenAI API + Ruby! 🤖❤️ Now with Assistants, Threads, Messages, Runs and Text to Speech 🍾

  • Project mention: ruby and ML/AI chatgpt | /r/ruby | 2023-07-07

    ruby-openai

  • willow

    Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative

  • Project mention: ESPHome | news.ycombinator.com | 2024-04-23

    Fair points, but with all due respect, this completely misses the point and context. My comment was a reply to a new user interested in esphome on a post about esphome.

    You're talking about CircuitPython, 35KB web replies, PSRAM, UF2 bootloader, etc. These are comparatively very advanced topics and you didn't mention esphome once.

    The comfort and familiarity of Amazon for what is already a new, intimidating, and challenging subject is of immeasurable value for a novice. They can click those links, fill a cart, and have stuff show up tomorrow with all of the usual ease, friendliness, and reliability of Amazon. If they get frustrated or it doesn't work out they can shove it in the box and get a full refund Amazon-style.

    You're suggesting wandering all over the internet, ordering stuff from China, multiple vendors, etc while describing a bunch of things that frankly just won't matter to them. I say this as someone who has been an esphome and home assistant user since day one. The approach I described has never failed or remotely bothered me and over the past ~decade I've seen it suggested to new users successfully time and time again.

    In terms of PSRAM, to my knowledge the only thing it is utilized for in the esphome ecosystem is higher-resolution displays and more advanced voice assistant scenarios that almost always require the -S3 anyway and are very advanced, challenging use cases. I'm very familiar with displays, voice, the S3, and PSRAM, but more on that in a second...

    > live with one less LX7 core and no Bluetooth

    I'm the founder of Willow[0] and when comparing Willow to esphome the most frequent request we get is supporting bluetooth functionality i.e. esphome bluetooth proxy[1]. This is an extremely popular use case in the esphome/home assistant community. Not having bluetooth while losing a core and paying more is a bigger issue than pin spacing.

    It's also a pretty obscure board, and while not a big deal to you and me, if you look around at docs, guides, etc., you'll see the cheap-o boards from Amazon are by far the most popular and common (unsurprisingly). Another plus for a new user.

    Speaking of Willow (and back to PSRAM again) even the voice assistant satellite functionality of Home Assistant doesn't fundamentally require it - the most popular device doesn't have it either[2].

    Very valuable comment with a lot of interesting information, just doesn't apply to context.

    [0] - https://heywillow.io/

    [1] - https://esphome.io/components/bluetooth_proxy.html

    [2] - https://www.home-assistant.io/voice_control/thirteen-usd-voi...

  • whisper-diarization

    Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

  • Project mention: MacWhisper: Transcribe audio files on your Mac | news.ycombinator.com | 2023-08-23

    https://github.com/MahmoudAshraf97/whisper-diarization

    This project has been alright for transcribing audio with speaker diarization. A bit finicky. The OpenAI model is better than other paid products (Descript, Riverside), so I'm looking forward to trying MacWhisper.
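
    For context, a rough sketch of the general "Whisper + diarization" recipe (not this project's exact pipeline): transcribe with openai-whisper, diarize with pyannote.audio, then assign each segment to the speaker whose turn overlaps it. Model names and the token placeholder are assumptions.

        import whisper
        from pyannote.audio import Pipeline

        transcript = whisper.load_model("base").transcribe("audio.wav")
        diarizer = Pipeline.from_pretrained(
            "pyannote/speaker-diarization-3.1", use_auth_token="hf_..."  # HF token needed
        )
        diarization = diarizer("audio.wav")

        def speaker_at(t: float) -> str:
            """Return the speaker label whose turn contains time t, if any."""
            for turn, _, speaker in diarization.itertracks(yield_label=True):
                if turn.start <= t <= turn.end:
                    return speaker
            return "UNKNOWN"

        for seg in transcript["segments"]:
            midpoint = (seg["start"] + seg["end"]) / 2
            print(f"{speaker_at(midpoint)}: {seg['text'].strip()}")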

  • whisper-timestamped

    Multilingual Automatic Speech Recognition with word-level timestamps and confidence

  • Project mention: Show HN: AI Dub Tool I Made to Watch Foreign Language Videos with My 7-Year-Old | news.ycombinator.com | 2024-02-28

    Yes. But Whisper's word-level timings are actually quite inaccurate out of the box. There are some Python libraries that mitigate that. I tested several of them. whisper-timestamped seems to be the best one. [0]

    [0] https://github.com/linto-ai/whisper-timestamped
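
    A minimal whisper-timestamped sketch following its README pattern (model size and file name are illustrative); it wraps openai-whisper and adds per-word timings:

        import whisper_timestamped as whisper

        audio = whisper.load_audio("audio.wav")
        model = whisper.load_model("small")
        result = whisper.transcribe(model, audio)

        for segment in result["segments"]:
            for word in segment["words"]:
                print(word["start"], word["end"], word["text"])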

  • yt-whisper

    Using OpenAI's Whisper to automatically generate YouTube subtitles

  • openai-kotlin

    OpenAI API client for Kotlin with multiplatform and coroutines capabilities.

  • Project mention: I've made a huge mistake by switching to Kotlin /s | /r/Kotlin | 2023-06-01

    Or, can it? https://github.com/aallam/openai-kotlin

  • auto-subtitle

    Automatically generate and overlay subtitles for any video.

  • WhisperLive

    A nearly-live implementation of OpenAI's Whisper.

  • Project mention: Show HN: WhisperFusion – Ultra-low latency conversations with an AI chatbot | news.ycombinator.com | 2024-01-29

    Everything runs locally, we use:

    - WhisperLive for the transcription - https://github.com/collabora/WhisperLive
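
    A heavily hedged sketch of the WhisperLive client side, assuming a server is already running locally; the constructor arguments and port are assumptions based on the project README and may differ between versions.

        from whisper_live.client import TranscriptionClient

        client = TranscriptionClient("localhost", 9090)  # server must already be running
        client("audio.wav")  # stream a file; microphone input is also supported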

  • subsai

    🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️

  • Project mention: Porting CP/M to the Brother SuperPowerNote Z80 laptop thing [video] | news.ycombinator.com | 2023-12-13

    Adding Whisper subtitles was really easy and they're dramatically better than the automatic Google ones (I did it via https://github.com/abdeladim-s/subsai, which was really easy to use). So there is now a reasonably good transcript available in the video comments.
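
    A minimal subsai sketch of the Python API described in its README (the model choice and file names are illustrative):

        from subsai import SubsAI

        subs_ai = SubsAI()
        model = subs_ai.create_model('openai/whisper', {'model_type': 'base'})
        subs = subs_ai.transcribe('video.mp4', model)
        subs.save('video.srt')  # the subtitles object supports common formats like SRT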

  • modelfusion

    The TypeScript library for building AI applications.

  • Project mention: Next.js and GPT-4: A Guide to Streaming Generated Content as UI Components | dev.to | 2024-01-25

    ModelFusion is an AI integration library that I am developing. It enables you to integrate AI models into your JavaScript and TypeScript applications. You can install it with the following command:

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Whisper projects? This list will help you:

#  Project  Stars
1 quivr 32,240
2 whisper.cpp 30,942
3 PaddleSpeech 10,120
4 buzz 9,778
5 whisperX 8,869
6 faster-whisper 8,723
7 embark-framework 3,775
8 cheetah 3,781
9 FunASR 3,110
10 distil-whisper 3,125
11 openai 2,713
12 chatgpt-telegram-bot 2,686
13 inference 2,512
14 ruby-openai 2,405
15 willow 2,361
16 whisper-diarization 1,985
17 whisper-timestamped 1,501
18 yt-whisper 1,313
19 openai-kotlin 1,264
20 auto-subtitle 1,164
21 WhisperLive 1,143
22 subsai 1,051
23 modelfusion 883
