Distil-Whisper: distilled version of Whisper that is 6 times faster, 49% smaller

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  1. whisper-turbo

    Cross-Platform, GPU Accelerated Whisper 🏎️

    I'll be shipping Distil-Whisper to whisper-turbo tomorrow! https://github.com/FL33TW00D/whisper-turbo

    Should make running in the browser feasible even for underpowered devices.

  2. CTranslate2

    Fast inference engine for Transformer models

    Just a point of clarification: faster-whisper references it, but CTranslate2[0] is what's really doing the magic here.

    CTranslate2 is a sleeper powerhouse project that enables a lot. They should be front and center and get the credit they deserve.

    [0] - https://github.com/OpenNMT/CTranslate2
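
    Not from the thread, but for concreteness: a minimal sketch of that conversion step, assuming CTranslate2's Transformers converter API (the model id, output directory, and quantization choice are placeholder examples, not from the post):

      # Hedged sketch: convert a Hugging Face Whisper checkpoint to CTranslate2 format.
      # Requires: pip install ctranslate2 transformers
      # Roughly equivalent CLI:
      #   ct2-transformers-converter --model openai/whisper-small --output_dir whisper-small-ct2 --quantization int8
      import ctranslate2

      converter = ctranslate2.converters.TransformersConverter("openai/whisper-small")
      converter.convert(
          output_dir="whisper-small-ct2",  # directory that faster-whisper can later load
          quantization="int8",             # optional weight quantization
      )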

  3. distil-whisper

    Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
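
    As a rough illustration (not from the thread), loading one of these checkpoints through the standard Transformers ASR pipeline might look like the sketch below; the Hub id and chunking settings are assumptions, so check the distil-whisper repo for the recommended usage:

      # Hedged sketch: run a Distil-Whisper checkpoint with the Transformers ASR pipeline.
      # Requires: pip install torch transformers
      import torch
      from transformers import pipeline

      device = "cuda:0" if torch.cuda.is_available() else "cpu"

      asr = pipeline(
          "automatic-speech-recognition",
          model="distil-whisper/distil-large-v2",  # assumed Hub id for a distilled checkpoint
          torch_dtype=torch.float16 if device != "cpu" else torch.float32,
          device=device,
          chunk_length_s=30,  # Whisper-style 30-second windows for long-form audio
      )

      print(asr("audio.wav")["text"])  # "audio.wav" is a placeholder input file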

  4. whisperX

    WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

    How much faster, in real wall-clock time, is this on batched data than https://github.com/m-bain/whisperX?
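
    One way to answer that empirically is to time both pipelines on the same batch of files. A minimal wall-clock harness (the transcribe callables are placeholders for whisperX, Distil-Whisper, or any other implementation being compared):

      # Hypothetical benchmark harness; transcribe_fn stands in for whatever
      # implementation is being measured (whisperX, faster-whisper, a Distil-Whisper pipeline, ...).
      import time

      def benchmark(transcribe_fn, audio_files):
          """Return total wall-clock seconds to transcribe every file in audio_files."""
          start = time.perf_counter()
          for path in audio_files:
              transcribe_fn(path)
          return time.perf_counter() - start

      # Example usage with placeholder callables:
      # t_whisperx = benchmark(lambda p: whisperx_model.transcribe(p, batch_size=16), files)
      # t_distil   = benchmark(lambda p: asr(p), files)
      # print(f"whisperX: {t_whisperx:.1f}s  distil: {t_distil:.1f}s")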

  5. project-2501

    Discontinued: Project 2501 is an open-source AI assistant written in C++.

    I have something pretty rudimentary here: https://github.com/Ono-Sendai/project-2501

  6. willow

    Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative

    I'm the founder of Willow[0] (we use CTranslate2 as well) and I will be looking at this as soon as the models are released tomorrow. HF claims they're drop-in compatible, but we won't know for sure until someone tries it.

    [0] - https://heywillow.io/

  7. openWakeWord

    An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.

    There's also openWakeWord[0]. The models are readily available in TFLite and ONNX formats and are impressively "light" in compute requirements for the performance they deliver.

    It should be possible.

    [0] - https://github.com/dscripka/openWakeWord
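
    For a sense of how light it is in practice, a detection loop might look roughly like this (the bundled model name, frame size, and threshold are assumptions; the openWakeWord README documents the exact API for each release):

      # Hedged sketch: openWakeWord detection loop over 16 kHz, 16-bit mono audio frames.
      # Requires: pip install openwakeword
      import numpy as np
      from openwakeword.model import Model

      oww = Model(wakeword_models=["hey_jarvis"])  # assumed name of a bundled wake word model

      def process_stream(frames):
          """frames: iterable of ~80 ms chunks of int16 samples at 16 kHz."""
          for frame in frames:
              scores = oww.predict(np.asarray(frame, dtype=np.int16))
              for name, score in scores.items():
                  if score > 0.5:  # arbitrary threshold for this sketch
                      print(f"wake word '{name}' detected (score={score:.2f})")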

  8. streaming-llm

    [ICLR 2024] Efficient Streaming Language Models with Attention Sinks

    Oh yes, that's absolutely true: faster is better for everyone. It's just that this particular performance breakpoint would put real-time transcription on a $17 device with an amazing support ecosystem. It's wild.

    That being said, even with this distillation there's still the fact that Whisper isn't really designed for streaming. It's fairly simplistic and always deals with 30-second windows. I was expecting there to be some sort of useful transform you could apply to the model to avoid quite so much reprocessing per frame, but other than https://github.com/mit-han-lab/streaming-llm (which I'm not even sure directly helps) I haven't noticed anything out there.
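
    To make the reprocessing point concrete, here is a naive streaming loop of the kind being described; transcribe_window is a placeholder for any Whisper implementation, and the issue is that each step re-runs the model over the whole accumulated window:

      # Illustrative only: naive "streaming" on top of a 30-second-window model.
      # transcribe_window() stands in for whisper.cpp, faster-whisper, etc.
      from collections import deque

      SAMPLE_RATE = 16_000
      WINDOW_SECONDS = 30   # Whisper always works on 30-second windows
      STEP_SECONDS = 1      # how often fresh text is wanted

      window = deque(maxlen=WINDOW_SECONDS * SAMPLE_RATE)

      def on_new_audio(chunk, transcribe_window):
          """chunk: STEP_SECONDS worth of new samples; called once per step."""
          window.extend(chunk)
          # Every step re-transcribes the entire accumulated window, so most of the
          # compute goes into audio that was already transcribed on the previous step.
          return transcribe_window(list(window))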

  9. whisper.cpp

    Port of OpenAI's Whisper model in C/C++

  10. faster-whisper

    Faster Whisper transcription with CTranslate2

    That's the implication. If the distil models are in the same format as the original OpenAI models, then they can be converted for faster-whisper use per the conversion instructions at https://github.com/guillaumekln/faster-whisper/

    So then we'll see whether we get the 6x model speedup on top of the stated 4x faster-whisper code speedup.
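
    If the conversion does work, using the result would presumably look like ordinary faster-whisper usage. A sketch, assuming the checkpoint has already been converted to a CTranslate2 directory (as in the CTranslate2 sketch above) and that nothing else in the loading path changes:

      # Hedged sketch: load a converted (Distil-)Whisper CTranslate2 model with faster-whisper.
      # "distil-whisper-ct2" is a placeholder path to the converted model directory.
      from faster_whisper import WhisperModel

      model = WhisperModel("distil-whisper-ct2", device="cuda", compute_type="float16")

      segments, info = model.transcribe("audio.wav", beam_size=5)
      print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
      for segment in segments:
          print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")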

  11. WhisperInput

    Offline voice input panel & keyboard with punctuation for Android.

    Fortunately, yes. Recently I've been playing with https://github.com/rpdrewes/whisper-websocket-server, which uses Kõnele as the frontend on Android, if you really care about performance.

    Though if you're looking for a standalone app, you can give https://github.com/alex-vt/WhisperInput a go and run it right on your phone :]

    For now they both run regular OpenAI Whisper (tiny.en), but as you can see there's tons of improvement potential with faster-whisper and now Distil-Whisper :D

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • Amazon Is Discontinuing the "Do Not Send Voice Recordings" Feature on Echo

    3 projects | news.ycombinator.com | 16 Mar 2025
  • Now I Can Just Print That Video

    5 projects | news.ycombinator.com | 4 Dec 2023
  • Whisper Turbo: transcribe 20x faster than realtime using Rust and WebGPU

    3 projects | news.ycombinator.com | 12 Sep 2023
  • Whisper.api: An open source, self-hosted speech-to-text with fast transcription

    5 projects | news.ycombinator.com | 22 Aug 2023
  • LeMUR: LLMs for Audio and Speech

    1 project | news.ycombinator.com | 27 Jul 2023
