whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization) (by m-bain)

whisperX Alternatives

Similar projects and alternatives to whisperX

  1. whisper

    388 whisperX VS whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

  2. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

  3. Nim

    373 whisperX VS Nim

    Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).

  4. languagetool

    Style and Grammar Checker for 25+ Languages

  5. TTS

    245 whisperX VS TTS

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

  6. whisper.cpp

    Port of OpenAI's Whisper model in C/C++

  7. keyd

    A key remapping daemon for linux.

  8. piper

    57 whisperX VS piper

    (Discontinued) A fast, local neural text to speech system

  9. faster-whisper

    Faster Whisper transcription with CTranslate2

  10. CTranslate2

    Fast inference engine for Transformer models

  11. stable-ts

    5 whisperX VS stable-ts

    (Discontinued) Transcription, forced alignment, and audio indexing with OpenAI's Whisper

  12. whisper-asr-webservice

    OpenAI Whisper ASR Webservice API

  13. frogbase

    (Discontinued) Transform audio-visual content into navigable knowledge.

  14. subsync

    (Discontinued) Subtitle Speech Synchronizer

  15. WAAS

    12 whisperX VS WAAS

    Whisper as a Service (GUI and API with queuing for OpenAI Whisper)

  16. whisper-turbo

    12 whisperX VS whisper-turbo

    Cross-Platform, GPU Accelerated Whisper 🏎️

  17. transcribe-anything

    Multi-backend whisper app. Blazing fast. Mac-arm optimized. Easy install. Input a local file or url and this service will transcribe it using Whisper AI. Completely private and Free 🤯🤯🤯

  18. whisper-standalone-win

    Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

  19. insanely-fast-whisper

    2 whisperX VS insanely-fast-whisper

    Incredibly fast Whisper-large-v3 (by chenxwh)

  20. openai-whisper-cpu

    5 whisperX VS openai-whisper-cpu

    Improving transcription performance of OpenAI Whisper for CPU based deployment

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better whisperX alternative or higher similarity.

whisperX reviews and mentions

Posts with mentions or reviews of whisperX. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2026-04-18.
  • The Unit Economics of Speech-to-Text Just Collapsed
    2 projects | dev.to | 18 Apr 2026
    Look at what arrived between mid-2023 and mid-2025. Gandhi et al.'s Distil-Whisper (2023) distilled large-v2 into a 756M-param student that runs 6× faster with a 1% WER gap on out-of-distribution audio, using large-scale pseudo-labelling. Georgi Gerganov's whisper.cpp made CPU-only and mobile inference a default rather than a party trick; a base.en checkpoint transcribes real-time on an M1 without touching a GPU. Max Bain's WhisperX added forced-alignment and diarization on top, so word-level timestamps and speaker labels stopped being a premium-tier differentiator.
  • Dell's CES 2026 chat was the most pleasingly un-AI briefing I've had in 5 years
    3 projects | news.ycombinator.com | 8 Jan 2026
    Yes, if you check their community integrations section on faster-whisper [0], you can see a lot of different CLIs, GUIs, and libraries. I recommend WhisperX [1], it's the most complete CLI so far and has features like diarization which whisper.cpp does not have in a production-ready capacity.

    [0] https://github.com/SYSTRAN/faster-whisper#community-integrat...

    [1] https://github.com/m-bain/whisperX

  • A beginner's guide to the Whisperx-A40-Large model by Victor-Upmeet on Replicate
    1 project | dev.to | 4 Jan 2026
    The whisperx-a40-large model is an accelerated version of the popular Whisper automatic speech recognition (ASR) model. Developed by Victor Upmeet, it provides fast transcription with word-level timestamps and speaker diarization. This model builds upon the capabilities of Whisper, which was originally created by OpenAI, and incorporates optimizations from the WhisperX project for improved performance.
  • Go Away Python
    6 projects | news.ycombinator.com | 30 Dec 2025
    I've actually done a fair bit of ML work in Elixir, in practice I found:

    1) It's generally harder to interface with existing libraries and models (example: whisperX [0] is a library that combines generic whisper speech recognition models with some additional tools like discrete-time-warping to create a transcription with more accurate time stamp alignment - something that was very helpful when generating subtitles. But because most of this logic just lives in the python library, using this in Elixir requires writing a lot more tooling around the existing bumblebee whisper implementation [1]).

    but,

    2) It's way easier to ship models I built and trained entirely with Elixir's ML ecosystem - EXLA, NX, Bumblebee. I trained a few models doing basic visual recognition tasks (detecting scene transitions, credits, title cards, etc), using the existing CLIP model as a visual frontend and then training a small classifier on the output of CLIP. It was pretty straightforward to do with Elixir, and I love that I can run the same exact code on my laptop and server without dealing with lots of dependencies and environment issues.

    Livebook is also incredibly nice, my typical workflow has become prototyping things in Livebook with some custom visualization tools that I made and then just connecting to a livebook instance running on EC2 to do the actual training run. From there shipping and using the model is seamless, and I just publish the wrapping module as a library on our corporate github, which lets anyone else import it straight into livebook and use it.

    [0] https://github.com/m-bain/whisperX

    [1] https://hexdocs.pm/bumblebee/Bumblebee.Audio.Whisper.html

  • Making AI Models Faster, Cheaper, and Greener - Here's How
    5 projects | dev.to | 3 Nov 2025
    2.3X speed improvement over WhisperX and a 3X speed boost compared to HuggingFace Pipeline with FlashAttention 2 (Insanely Fast Whisper)
  • FFmpeg 8.0 adds Whisper support
    10 projects | news.ycombinator.com | 13 Aug 2025
  • Ask HN: What Speaker Diarization tools should I look into?
    1 project | news.ycombinator.com | 23 Jul 2025
    I am building VideoToBe.com - I have found that whisperX works the most reliably.

    https://github.com/m-bain/whisperX

    It is built on top of OpenAI Whisper, so speech recognition is good, the transcript gives speaker tags as 'SPEAKER_00' and 'SPEAKER_01' etc.

    Here is what the transcript may look like

    https://videotobe.com/play/media/1b02f75a-9503-43aa-8956-d18...

  • Ask HN: What API or software are people using for transcription?
    10 projects | news.ycombinator.com | 9 Jun 2025
    I use whisperfile[1] directly. The whisper-large-v3 model seems good with non-English transcription, which is my main use-case.

    I am also eyeing whisperX[2], because I want to play some more with speaker diarization.

    Your use-case seems to be batch transcription, so I'd suggest you go ahead and just use whisperfile, it should work well on an M4 mini, and it also has an HTTP API if you just start it without arguments.

    If you want more interactivity, I have been using Vibe[3] as an open-source replacement of SuperWhisper[4], but VoiceInk from a sibling comment seems better.

    Aside: It seems that so many of the mentioned projects use whisper at the core, that it would be interesting to explicitly mark the projects that don't use whisper, so we can have a real fundamental comparison.

    [1] https://huggingface.co/Mozilla/whisperfile

    [2] https://github.com/m-bain/whisperX

    [3] https://github.com/thewh1teagle/vibe/

    [4] https://superwhisper.com/

  • Ask HN: Is Whisper Still Relevant?
    2 projects | news.ycombinator.com | 12 Feb 2025
    Yes it's still relevant but I prefer WhisperX for some tasks: https://github.com/m-bain/whisperX
  • Show HN: Mikey – No bot meeting notetaker for Windows
    6 projects | news.ycombinator.com | 12 Feb 2025
    https://github.com/m-bain/whisperX looks promising - I'm hacking away on an always-on transcriber for my notes for later search&recall. It has support for diarization (the speaker detection you're looking for).

    I'm currently hacking away on a mix of https://github.com/speaches-ai/speaches + https://github.com/ufal/whisper_streaming though - mostly because my laptop doesn't have a decent GPU, I stream the audio to a home server instead.

    But overall it's pretty simple to do after you wrangle the Python dependencies - all you need is a sink for the text files (for example, create a new file for every Teams meeting, but that's another story...)


Stats

Basic whisperX repo stats
Mentions: 41
Stars: 22,295
Activity: 8.3
Last commit: 9 days ago
