Whisper: Nvidia RTX 4090 vs. M1 Pro with MLX

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  1. mlx-examples

    Examples in the MLX framework

    Does this translate to other models, or was Whisper cherry-picked due to its serial nature and integer math? Looking at https://github.com/ml-explore/mlx-examples/tree/main/stable_... seems to hint that this is the case:

    >At the time of writing this comparison convolutions are still some of the least optimized operations in MLX.

    I think the main thing at play is the fact that you can have 64+ GB of very fast RAM directly coupled to the CPU/GPU, and the benefits of that from a latency/co-accessibility point of view.

    These numbers are certainly impressive when you look at the power envelopes of these systems.

    Worth noting that the cost of an M3 Max system with the minimum RAM config is ~2x the price of a 4090...

  3. insanely-fast-whisper

    How does this compare to insanely-fast-whisper though? https://github.com/Vaibhavs10/insanely-fast-whisper

    I think that not using optimizations allows this to be a 1:1 comparison, but if the optimizations are not ported to MLX, then it would still be better to use a 4090.

    Having looked at MLX recently, I think it's definitely going to get traction on Macs - and iOS when Swift bindings are released https://github.com/ml-explore/mlx/issues/15 (although there might be some C++20 compilation issue blocking right now).

  4. mlx

    MLX: An array framework for Apple silicon

  5. faster-whisper

    Faster Whisper transcription with CTranslate2

    Could someone elaborate on how this is accomplished, and whether there is any quality disparity compared to the original Whisper?

    With repos like https://github.com/SYSTRAN/faster-whisper it makes immediate sense why they are faster than the original, but with this one, not so much, especially considering it's faster still.

  6. cog-whisper-diarization

    Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote

    I'll take this opportunity to ask for help: What's a good open source transcription and diarization app or workflow?

    I looked at https://github.com/thomasmol/cog-whisper-diarization and https://about.transcribee.net/ (from the people behind Audapolis) but neither works that well -- crashes, etc.

    Thank you!

  7. WhisperLive

    A nearly-live implementation of OpenAI's Whisper.

    https://github.com/collabora/WhisperLive

    There is another one that uses Hugging Face's implementation, but I haven't tried it since my hardware doesn't support FlashAttention-2.

  8. whisper_streaming

    Whisper realtime streaming for long speech-to-text transcription and translation
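
Several comments above ask whether the benchmark is a 1:1 comparison across implementations. One way to put different backends on the same footing is to time each of them on the same audio file and report a real-time factor (processing time divided by audio duration). The following is only a minimal sketch under stated assumptions: `real_time_factor` and `fake_transcribe` are hypothetical names introduced for illustration, not part of any project listed here, and the caller is assumed to wrap each backend in a zero-argument callable that blocks until the transcript is fully computed.

```python
import time
from statistics import median
from typing import Callable, List


def real_time_factor(
    transcribe: Callable[[], object],
    audio_seconds: float,
    runs: int = 3,
    warmup: int = 1,
) -> float:
    """Return median wall-clock time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Warm-up runs absorb one-time costs such as model loading or kernel compilation.
    for _ in range(warmup):
        transcribe()

    timings: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        transcribe()
        timings.append(time.perf_counter() - start)

    # The median is less sensitive to a single slow run than the mean.
    return median(timings) / audio_seconds


if __name__ == "__main__":
    # Hypothetical stand-in workload; replace with a call into the backend under test.
    def fake_transcribe() -> None:
        time.sleep(0.01)

    rtf = real_time_factor(fake_transcribe, audio_seconds=1.0)
    print(f"real-time factor: {rtf:.3f}")
```

Using the same audio file, model size, and precision for every backend matters more than the harness itself. With lazily evaluated frameworks, the callable needs to force the computation, otherwise the timing only covers graph construction.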
