Whisper: Nvidia RTX 4090 vs. M1 Pro with MLX

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  1. mlx-examples

    Examples in the MLX framework

    Does this translate to other models, or was Whisper cherry-picked due to its serial nature and integer math? Looking at https://github.com/ml-explore/mlx-examples/tree/main/stable_... seems to hint that this is the case:

    >At the time of writing this comparison convolutions are still some of the least optimized operations in MLX.

    I think the main thing at play is that you can have 64+ GB of very fast RAM directly coupled to the CPU/GPU, and the benefits that brings from a latency/co-accessibility point of view.

    These numbers are certainly impressive when you look at the power envelopes of these systems.

    Worth considering/noting that an M3 Max system with the minimum RAM config costs ~2x the price of a 4090...
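The unified-memory point above can be made concrete with a back-of-envelope sketch (the bandwidth figures below are public spec-sheet numbers; the ~3 GB weight size for a large Whisper model in fp16 is an approximation):

```python
# Back-of-envelope: one-time cost of staging model weights over PCIe
# versus reading them from memory already shared with the CPU.
# Spec-sheet bandwidth numbers; the model size is approximate.

GB = 1e9

m1_pro_bw = 200 * GB      # M1 Pro unified memory bandwidth
rtx4090_bw = 1008 * GB    # RTX 4090 GDDR6X bandwidth
pcie4_x16_bw = 32 * GB    # PCIe 4.0 x16 theoretical peak

model_bytes = 3 * GB      # Whisper large, fp16 weights (approx.)

# On a discrete GPU the weights must first cross PCIe once:
pcie_upload_s = model_bytes / pcie4_x16_bw

# Time for one full sweep over the weights from each device's memory:
sweep_m1 = model_bytes / m1_pro_bw
sweep_4090 = model_bytes / rtx4090_bw

print(f"PCIe upload: {pcie_upload_s * 1e3:.1f} ms")  # one-time
print(f"M1 Pro sweep: {sweep_m1 * 1e3:.1f} ms")
print(f"4090 sweep:  {sweep_4090 * 1e3:.1f} ms")
```

The point is not steady-state bandwidth (where the 4090 clearly wins) but that the unified pool removes the copy/staging step entirely and lets the CPU and GPU touch the same buffers.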

  3. insanely-fast-whisper

    How does this compare to insanely-fast-whisper though? https://github.com/Vaibhavs10/insanely-fast-whisper

    I think that not using optimizations allows this to be a 1:1 comparison, but if the optimizations are not ported to MLX, then it would still be better to use a 4090.

    Having looked at MLX recently, I think it's definitely going to get traction on Macs, and on iOS once Swift bindings are released https://github.com/ml-explore/mlx/issues/15 (although a C++20 compilation issue may be blocking that right now).
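For context on what those optimizations are: insanely-fast-whisper's headline speed comes largely from splitting long audio into fixed windows and batching the windows through the model. A minimal sketch of that chunk-and-batch bookkeeping (the 30 s / batch-of-24 values mirror commonly cited settings; the helper names are illustrative, not the project's actual API):

```python
# Sketch of the chunk-and-batch strategy used by batched Whisper
# pipelines: split the signal into fixed 30 s windows (with a small
# overlap so boundaries can be stitched), then group windows into
# batches so the GPU stays saturated. Names and defaults are
# illustrative assumptions, not the project's API.

def chunk_spans(num_samples, sr=16_000, chunk_s=30.0, overlap_s=5.0):
    """Yield (start, end) sample spans covering the audio."""
    step = int((chunk_s - overlap_s) * sr)
    size = int(chunk_s * sr)
    spans = []
    start = 0
    while start < num_samples:
        spans.append((start, min(start + size, num_samples)))
        if start + size >= num_samples:
            break
        start += step
    return spans

def batches(items, batch_size=24):
    """Group chunk spans into GPU-sized batches."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

# 10 minutes of 16 kHz audio -> 30 s windows with 5 s overlap
spans = chunk_spans(10 * 60 * 16_000)
print(len(spans), "chunks,", len(batches(spans)), "batch(es)")
```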

  4. mlx

    MLX: An array framework for Apple silicon


  5. faster-whisper

    Faster Whisper transcription with CTranslate2

    Could someone elaborate on how this is accomplished, and whether there is any quality disparity compared to the original Whisper?

    Repos like https://github.com/SYSTRAN/faster-whisper make immediate sense as to why they're faster than the original, but this one, not so much, especially considering it's even faster still.
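For what it's worth, faster-whisper's speedup is mostly a CTranslate2 story: a reimplemented inference engine with layer fusion and optional 8-bit weights. A toy sketch of symmetric per-tensor int8 quantization, illustrating why the quality loss from the 8-bit path is usually small (the weight values are made up):

```python
# CTranslate2 (the engine behind faster-whisper) can run weights in
# 8-bit integers. A minimal symmetric per-tensor quantizer: every
# weight is mapped to an int8 via one shared scale, so the worst-case
# reconstruction error per weight is bounded by scale / 2.

def quantize_int8(weights):
    """Map floats to int8 values with a per-tensor symmetric scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.42, -1.27, 0.003, 0.9]   # made-up "weights"
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

max_err = max(abs(a - b) for a, b in zip(w, w_hat))
print("max reconstruction error:", max_err)  # bounded by s / 2
```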

  6. cog-whisper-diarization

    Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote

    I'll take this opportunity to ask for help: what's a good open source transcription and diarization app or workflow?

    I looked at https://github.com/thomasmol/cog-whisper-diarization and https://about.transcribee.net/ (from the people behind Audapolis), but neither works that well -- crashes, etc.

    Thank you!
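For anyone sketching such a workflow: the usual whisper+pyannote shape is (1) Whisper emits timed text segments, (2) the diarizer emits speaker turns, (3) each segment is assigned the speaker whose turn overlaps it most. Step (3) in isolation, with made-up timestamps (the function names are illustrative, not any project's API):

```python
# Assign each transcript segment the speaker whose diarization turn
# has the largest time overlap with it. Pure bookkeeping; the actual
# transcription and diarization steps are assumed to have run already.

def overlap(a, b):
    """Length of the intersection of two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def assign_speakers(segments, turns):
    """segments: [(start, end, text)]; turns: [(start, end, speaker)]."""
    out = []
    for s0, s1, text in segments:
        best = max(turns, key=lambda t: overlap((s0, s1), (t[0], t[1])))
        out.append((text, best[2]))
    return out

segments = [(0.0, 2.5, "Hello there."), (2.6, 5.0, "Hi, how are you?")]
turns = [(0.0, 2.4, "SPEAKER_00"), (2.4, 5.2, "SPEAKER_01")]
print(assign_speakers(segments, turns))
```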

  7. WhisperLive

    A nearly-live implementation of OpenAI's Whisper.

    https://github.com/collabora/WhisperLive

    This is another one that uses Hugging Face's implementation, but I haven't tried it since my hardware doesn't support FlashAttention-2.

  8. whisper_streaming

    Whisper realtime streaming for long speech-to-text transcription and translation
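Streaming transcription can't wait for the whole recording, so tools in this vein re-transcribe a growing audio buffer and only commit words once consecutive hypotheses agree on them (the LocalAgreement policy described by the whisper_streaming authors). A sketch of just the commit rule, with hand-written hypotheses:

```python
# LocalAgreement commit rule: emit only the longest word prefix shared
# by the two most recent hypotheses over the growing buffer; the
# disagreeing tail stays pending until a later pass confirms it.
# The hypotheses below are hand-written for illustration.

def common_prefix(a, b):
    """Longest shared word prefix of two hypotheses."""
    out = []
    for x, y in zip(a, b):
        if x != y:
            break
        out.append(x)
    return out

prev = "the quick brown fox jumps".split()
curr = "the quick brown fox leaps over".split()

committed = common_prefix(prev, curr)
print(" ".join(committed))  # stable words that can be emitted now
```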

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • I Self-Hosted Llama 3.2 with Coolify on My Home Server: A Step-by-Step Guide

    2 projects | news.ycombinator.com | 16 Oct 2024
  • Distil-Whisper: distilled version of Whisper that is 6 times faster, 49% smaller

    14 projects | news.ycombinator.com | 31 Oct 2023
  • Whispercpp – Local, Fast, and Private Audio Transcription for Ruby

    1 project | news.ycombinator.com | 7 Jun 2025
  • Intel Geti and Intel Geti SDK are open-source

    1 project | news.ycombinator.com | 26 May 2025
  • Wav2Lip: Accurately Lip-Syncing Videos and OpenVINO

    1 project | news.ycombinator.com | 14 May 2025
