Whisper: Nvidia RTX 4090 vs. M1 Pro with MLX

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • mlx-examples

    Examples in the MLX framework

  • Does this translate to other models, or was Whisper cherry-picked due to its serial nature and integer math? Looking at https://github.com/ml-explore/mlx-examples/tree/main/stable_... seems to hint that this is the case:

    >At the time of writing this comparison convolutions are still some of the least optimized operations in MLX.

    I think the main thing at play is the fact that you can have 64+ GB of very fast RAM directly coupled to the CPU/GPU, and the benefits of that from a latency/co-accessibility point of view.

    These numbers are certainly impressive when you look at the power envelopes of these systems.

    Worth considering/noting that the cost of an M3 Max system with the minimum RAM config is ~2x the price of a 4090...
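
    (For reference, a minimal sketch of the MLX port being benchmarked, assuming you run it from the mlx-examples/whisper directory with weights converted as its README describes; exact module and argument names may differ:)

        # Run from mlx-examples/whisper after converting the weights (see that README).
        # MLX arrays live in unified memory, so the same buffers are visible to CPU and GPU.
        import whisper  # the MLX port in this directory, not the openai-whisper pip package

        result = whisper.transcribe("audio.wav")  # returns a dict like the original API
        print(result["text"])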

  • insanely-fast-whisper

  • How does this compare to insanely-fast-whisper though? https://github.com/Vaibhavs10/insanely-fast-whisper

    I think not using optimizations allows this to be a 1:1 comparison, but if those optimizations aren't ported to MLX, then it would still be better to use a 4090.

    Having looked at MLX recently, I think it's definitely going to get traction on Macs - and on iOS once Swift bindings are released (https://github.com/ml-explore/mlx/issues/15), although there might be a C++20 compilation issue blocking that right now.
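
    (For comparison, insanely-fast-whisper essentially wraps the Hugging Face Transformers ASR pipeline with fp16, chunking, and batching on CUDA; a rough sketch of that approach, not its exact CLI or defaults:)

        # Approximation of the insanely-fast-whisper approach: HF pipeline + fp16 + batched chunks.
        import torch
        from transformers import pipeline

        asr = pipeline(
            "automatic-speech-recognition",
            model="openai/whisper-large-v3",
            torch_dtype=torch.float16,
            device="cuda:0",
        )
        out = asr("audio.wav", chunk_length_s=30, batch_size=24, return_timestamps=True)
        print(out["text"])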

  • mlx

    MLX: An array framework for Apple silicon

  • faster-whisper

    Faster Whisper transcription with CTranslate2

  • Could someone elaborate on how this is accomplished, and whether there is any quality disparity compared to the original Whisper?

    With repos like https://github.com/SYSTRAN/faster-whisper it makes immediate sense why they're faster than the original, but this one, not so much, especially considering it's even faster still.
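
    (For context, faster-whisper gets its speedup by running the same weights through CTranslate2 with fp16/int8 kernels; a minimal usage sketch, assuming a CUDA device is available:)

        # faster-whisper: CTranslate2 inference of the Whisper weights.
        from faster_whisper import WhisperModel

        model = WhisperModel("large-v3", device="cuda", compute_type="float16")
        segments, info = model.transcribe("audio.wav", beam_size=5)
        for segment in segments:
            print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")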

  • cog-whisper-diarization

    Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote

  • I'll take this opportunity to ask for help: what's a good open-source transcription and diarization app or workflow?

    I looked at https://github.com/thomasmol/cog-whisper-diarization and https://about.transcribee.net/ (from the people behind Audapolis), but neither works that well -- crashes, etc.

    Thank you!
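
    (One possible DIY workflow, sketched under the assumption that you have a Hugging Face token with access to the pyannote models: diarize with pyannote, transcribe with whisper, then label each transcript segment with the speaker active at its midpoint.)

        import whisper  # openai-whisper
        from pyannote.audio import Pipeline

        diarizer = Pipeline.from_pretrained(
            "pyannote/speaker-diarization-3.1", use_auth_token="hf_..."  # your HF token
        )
        diarization = diarizer("audio.wav")

        asr = whisper.load_model("medium")
        result = asr.transcribe("audio.wav")

        def speaker_at(t):
            # return the speaker whose diarization turn contains time t, if any
            for turn, _, speaker in diarization.itertracks(yield_label=True):
                if turn.start <= t <= turn.end:
                    return speaker
            return "UNKNOWN"

        for seg in result["segments"]:
            mid = (seg["start"] + seg["end"]) / 2
            print(f'{speaker_at(mid)}: {seg["text"].strip()}')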

  • WhisperLive

    A nearly-live implementation of OpenAI's Whisper.

  • https://github.com/collabora/WhisperLive

    This is another one that uses Hugging Face's implementation, but I haven't tried it since my hardware doesn't support flash-attn2.

  • whisper_streaming

    Whisper realtime streaming for long speech-to-text transcription and translation

Related posts

  • Distil-Whisper: distilled version of Whisper that is 6 times faster, 49% smaller

    14 projects | news.ycombinator.com | 31 Oct 2023
  • Creating Automatic Subtitles for Videos with Python, Faster-Whisper, FFmpeg, Streamlit, Pillow

    7 projects | dev.to | 29 Apr 2024
  • FLaNK AI-April 22, 2024

    28 projects | dev.to | 22 Apr 2024
  • Show HN: I created automatic subtitling app to boost short videos

    1 project | news.ycombinator.com | 9 Apr 2024
  • karpathy/llm.c

    10 projects | news.ycombinator.com | 8 Apr 2024