MacWhisper: Transcribe audio files on your Mac

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • whisper.cpp

    Port of OpenAI's Whisper model in C/C++

  • It runs locally, using Whisper.cpp[1], a Whisper implementation optimized to run on CPU, especially Apple Silicon.

    Whisper itself is open source, and so is that implementation. The OpenAI endpoint is merely a convenience for those who don't wish to host a Whisper server themselves, deal with batching, rent GPUs, etc. If you're building a commercial service on Whisper, the API might be worth it for the convenience, but if you're running it personally and have a good enough machine (an M1 MacBook Air will do), running it locally is usually better.

    [1] https://github.com/ggerganov/whisper.cpp
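
A minimal sketch of what "running it locally" can look like when scripted from Python. It assumes whisper.cpp has already been built and a model downloaded. The binary path, the model path, and the helper names are illustrative assumptions, not part of the original comment, and newer builds may name the binary differently. The example converts the input to 16 kHz mono 16-bit WAV with ffmpeg first, since that is the format the whisper.cpp examples expect.

```python
import subprocess
from pathlib import Path

# Assumed locations: adjust to wherever whisper.cpp was built and the model was downloaded.
WHISPER_BIN = Path("whisper.cpp/main")
MODEL = Path("whisper.cpp/models/ggml-base.en.bin")


def ffmpeg_command(src: Path, wav: Path) -> list[str]:
    """Build the ffmpeg call that converts an audio file to 16 kHz mono 16-bit WAV."""
    return ["ffmpeg", "-y", "-i", str(src),
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", str(wav)]


def whisper_command(wav: Path, model: Path = MODEL, binary: Path = WHISPER_BIN) -> list[str]:
    """Build the whisper.cpp call: -m model, -f input file, -otxt to write a text file."""
    return [str(binary), "-m", str(model), "-f", str(wav), "-otxt"]


def transcribe(src: Path) -> str:
    """Convert src, run whisper.cpp on it, and return the transcript text."""
    wav = src.with_name(src.stem + "_16k.wav")
    subprocess.run(ffmpeg_command(src, wav), check=True)
    subprocess.run(whisper_command(wav), check=True)
    # With -otxt the transcript is written next to the input as <input>.txt.
    return Path(str(wav) + ".txt").read_text()
```

Calling `transcribe(Path("talk.mp3"))` would then return the transcript as a string, provided ffmpeg and the assumed paths exist on the machine.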

  • audapolis

    an editor for spoken-word audio with automatic transcription

  • Here's a multi-platform open source app that does the same thing but uses Vosk instead of Whisper.

    https://github.com/bugbakery/audapolis

  • whisperer

    On-demand prompt-aided voice-to-text with OpenAI's Whisper (by corlinp)

  • I have a Python script on my Mac that detects when I press and hold the right Option key, and records audio while it's pressed. On release, it transcribes the audio with whispercpp and pastes the result. This makes it very easy to record quick voice notes. Here it is: https://github.com/corlinp/whisperer

    I was working on a native version in the form of a taskbar app with a customizable prompt and all. However, I quickly realized that the behaviors I want from the app require a bunch of accessibility permissions that would block it from the App Store and require more setup steps.

    Would anybody still find that useful?
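
The press-and-hold flow described in this comment can be sketched as a small state machine. This is not the code from the linked repository. The class name, the injected callables, and the use of `pynput` for the key listener are all assumptions made for illustration. Recording, transcription, and pasting are passed in as plain functions so the control flow can be read (and tested) on its own.

```python
from typing import Callable


class PushToTalk:
    """Start recording while a hotkey is held; transcribe and paste on release."""

    def __init__(self, hotkey,
                 start_recording: Callable[[], None],
                 stop_recording: Callable[[], bytes],
                 transcribe: Callable[[bytes], str],
                 paste: Callable[[str], None]) -> None:
        self.hotkey = hotkey
        self.start_recording = start_recording
        self.stop_recording = stop_recording
        self.transcribe = transcribe
        self.paste = paste
        self.recording = False

    def on_press(self, key) -> None:
        # Key repeat can fire on_press many times while held, so start only once.
        if key == self.hotkey and not self.recording:
            self.recording = True
            self.start_recording()

    def on_release(self, key) -> None:
        # Ignore releases of other keys and stray releases of the hotkey.
        if key == self.hotkey and self.recording:
            self.recording = False
            audio = self.stop_recording()
            self.paste(self.transcribe(audio))


def listen(make_controller: Callable[[object], PushToTalk]) -> None:
    """Hypothetical wiring with pynput (third-party, assumed installed)."""
    from pynput import keyboard

    controller = make_controller(keyboard.Key.alt_r)  # right Option key on a Mac keyboard
    with keyboard.Listener(on_press=controller.on_press,
                           on_release=controller.on_release) as listener:
        listener.join()
```

A global key listener of this kind typically needs accessibility or input-monitoring permission on macOS, which is the same setup burden the comment raises for a native app.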

  • SpeechRecognition

    Speech recognition module for Python, supporting several engines and APIs, online and offline.

  • There is a great library that supports not only OpenAI's Whisper but also many other engines that work offline. https://github.com/Uberi/speech_recognition
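
As a rough illustration of the multi-engine idea, the sketch below tries several offline recognizers in order and returns the first result. The helper names `first_successful` and `transcribe_file` are hypothetical. The `speech_recognition` calls used (`Recognizer`, `AudioFile`, `record`, `recognize_whisper`, `recognize_sphinx`) exist in recent releases of the library, but each engine needs its own extra package and model installed, so treat the engine list as an assumption to adapt.

```python
from typing import Callable, Iterable, Tuple


def first_successful(audio, engines: Iterable[Tuple[str, Callable]]) -> Tuple[str, str]:
    """Try each (name, recognize) pair in order; return (name, text) from the first that works."""
    errors = []
    for name, recognize in engines:
        try:
            return name, recognize(audio)
        except Exception as exc:  # e.g. unintelligible audio or a missing engine/model
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("all engines failed: " + "; ".join(errors))


def transcribe_file(path: str) -> Tuple[str, str]:
    """Transcribe a WAV, AIFF, or FLAC file, preferring Whisper and falling back to Sphinx."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory

    engines = [
        ("whisper", lambda a: recognizer.recognize_whisper(a, model="base")),
        ("sphinx", recognizer.recognize_sphinx),
    ]
    return first_successful(audio, engines)
```

Keeping the fallback logic separate from the library calls makes it easy to reorder engines or add another recognizer without touching the file-loading code.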

  • whisper-diarization

    Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

  • https://github.com/MahmoudAshraf97/whisper-diarization

    This project has been alright for transcribing audio with speaker diarization. A bit finicky. The OpenAI model is better than paid products (Descript, Riverside), so I’m looking forward to trying MacWhisper.

  • LLMStack

    No-code platform to build LLM Agents, workflows and applications with your data

  • Shameless plug: recently launched LLMStack (https://github.com/trypromptly/LLMStack) and I have some custom pipelines built as apps on LLMStack that I use to transcribe and translate.

    Granted, my use cases are not high volume or frequent, but being able to take output from Whisper and pipe it to other models has been very powerful for me. It is also amazing how good the quality of Whisper is when handling non-English audio.

    We added LocalAI (https://localai.io) support to LLMStack in the last release. Will try to use whisper.cpp and see how that compares for my use cases.

  • whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

  • The OpenAI CLI does that; follow the instructions at https://github.com/openai/whisper
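
Besides the command-line tool (typically invoked along the lines of `whisper audio.mp3 --model base`), the linked repository documents a small Python API. The sketch below is an illustration under that assumption. `whisper.load_model` and `model.transcribe` come from the project's README, while `format_segments` and the choice of the `base` model are hypothetical additions that show how the returned segments might be printed with timestamps.

```python
def format_segments(result: dict) -> str:
    """Render the 'segments' of a Whisper result as one timestamped line per segment."""
    lines = []
    for seg in result["segments"]:
        lines.append(f"[{seg['start']:.2f} -> {seg['end']:.2f}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe(path: str, model_name: str = "base") -> dict:
    """Run Whisper locally on an audio file and return the result dictionary."""
    import whisper  # third-party: pip install -U openai-whisper (ffmpeg must be installed)

    model = whisper.load_model(model_name)  # downloads the model on first use
    return model.transcribe(path)
```

For example, `print(format_segments(transcribe("audio.mp3")))` would print the transcript segment by segment, while `transcribe("audio.mp3")["text"]` gives the full text in one string.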

  • buzz

    Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
