> What does this mean?
It means you flipped the combobox on the first screen. In the build on GitHub, the only included model implementation is GPU. The other two implementations are disabled with macros here: https://github.com/Const-me/Whisper/blob/1.1.0/Whisper/stdaf... Those implementations lack some UX features like callbacks and cancellation, and I haven't tested them for a while, but they might still work.
> does this use both GPU and CPU simultaneously?
No, it's sequential; there's a data dependency between the two stages. The encode function computes some buffers (probably the "cross attention" tensors, but I'm not sure, not an ML expert), and then the decode function needs that data to generate the output text.
This project is a Windows port of the whisper.cpp implementation: https://github.com/ggerganov/whisper.cpp
Which in turn is a C++ port of OpenAI's Whisper automatic speech recognition (ASR) model: https://github.com/openai/whisper
The implementation has no dependencies, is usually much faster than real time, and should hopefully work on most Windows computers in the world.