Tag v1.4.1 has been created; it fixes bugs:
https://github.com/ggerganov/whisper.cpp/releases/tag/v1.4.1
Full circle, eh? I wonder how well it compares to just running the actual Whisper models on the various existing, bigger GPU-capable frameworks.
I don't know much, practically, about how hard it would be to take the trained Whisper PyTorch (1 or 2?) models and make good use of them elsewhere. I expect Whisper.cpp caters better to users and is more readily consumable.
FWIW, Whisper.cpp uses Nvidia's cuBLAS. There does appear to be an AMD ROCm port: https://github.com/ROCmSoftwarePlatform/rocBLAS
What’s the best-in-class Whisper implementation for real-time / streaming transcription? I’ve followed the various posts linked on this GitHub issue [1]; not sure if there’s more out there.
[1] https://github.com/openai/whisper/discussions/2
Works reasonably well in meetings.
https://github.com/davabase/whisper_real_time
There's some info here: https://github.com/ggerganov/llama.cpp/issues/1240