Does this translate to other models, or was Whisper cherry-picked due to its serial nature and integer math? Looking at https://github.com/ml-explore/mlx-examples/tree/main/stable_... seems to hint that this is the case:
> At the time of writing this comparison convolutions are still some of the least optimized operations in MLX.
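It's easy to check that claim on your own machine. Here's a rough micro-benchmark sketch (assumes mlx is installed); the shapes, iteration count, and the im2col-style FLOP-matched matmul are arbitrary choices of mine, not anything from the MLX docs:

    # Compare a 3x3 conv against a FLOP-matched matmul in MLX.
    import time
    import mlx.core as mx

    x = mx.random.normal((8, 224, 224, 64))   # input, NHWC layout
    w = mx.random.normal((64, 3, 3, 64))      # weights, (C_out, kH, kW, C_in)

    # FLOP-matched matmul: the im2col equivalent of the 3x3 conv above.
    m = mx.random.normal((8 * 224 * 224, 3 * 3 * 64))
    k = mx.random.normal((3 * 3 * 64, 64))

    def bench(fn, iters=20):
        mx.eval(fn())                          # warm-up
        t0 = time.perf_counter()
        for _ in range(iters):
            mx.eval(fn())                      # force lazy evaluation each step
        return (time.perf_counter() - t0) / iters

    print("conv2d:", bench(lambda: mx.conv2d(x, w, stride=1, padding=1)))
    print("matmul:", bench(lambda: mx.matmul(m, k)))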
I think the main thing at play is that you can have 64+ GB of very fast RAM directly coupled to the CPU/GPU, and the benefits that brings from a latency/co-accessibility point of view.
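A minimal sketch of what that unified memory buys you in MLX: the same arrays are visible to both CPU and GPU, so you pick a compute stream instead of copying tensors between devices (sizes here are arbitrary):

    import mlx.core as mx

    a = mx.random.normal((4096, 4096))
    b = mx.random.normal((4096, 4096))

    c_gpu = mx.matmul(a, b, stream=mx.gpu)  # runs on the GPU
    c_cpu = mx.matmul(a, b, stream=mx.cpu)  # same buffers, no host<->device copy
    mx.eval(c_gpu, c_cpu)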
These numbers are certainly impressive when you look at the power envelopes of these systems.
Worth noting that an M3 Max system with the minimum RAM config costs ~2x the price of a 4090...
How does this compare to insanely-fast-whisper though? https://github.com/Vaibhavs10/insanely-fast-whisper
I think leaving out the optimizations makes this a fair 1:1 comparison, but if those optimizations are never ported to MLX, a 4090 would still be the better choice in practice.
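For reference, insanely-fast-whisper is essentially the Hugging Face transformers ASR pipeline with fp16, chunked batching, and (optionally) Flash Attention 2. A minimal sketch of that recipe; the file name and batch size are placeholders of mine:

    import torch
    from transformers import pipeline

    pipe = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-large-v3",
        torch_dtype=torch.float16,
        device="cuda:0",
    )
    out = pipe("audio.mp3", chunk_length_s=30, batch_size=24,
               return_timestamps=True)
    print(out["text"])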
Having looked at MLX recently, I think it's definitely going to get traction on Macs, and on iOS once Swift bindings are released (https://github.com/ml-explore/mlx/issues/15), although a C++20 compilation issue might be blocking that right now.
Could someone elaborate on how this is accomplished, and whether there is any quality disparity compared to the original Whisper?
Repos like https://github.com/SYSTRAN/faster-whisper make immediate sense as to why they're faster than the original, but this one, not so much -- especially considering it's faster still.
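(The reason faster-whisper "makes immediate sense" is that it swaps PyTorch for a CTranslate2 backend with quantization. A sketch based on its README; the audio file name is a placeholder:)

    from faster_whisper import WhisperModel

    # int8_float16 quantization is where much of the speedup comes from
    model = WhisperModel("large-v2", device="cuda", compute_type="int8_float16")
    segments, info = model.transcribe("audio.mp3", beam_size=5)
    for segment in segments:
        print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")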
I'll take this opportunity to ask for help: what's a good open source transcription and diarization app or workflow?
I looked at https://github.com/thomasmol/cog-whisper-diarization and https://about.transcribee.net/ (from the people behind Audapolis), but neither works that well -- crashes, etc.
Thank you!
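For what it's worth, what I'm after is essentially the standard Whisper + pyannote recipe these tools wrap. A minimal sketch of it (the model names, the placeholder HF token, and the naive midpoint alignment are my assumptions, not anything these projects guarantee):

    import whisper
    from pyannote.audio import Pipeline

    # 1. Transcribe with Whisper (gives timestamped segments)
    asr = whisper.load_model("medium")
    result = asr.transcribe("meeting.wav")

    # 2. Diarize with pyannote (gives speaker turns)
    diarizer = Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1",
        use_auth_token="HF_TOKEN",  # placeholder: your Hugging Face token
    )
    diarization = diarizer("meeting.wav")

    # 3. Naive alignment: assign each segment the speaker active at its midpoint
    for seg in result["segments"]:
        mid = (seg["start"] + seg["end"]) / 2
        speaker = next(
            (spk for turn, _, spk in diarization.itertracks(yield_label=True)
             if turn.start <= mid <= turn.end),
            "UNKNOWN",
        )
        print(f"[{speaker}] {seg['text'].strip()}")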
https://github.com/collabora/WhisperLive
There is another one that uses Hugging Face's implementation, but I haven't tried it since my hardware doesn't support Flash Attention 2.
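For context, in the Hugging Face transformers API, Flash Attention 2 is opt-in at model load time, and it needs the flash-attn package plus an Ampere-or-newer GPU (presumably the unsupported "spec" here). A sketch:

    import torch
    from transformers import AutoModelForSpeechSeq2Seq

    model = AutoModelForSpeechSeq2Seq.from_pretrained(
        "openai/whisper-large-v3",
        torch_dtype=torch.float16,
        attn_implementation="flash_attention_2",  # fails on unsupported hardware
    ).to("cuda")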