-
whisper.api
This project provides an API with user-level access support to transcribe speech to text using a fine-tuned and processed Whisper ASR model.
-
willow-inference-server
Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS
Nice! This will be very useful for me. I think I can run this locally and spin up a basic Telegram bot around it for personal use.
One issue I faced with all the Whisper-based transcript generators is that there seems to be no good way to edit or correct the generated text while keeping the word-level timestamps. I created a small web-based tool[0] for that.
If anyone is looking to edit transcripts generated using Whisper, you'd probably find it useful.
[0] https://github.com/geekodour/wscribe-editor
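To make the editing problem concrete, here is a minimal sketch of a word-level correction that keeps timings intact. The `Word` record and the `correct_word` helper are hypothetical names made up for illustration; this is not how wscribe-editor is implemented, only one way the idea could look.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Word:
    """One transcribed word with its timing (hypothetical structure)."""
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def correct_word(words, index, new_text):
    """Return a new list where words[index] is re-spelled; timings are untouched."""
    fixed = list(words)
    fixed[index] = replace(words[index], text=new_text)
    return fixed


def to_text(words):
    """Join the words back into a plain transcript line."""
    return " ".join(w.text for w in words)


# A mis-heard word gets corrected without losing its timestamp.
transcript = [Word("hello", 0.0, 0.4), Word("whirled", 0.5, 0.9)]
corrected = correct_word(transcript, 1, "world")
```

Because the correction replaces only the text field, subtitles or audio seeking built on `start` and `end` keep working after the edit.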
One caveat here is that whisper.cpp does not offer any CUDA support at all; acceleration is only available for Apple Silicon.
If you have Nvidia hardware, the ctranslate2-based faster-whisper is very, very fast: https://github.com/guillaumekln/faster-whisper
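For anyone who wants to try it, a rough sketch of transcribing a file with faster-whisper on an Nvidia GPU might look like the following. It assumes the `faster-whisper` package is installed and a CUDA device is available; the `"large-v2"` model size, `float16` compute type, and the `format_segment` helper are illustrative choices, not anything recommended in this thread.

```python
def format_segment(start, end, text):
    """Render one segment as '[start -> end] text' with times in seconds."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe(path, model_size="large-v2"):
    """Transcribe an audio file on the GPU and return formatted lines."""
    # Imported lazily so the formatting helper works without the package installed.
    from faster_whisper import WhisperModel

    # Assumption: a CUDA-capable GPU; use device="cpu", compute_type="int8" otherwise.
    model = WhisperModel(model_size, device="cuda", compute_type="float16")

    # transcribe() returns a lazy generator of segments plus metadata about the audio.
    segments, info = model.transcribe(path, beam_size=5, word_timestamps=True)
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Calling `transcribe("audio.mp3")` (a placeholder path) would return one line per segment. With `word_timestamps=True`, each segment also carries a `words` list with per-word start and end times.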
ctranslate2 is incredible, I don’t know why it doesn’t get more attention.
We use it for our Willow Inference Server, which has an API that can be used directly like OP's project and supports all Whisper models, TTS, etc.:
https://github.com/toverainc/willow-inference-server
The benchmarks are pretty incredible (largely thanks to ctranslate2).