whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Yes, but Whisper's word-level timings are quite inaccurate out of the box. Several Python libraries mitigate that. Of the ones I tested, whisper-timestamped seems to be the best. [0]
[0] https://github.com/linto-ai/whisper-timestamped
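
For anyone who wants to try it, here is a rough sketch of reading the word-level timestamps and confidences. It assumes the usage shown in the project's README (`load_audio`, `load_model`, `transcribe`) and a result laid out as `segments` containing `words` with `text`, `start`, `end` and `confidence` keys. The helper names `iter_words` and `transcribe_words` are made up for illustration and are not part of the library.

```python
from __future__ import annotations

from typing import Iterator


def iter_words(result: dict, min_confidence: float = 0.0) -> Iterator[tuple]:
    """Yield (start, end, text, confidence) for each word in a result dict.

    Assumes the layout result["segments"][i]["words"][j] with the keys
    "text", "start", "end" and "confidence" (an assumption, see above).
    """
    for segment in result.get("segments", []):
        for word in segment.get("words", []):
            confidence = word.get("confidence", 0.0)
            if confidence >= min_confidence:
                yield word["start"], word["end"], word["text"], confidence


def transcribe_words(path: str, model_name: str = "tiny") -> dict:
    """Hypothetical wrapper around whisper-timestamped's README usage."""
    # Imported lazily so iter_words works without the library installed.
    import whisper_timestamped as whisper

    audio = whisper.load_audio(path)
    model = whisper.load_model(model_name, device="cpu")
    return whisper.transcribe(model, audio)


if __name__ == "__main__":
    import sys

    result = transcribe_words(sys.argv[1])
    # Print only words the model is reasonably sure about.
    for start, end, text, confidence in iter_words(result, min_confidence=0.5):
        print(f"{start:7.2f} {end:7.2f} {confidence:.2f} {text}")
```

Run it as a script with the path to an audio file as the only argument. The 0.5 confidence cut-off is an arbitrary choice for the example, not a recommended value.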