Faster Whisper Transcription with CTranslate2

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • tinydiarize

    Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens

  • Not that I can see; the developer's roadmap[1] is currently at writing a blog post about it and trying different sampling methods, so expanding to the large model looks to be a long way off. That's a real pity: even just a fine-tune of large for English would be a big help over the existing small English fine-tune.

    Could you go into more detail about your workflow? I'd been considering a two-pass approach myself until I discovered tinydiarize mentioned in whisper.cpp's --help text.

    1: https://github.com/akashmjn/tinydiarize#roadmap

  • faster-whisper

    Faster Whisper transcription with CTranslate2

  • whisper-diarization

    Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

  • The project page mentions whisper-diarization (speaker recognition) as a user of faster-whisper. I've been in the market for that, definitely going to try it out.

    https://github.com/MahmoudAshraf97/whisper-diarization

  • CTranslate2

    Fast inference engine for Transformer models

  • The original Whisper implementation from OpenAI uses the PyTorch deep learning framework. faster-whisper, on the other hand, is implemented on top of CTranslate2 [1], a custom inference engine for Transformer models. So it is running the same model but with a different backend, one specifically optimized for inference workloads.

    [1] https://github.com/OpenNMT/CTranslate2
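    To make the backend swap concrete, here is a minimal sketch of how faster-whisper is typically invoked. It assumes the faster-whisper package is installed and that "audio.mp3" is a local audio file; the model size and compute type shown are illustrative choices, not the only options.

    ```python
    def transcribe(path: str, model_size: str = "small") -> list[str]:
        """Transcribe an audio file with faster-whisper (CTranslate2 backend)."""
        # Imported inside the function so the sketch stays importable
        # even where faster-whisper is not installed.
        from faster_whisper import WhisperModel

        # int8 quantization keeps memory use low for CPU inference;
        # on a GPU you would typically pass device="cuda" instead.
        model = WhisperModel(model_size, device="cpu", compute_type="int8")

        # transcribe() returns a generator of segments plus language info;
        # decoding happens lazily as the generator is consumed.
        segments, info = model.transcribe(path)
        return [segment.text for segment in segments]

    if __name__ == "__main__":
        for line in transcribe("audio.mp3"):
            print(line)
    ```

    The calling code looks much like the original openai/whisper API, which is the point: same model weights, different (faster) inference engine underneath.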

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • Now I Can Just Print That Video

    5 projects | news.ycombinator.com | 4 Dec 2023
  • Whisper Turbo: transcribe 20x faster than realtime using Rust and WebGPU

    3 projects | news.ycombinator.com | 12 Sep 2023
  • LeMUR: LLMs for Audio and Speech

    1 project | news.ycombinator.com | 27 Jul 2023
  • Faster Whisper Transcription with CTranslate2

    1 project | /r/hypeurls | 24 Jul 2023
  • OpenAI Whisper Audio Transcription Benchmarked on 18 GPUs: Up to 3,000 WPM | Tom's Hardware

    1 project | /r/hardware | 11 May 2023