openai-whisper-realtime vs wer_are_we

| | openai-whisper-realtime | wer_are_we |
|---|---|---|
| Mentions | 1 | 4 |
| Stars | 180 | 1,862 |
| Growth | - | - |
| Activity | 10.0 | 1.8 |
| Latest commit | over 1 year ago | almost 2 years ago |
| Language | Python | - |
| License | MIT License | - |
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity - a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects being tracked.
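The exact formula behind the activity number is not published, but the description above (recent commits weigh more, and the score is relative to all tracked projects) can be sketched as a recency-weighted commit sum mapped to a 0–10 percentile. The half-life value and the helper names here are illustrative assumptions, not the site's actual implementation:

```python
from datetime import datetime, timedelta

def activity_score(commit_dates, now=None, half_life_days=30.0):
    """Recency-weighted commit count: each commit contributes
    0.5 ** (age_in_days / half_life_days), so a commit from
    yesterday counts for almost 1 and one from last year for
    almost nothing. half_life_days is an assumed tuning knob."""
    now = now or datetime.utcnow()
    return sum(
        0.5 ** ((now - d).days / half_life_days)
        for d in commit_dates
    )

def activity_percentile(score, all_scores):
    """Map a raw score to a 0-10 relative scale: the fraction of
    tracked projects with a strictly lower score, times 10.
    A value of 9.0 therefore means 'top 10% of projects'."""
    below = sum(1 for s in all_scores if s < score)
    return 10.0 * below / len(all_scores)
```

Under this sketch, a project with three commits last week easily outscores one with many commits from a year ago, which matches the stated intent of weighting recent activity more heavily.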
openai-whisper-realtime
-
Whisper – open source speech recognition by OpenAI
I tried running it in realtime with live audio input (kind of).
You can find the python script in this repo: https://github.com/tobiashuttinger/openai-whisper-realtime
wer_are_we
-
Lichess Voice Recognition Beta is now Live!
https://github.com/syhw/wer_are_we
https://github.com/Franck-Dernoncourt/ASR_benchmark#benchmark-results
-
OpenAI Whisper Model Comparison
Great breakdown… with some interesting results and a ton of effort.
Are there any open benchmarks like this covering all the models that are actually runnable, similar to the data exposed in https://github.com/syhw/wer_are_we but with some of your additional metrics?
-
Whisper – open source speech recognition by OpenAI
The authors do explicitly state that they're trying to do a lot of fancy new stuff here, like being multilingual, rather than pursuing accuracy alone.
[1] https://github.com/syhw/wer_are_we
- This sub is NOT bullying you
What are some alternatives?
whisper - Robust Speech Recognition via Large-Scale Weak Supervision
plaidml - PlaidML is a framework for making deep learning work everywhere.
DeepSpeech-examples - Examples of how to use or integrate DeepSpeech
mycroft-core - Mycroft Core, the Mycroft Artificial Intelligence platform.
NeMo - A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
py-webrtcvad - Python interface to the WebRTC Voice Activity Detector
dragonfly - Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS), Windows Speech Recognition (WSR), Kaldi and CMU Pocket Sphinx
trashbot - Trashbot helper AI assistant
vosk-api - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
stable-diffusion - A latent text-to-image diffusion model