Top 23 Python ASR Projects
-
PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
-
NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
-
youtube-transcript-api
A Python API that retrieves the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and requires neither an API key nor a headless browser, unlike Selenium-based solutions.
-
whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
-
whisper.api
This project provides an API with user-level access support to transcribe speech to text using a fine-tuned and processed Whisper ASR model.
-
whisper-standalone-win
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
-
AutoSub
A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using either DeepSpeech or Coqui (by abhirooptalasila)
-
edgedict
Working online speech recognition based on RNN Transducer. (A trained model is available in the releases.)
-
voice100
Voice100 includes neural TTS/ASR models. Inference of Voice100 is low cost as its models are tiny and only depend on CNN without autoregression.
PaddlePaddle/PaddleSpeech
Project mention: [P] Making a TTS voice, HK-47 from Kotor using Tortoise (Ideally WaveRNN) | /r/MachineLearning | 2023-07-06
I don't test WaveRNN but from the ones that I know the best that is open source is FastPitch. And it's easy to use, here is the tutorial for voice cloning.
Project mention: Easy video transcription and subtitling with Whisper, FFmpeg, and Python | news.ycombinator.com | 2024-04-06
It uses this, which does support diarization: https://github.com/m-bain/whisperX
Project mention: SpeechBrain 1.0: A free and open-source AI toolkit for all things speech | news.ycombinator.com | 2024-02-28
wenet-e2e/wenet
Project mention: Show HN: AI Dub Tool I Made to Watch Foreign Language Videos with My 7-Year-Old | news.ycombinator.com | 2024-02-28
Yes. But Whisper's word-level timings are actually quite inaccurate out of the box. There are some Python libraries that mitigate that. I tested several of them. whisper-timestamped seems to be the best one. [0]
[0] https://github.com/linto-ai/whisper-timestamped
On the other hand, if you need subtitles for a movie that doesn't have any, there are automated solutions like Whisper that can do a very decent job in most cases: https://github.com/Purfview/whisper-standalone-win
Project mention: Best Speech-to-text API with speaker diarization? | news.ycombinator.com | 2024-05-06
Project mention: Using Whisper to transcribe the entire Forensic Files series | /r/DataHoarder | 2023-06-04
Project mention: My first setup. I always played consols. Last year start building this and iam quite happy! Rate my setup ✌🏼 | /r/PcBuild | 2023-05-11
Python ASR related posts
-
Best Speech-to-text API with speaker diarization?
-
Easy video transcription and subtitling with Whisper, FFmpeg, and Python
-
SOTA ASR Tooling: Long-Form Transcription
-
Deploying whisperX on AWS SageMaker as Asynchronous Endpoint
-
Show HN: AI Dub Tool I Made to Watch Foreign Language Videos with My 7-Year-Old
-
Do you know any quality FastAPI starter projects?
-
Whisper.api: Open-source, self-hosted speech-to-text with fast transcription
Index
What are some of the best open-source ASR projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | PaddleSpeech | 10,186 |
2 | NeMo | 10,128 |
3 | whisperX | 9,064 |
4 | speechbrain | 7,914 |
5 | wenet | 3,699 |
6 | lingvo | 2,778 |
7 | youtube-transcript-api | 2,345 |
8 | whisper-timestamped | 1,513 |
9 | SincNet | 1,097 |
10 | pykaldi | 978 |
11 | vosk-server | 843 |
12 | whisper.api | 840 |
13 | whisper-standalone-win | 801 |
14 | AutoSub | 556 |
15 | cheetah | 555 |
16 | pyannote-whisper | 421 |
17 | leopard | 408 |
18 | edgedict | 283 |
19 | whisper-auto-transcribe | 195 |
20 | deepgram-python-sdk | 153 |
21 | spinorama | 93 |
22 | voice100 | 25 |
23 | geppetto | 21 |