Python speech-to-text

Open-source Python projects categorized as speech-to-text

Top 23 Python speech-to-text Projects

  • NeMo

    NeMo: a framework for generative AI

    Project mention: [P] Making a TTS voice, HK-47 from Kotor using Tortoise (Ideally WaveRNN) | /r/MachineLearning | 2023-07-06

    I haven't tested WaveRNN, but of the open-source ones I know, the best is FastPitch. It's also easy to use; here is the tutorial for voice cloning.

  • whisperX

    WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

    Project mention: FLaNK 15 Jan 2024 | dev.to | 2024-01-15

  • SpeechRecognition

    Speech recognition module for Python, supporting several engines and APIs, online and offline.

    Project mention: help with script (beginner) | /r/learnpython | 2023-12-07

    Start and Stop Listening Example

  • faster-whisper

    Faster Whisper transcription with CTranslate2

    Project mention: Whisper: Nvidia RTX 4090 vs. M1 Pro with MLX | news.ycombinator.com | 2023-12-13

    Could someone elaborate on how this is accomplished, and whether there is any quality disparity compared to the original Whisper?

    With repos like https://github.com/SYSTRAN/faster-whisper it is immediately clear why they are faster than the original, but not so much with this one, especially considering it's much faster still.

  • speechbrain

    A PyTorch-based Speech Toolkit

    Project mention: FLaNK Stack Weekly 22 January 2024 | dev.to | 2024-01-22

  • pyvideotrans

    Translate the video from one language to another and add dubbing.

    Project mention: FLaNK Stack Weekly 06 Nov 2023 | dev.to | 2023-11-06

  • lingvo

    Lingvo

  • kalliope

    Kalliope is a framework that will help you to create your own personal assistant.

  • whisper-asr-webservice

    OpenAI Whisper ASR Webservice API

    Project mention: How I converted a podcast into a knowledge base using Orama search and OpenAI whisper and Astro | dev.to | 2023-05-23

  • Dragonfire

    the open-source virtual assistant for Ubuntu based Linux distributions

  • whisper-timestamped

    Multilingual Automatic Speech Recognition with word-level timestamps and confidence

    Project mention: AI-assisted removal of filler words from video recordings | dev.to | 2023-11-01

    whisper-timestamped is a layer on top of the Whisper set of models that enables us to get accurate word timestamps and include filler words in the transcription output. This transcriber downloads the selected Whisper model to the machine running the demo, and no third-party API keys are required.

  • dc_tts

    A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

  • nonoCAPTCHA

    An asynchronized Python library to automate solving ReCAPTCHA v2 using audio

  • whisper-playground

    Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/

  • whisper-ctranslate2

    Whisper command line client compatible with original OpenAI client based on CTranslate2.

    Project mention: Firefox slow to load YouTube? Just another front in Google's war on ad blockers | news.ycombinator.com | 2023-12-12

    Much better, actually. Try the large-v3 model, it's great. I use it via whisper-ctranslate2 which is a faster implementation.

    https://github.com/Softcatala/whisper-ctranslate2

  • whisper-standalone-win

    Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

    Project mention: Question : is this a movie only tracker? | /r/Karagarga | 2023-07-03

    On the other hand, if you need subtitles for a movie that doesn't have any, there are automated solutions like Whisper that can do a very decent job in most cases: https://github.com/Purfview/whisper-standalone-win

  • AI-Waifu-Vtuber

    AI Vtuber for Streaming on Youtube/Twitch

    Project mention: AI VTUBER | /r/VirtualYoutubers | 2023-04-04

  • speech-to-text-benchmark

    speech to text benchmark framework

    Project mention: Speech-to-Text Benchmark | news.ycombinator.com | 2024-01-16

  • cheetah

    On-device streaming speech-to-text engine powered by deep learning (by Picovoice)

  • AutoSub

    A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using either DeepSpeech or Coqui (by abhirooptalasila)

  • leopard

    On-device speech-to-text engine powered by deep learning

  • edenai-apis

    Eden AI: simplify the use and deployment of AI technologies by providing a unique API that connects to the best possible AI engines

    Project mention: We're Building an Open-Source LLM/AI API Wrapper: Here's Why | news.ycombinator.com | 2023-08-28

    HackerNoon featured our latest article in the "Future of AI" category

    We explain how Eden AI contributes to the AI ecosystem in structuring AI and LLM APIs by creating the most accomplished Open-Source wrapper possible.

    You can support us in reaching 1000 stars on Github here: https://github.com/edenai/edenai-apis

  • kaldi-active-grammar

    Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

    Project mention: Ask HN: How do you get started with adding voice commands to a computer system? | news.ycombinator.com | 2023-11-21

    https://github.com/dictation-toolbox/dragonfly

    https://github.com/daanzu/kaldi-active-grammar

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-01-22.
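
The whisper-timestamped mention above describes using word-level timestamps to remove filler words from recordings. A minimal sketch of that idea follows, using only the standard library. The word dictionaries with "text", "start", and "end" keys, the FILLERS set, and the keep_ranges helper are all assumptions made for illustration; they are not taken from the project mentioned, and a real transcriber's output may be shaped differently.

```python
# Hypothetical word-level transcript entries. The "text"/"start"/"end" keys are
# an assumed layout for this sketch, not a guaranteed schema of any library.
FILLERS = {"um", "uh", "erm", "hmm"}


def normalize(token: str) -> str:
    """Lower-case a token and strip surrounding punctuation."""
    return token.strip(".,!?;:\"'").lower()


def keep_ranges(words, fillers=FILLERS, max_gap=0.3):
    """Return (start, end) time ranges to keep, skipping filler words.

    Consecutive kept words are merged into one range when the silence
    between them is at most max_gap seconds.
    """
    ranges = []
    for word in words:
        if normalize(word["text"]) in fillers:
            continue  # drop the filler word's time span entirely
        if ranges and word["start"] - ranges[-1][1] <= max_gap:
            # Close enough to the previous kept word: extend that range.
            ranges[-1] = (ranges[-1][0], word["end"])
        else:
            ranges.append((word["start"], word["end"]))
    return ranges


words = [
    {"text": "So", "start": 0.0, "end": 0.2},
    {"text": "um,", "start": 0.3, "end": 0.9},
    {"text": "welcome", "start": 1.0, "end": 1.5},
    {"text": "back", "start": 1.6, "end": 1.9},
]

print(keep_ranges(words))  # [(0.0, 0.2), (1.0, 1.9)]
```

The resulting ranges could then be handed to a video or audio cutting tool of your choice; the max_gap threshold is an arbitrary starting value and would likely need tuning per recording.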

Index

What are some of the best open-source speech-to-text projects in Python? This list will help you:

Rank  Project                    Stars
1     NeMo                       9,289
2     whisperX                   7,965
3     SpeechRecognition          7,898
4     faster-whisper             7,402
5     speechbrain                7,275
6     pyvideotrans               3,813
7     lingvo                     2,776
8     kalliope                   1,687
9     whisper-asr-webservice     1,412
10    Dragonfire                 1,372
11    whisper-timestamped        1,330
12    dc_tts                     1,147
13    nonoCAPTCHA                  893
14    whisper-playground           732
15    whisper-ctranslate2          668
16    whisper-standalone-win       588
17    AI-Waifu-Vtuber              581
18    speech-to-text-benchmark     576
19    cheetah                      539
20    AutoSub                      531
21    leopard                      395
22    edenai-apis                  332
23    kaldi-active-grammar         327