Python Speech

Open-source Python projects categorized as Speech

Top 23 Python Speech Projects

  • MockingBird

    AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time

  • TTS

    ๐Ÿธ๐Ÿ’ฌ - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

  • Project mention: Coqui.ai TTS: A Deep Learning Toolkit for Text-to-Speech | news.ycombinator.com | 2024-06-11

    The license is the MPL, which allows commercial use?

    https://github.com/coqui-ai/TTS/blob/dev/LICENSE.txt

  • datasets

    🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

  • Project mention: 23 issues to grow yourself as an exceptional open-source Python expert | dev.to | 2023-10-19
  • AudioGPT

    AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

  • whisperX

    WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

  • Project mention: Text-to-Speech with Speaker Diarization | news.ycombinator.com | 2024-06-02
  • EmotiVoice

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

  • Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12
  • modelscope

    ModelScope: bring the notion of Model-as-a-Service to life.

  • Project mention: FLaNK Stack Weekly for 20 June 2023 | dev.to | 2023-06-20

    Model as a Service https://github.com/modelscope/modelscope

  • lingvo

    Lingvo

  • aeneas

    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

  • gTTS

    Python library and CLI tool to interface with Google Translate's text-to-speech API

  • Project mention: Using Groq to Build a Real-Time Language Translation App | dev.to | 2024-04-05

    For our real-time TTS needs, we'll employ the fantastic library called gTTS.

  • DeepFilterNet

    Noise suppression using deep filtering

  • Project mention: Anyone know of a good TTS pipeline for raw speech data? | /r/AudioAI | 2023-10-03

    You mean remove background noise and transcribe? Then you can use DeepFilterNet to remove noise, and Whisper to transcribe.

  • whisper-timestamped

    Multilingual Automatic Speech Recognition with word-level timestamps and confidence

  • Project mention: Show HN: AI Dub Tool I Made to Watch Foreign Language Videos with My 7-Year-Old | news.ycombinator.com | 2024-02-28

    Yes. But Whisper's word-level timings are actually quite inaccurate out of the box. There are some Python libraries that mitigate that. I tested several of them. whisper-timestamped seems to be the best one. [0]

    [0] https://github.com/linto-ai/whisper-timestamped

  • dc_tts

    A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

  • pykaldi

    A Python wrapper for Kaldi

  • NATSpeech

    A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)

  • voicefixer

    General Speech Restoration

  • lhotse

    Tools for handling speech data in machine learning projects.

  • Project mention: Does anyone else find lhotse a pain to use | /r/speechtech | 2023-06-14
  • SALMONN

    SALMONN: Speech Audio Language Music Open Neural Network

  • Project mention: Comparing Humans, GPT-4, and GPT-4V on Abstraction and Reasoning Tasks | news.ycombinator.com | 2023-11-19

    > In other words, if you express a problem in a more complicated space (e.g. a visual problem, or an abstract algebra problem), you will not be able to solve it in the smaller token space, there's not enough information

    You're aware multimodal transformers do exactly this?

    https://github.com/bytedance/SALMONN

  • diffwave

    DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

  • inaSpeechSegmenter

    CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

  • Project mention: Listen to HD radio with a $30 RTL SDR dongle | news.ycombinator.com | 2023-11-05

    I have a little hobby project where I record an FM radio music station using a SDR and then remove all the non-music portions for offline listening. I like the music selections the DJs pick, but I prefer not to listen to the DJ commentary and the advertisements.

    I evaluated three methods of recording: analog capture from a standalone FM receiver, using this nrsc5 library to record the "HD" radio stream, and using an AirSpy SDR with this library: https://github.com/jj1bdx/airspy-fmradion

    Recording the "HD" (what a misnomer) radio was nice in that there was no hiss or multipath effects, but in comparison to the other methods the digital compression artifacts became impossible to un-hear. It seems to top out at about 96 kbps.

    The airspy-fmradion library has some nice stuff in it to address multipath, resulting in the best audio quality of the three methods I tested.

    I use https://github.com/ina-foss/inaSpeechSegmenter to identify which segments of the recordings are speech vs. music.

  • Speech-enhancement

    Deep learning for audio denoising

  • allosaurus

    Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

  • StarGANv2-VC

    StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
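
The radio-recording mention under inaSpeechSegmenter describes labelling a recording as speech vs. music and keeping only the music. A minimal sketch of that filtering step follows. It assumes the segmentation result is a list of (label, start, end) tuples in seconds with a "music" label. The sample data, the `music_intervals` helper and the `max_gap` parameter are hypothetical and not taken from the commenter's project.

```python
from typing import List, Tuple

# Assumed segment shape: (label, start_seconds, end_seconds).
Segment = Tuple[str, float, float]


def music_intervals(segments: List[Segment], max_gap: float = 1.0) -> List[Tuple[float, float]]:
    """Return merged (start, end) intervals whose label is 'music'.

    Music segments separated by at most `max_gap` seconds are joined,
    so a brief silence inside a song does not split it in two.
    """
    merged: List[Tuple[float, float]] = []
    for label, start, end in sorted(segments, key=lambda s: s[1]):
        if label != "music":
            continue  # drop DJ commentary, adverts, noise, silence
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous music interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical segmentation output, made up for illustration.
example = [
    ("music", 0.0, 180.0),
    ("noEnergy", 180.0, 180.5),
    ("music", 180.5, 400.0),
    ("male", 400.0, 460.0),
    ("music", 460.0, 700.0),
]
print(music_intervals(example))
```

The resulting intervals could then be passed to an audio tool of your choice to cut the recording. That step is left out here because it depends on the tool.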


Python Speech related posts

  • Coqui.ai TTS: A Deep Learning Toolkit for Text-to-Speech

    6 projects | news.ycombinator.com | 11 Jun 2024
  • Text-to-Speech with Speaker Diarization

    1 project | news.ycombinator.com | 2 Jun 2024
  • Easy video transcription and subtitling with Whisper, FFmpeg, and Python

    1 project | news.ycombinator.com | 6 Apr 2024
  • Using Groq to Build a Real-Time Language Translation App

    3 projects | dev.to | 5 Apr 2024
  • OpenAI deems its voice cloning tool too risky for general release

    1 project | news.ycombinator.com | 31 Mar 2024
  • SOTA ASR Tooling: Long-Form Transcription

    1 project | news.ycombinator.com | 31 Mar 2024
  • Deploying whisperX on AWS SageMaker as Asynchronous Endpoint

    2 projects | dev.to | 31 Mar 2024

Index

What are some of the best open-source Speech projects in Python? This list will help you:

#   Project              Stars
1   MockingBird          34,223
2   TTS                  31,005
3   datasets             18,647
4   AudioGPT              9,849
5   whisperX              9,686
6   EmotiVoice            6,542
7   modelscope            6,297
8   lingvo                2,792
9   aeneas                2,379
10  gTTS                  2,173
11  DeepFilterNet         2,050
12  whisper-timestamped   1,630
13  dc_tts                1,150
14  pykaldi                 983
15  NATSpeech               944
16  voicefixer              940
17  lhotse                  884
18  SALMONN                 847
19  diffwave                728
20  inaSpeechSegmenter      705
21  Speech-enhancement      603
22  allosaurus              524
23  StarGANv2-VC            461
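
The index is ranked by star count. As a rough sketch, rows of the form "rank name stars" could be parsed and re-sorted as below. The `parse_index_row` helper is hypothetical, and it assumes project names contain no spaces, which holds for the rows listed here.

```python
from typing import List, Tuple


def parse_index_row(row: str) -> Tuple[int, str, int]:
    """Split 'rank name stars' into typed fields, e.g. '10 gTTS 2,173'."""
    rank, name, stars = row.split()
    # Star counts use a thousands separator, so strip commas first.
    return int(rank), name, int(stars.replace(",", ""))


# A few rows copied from the index, deliberately out of order.
rows = ["10 gTTS 2,173", "1 MockingBird 34,223", "5 whisperX 9,686"]

# Order by star count, highest first, as the list itself does.
by_stars: List[Tuple[int, str, int]] = sorted(
    (parse_index_row(r) for r in rows), key=lambda r: r[2], reverse=True
)
for rank, name, stars in by_stars:
    print(rank, name, stars)
```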
