Open Source Libraries

This page summarizes the projects mentioned and recommended in the original post on /r/AudioAI.

  1. TTS

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

    coqui-ai/TTS

  3. tortoise-tts

    A multi-voice TTS system trained with an emphasis on quality

    neonbjb/tortoise-tts

  4. bark

    🔊 Text-Prompted Generative Audio Model

    suno-ai/bark

  5. piper

    A fast, local neural text to speech system

    rhasspy/piper

  6. Matcha-TTS

    [ICASSP 2024] 🍡 Matcha-TTS: A fast TTS architecture with conditional flow matching

    shivammehta25/Matcha-TTS

  7. whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    openai/whisper

  8. whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    ggerganov/whisper.cpp

  10. faster-whisper

    Faster Whisper transcription with CTranslate2

    guillaumekln/faster-whisper

  11. wenet

    Production First and Production Ready End-to-End Speech Recognition Toolkit

    wenet-e2e/wenet

  12. seamless_communication

    Foundational Models for State-of-the-Art Speech and Text Translation

    facebookresearch/seamless_communication: Speech translation

  13. pyannote-audio

    Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

    pyannote/pyannote-audio

  14. PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

    PaddlePaddle/PaddleSpeech

  15. audio-webui

    A webui for different audio related Neural Networks

    gitmylo/audio-webui

  16. audiocraft

    Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

    facebookresearch/audiocraft/MUSICGEN: Music Generation

  17. jukebox

    Code for the paper "Jukebox: A Generative Model for Music"

    openai/jukebox: Music Generation

  18. Retrieval-based-Voice-Conversion-WebUI

    Easily train a good VC model with voice data <= 10 mins!

    RVC-Project/Retrieval-based-Voice-Conversion-WebUI: Singing Voice Conversion

  19. fish-diffusion

    An easy to understand TTS / SVS / SVC framework

    fishaudio/fish-diffusion: Singing Voice Conversion

  20. demucs

    Discontinued code for the paper "Hybrid Spectrogram and Waveform Source Separation"; reported as no longer working.

    facebookresearch/demucs: Stem separation

  21. ultimatevocalremovergui

    GUI for a Vocal Remover that uses Deep Neural Networks.

    Anjok07/UltimateVocalRemoverGUI: Vocal isolation

  22. DeepFilterNet

    Noise suppression using deep filtering

    Rikorose/DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering

  23. PiDTLN

    Apply machine learning model DTLN for noise suppression and acoustic echo cancellation on Raspberry Pi

    SaneBow/PiDTLN: DTLN model for noise suppression and acoustic echo cancellation on Raspberry Pi

  24. versatile_audio_super_resolution

    Versatile audio super resolution (any -> 48kHz) with AudioSR.

    haoheliu/versatile_audio_super_resolution: any -> 48kHz high fidelity Enhancer

  25. basic-pitch

    A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

    spotify/basic-pitch: Audio to MIDI converter

  26. pedalboard

    🎛️ 🔊 A Python library for audio.

    spotify/pedalboard: audio effects for Python and TensorFlow

  27. librosa

    Python library for audio and music analysis

    librosa/librosa: Python library for audio and music analysis


