DeepFilterNet
audio-webui
Metric | DeepFilterNet | audio-webui
---|---|---
Mentions | 10 | 15
Stars | 1,969 | 916
Growth | - | -
Activity | 8.9 | 9.0
Last commit | 9 days ago | about 1 month ago
Language | Python | Python
License | GNU General Public License v3.0 or later | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
DeepFilterNet
-
Anyone know of a good TTS pipeline for raw speech data?
You mean remove background noise and transcribe? Then you can use DeepFilterNet to remove noise, and Whisper to transcribe.
-
Open Source Libraries
Rikorose/DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering
- DeepFilterNet: Noise suppression using deep filtering
-
Linux Audio Noise suppression using deep filtering in Rust
It looks like the Rust library uses `tract-onnx` for inference: https://github.com/Rikorose/DeepFilterNet/blob/2a84d2a1750a5... I wonder whether using Python for research and training in big data centers, and Rust at the edge for efficient inference, will become a trend. Right now the larger inference community is in C++ (e.g. ggml), but Rust crates are a joy to use as components for building AI applications.
-
Real-Time Noise Suppression for PipeWire written in Rust
Repo: https://github.com/Rikorose/DeepFilterNet
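The denoise-then-transcribe suggestion in the first mention above can be sketched in Python. This is a minimal sketch, not either project's recommended workflow: it assumes the `deepfilternet` and `openai-whisper` packages are installed, uses the `df.enhance` helpers as shown in the DeepFilterNet README and Whisper's `load_model`/`transcribe` calls, and the function names `deepfilternet_denoise`, `whisper_transcribe`, and `denoise_then_transcribe` are hypothetical names chosen for illustration.

```python
from typing import Callable


def deepfilternet_denoise(in_path: str, out_path: str) -> None:
    """Denoise one audio file with DeepFilterNet's default model."""
    # Imported lazily so the pipeline can be used with other denoisers
    # when DeepFilterNet is not installed.
    from df.enhance import enhance, init_df, load_audio, save_audio

    model, df_state, _ = init_df()  # load the default pretrained model
    audio, _ = load_audio(in_path, sr=df_state.sr())  # resample to the model's rate
    enhanced = enhance(model, df_state, audio)
    save_audio(out_path, enhanced, df_state.sr())


def whisper_transcribe(path: str, model_name: str = "base") -> str:
    """Transcribe one audio file with Whisper and return the text."""
    import whisper  # lazy import, same reason as above

    model = whisper.load_model(model_name)
    return model.transcribe(path)["text"].strip()


def denoise_then_transcribe(
    in_path: str,
    out_path: str,
    denoise: Callable[[str, str], None] = deepfilternet_denoise,
    transcribe: Callable[[str], str] = whisper_transcribe,
) -> str:
    """Write a denoised copy of in_path to out_path, then transcribe that copy."""
    denoise(in_path, out_path)
    return transcribe(out_path)
```

Calling `denoise_then_transcribe("raw.wav", "clean.wav")` (hypothetical file names) would write the cleaned audio to `clean.wav` and return its transcript. Passing the two steps in as callables keeps each stage swappable, for example to try a different noise suppressor from the alternatives listed further down.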
audio-webui
-
Sub for AI voice models
I mean, just use gitmylo's repo.
-
What are some good tools for text2audio that I can run locally?
For pure voice and not autogeneration from the LLM you have stuff like: https://github.com/gitmylo/audio-webui
-
Open Source Libraries
gitmylo/audio-webui
-
Dedicated Riffusion Gradio training interface?
I was wondering if there might be some way to incorporate Riffusion and its various capabilities into this platform? I have made multiple attempts on my local server to combine the Automatic1111 SD-Web-UI extensions with the Audiocraft_Plus (https://github.com/GrandaddyShmax/audiocraft_plus) and Audio Web (https://github.com/gitmylo/audio-webui) UIs, but truth be told I am a total beginner and keep coming up short!
-
Any local voice models?
audio-webui is the Stable Diffusion of text-to-speech tools, but don't expect high-quality voice replication for a while. https://github.com/gitmylo/audio-webui
-
Best Tool for creating an AI celebrity voice clone?
You can try Audio-Webui if you're technically savvy. There are some voice cloning workflows as well as RVC, voice conversion.
-
Are there any AI resources to help create audiobooks from text to speech?
Have not tested but it looks like the audio-webui repo is ready for long texts (just click the COLAB link to test it). I would test it and then go tortoise if the quality is not as needed.
-
I found a youtube tutorial voiceover made by AI, and I'm blown away by its quality. Can you help me figure out which tool did the author use?
This is the best open source voice cloning. Super easy to install also.
-
How to change your voice to someone else's for a song? What are the best ways being used right now?
People use https://github.com/gitmylo/audio-webui and https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI for that. Check out this tutorial: https://www.youtube.com/watch?v=-JcvdDErkAU. With these tools it's possible to separate music or background noise from a voice and recombine them, either together or with other songs. It's amazing and fun.
-
What would be the Stable Diffusion equivalent, for AI music generation?
Check this out: https://github.com/gitmylo/audio-webui/wiki/Features
What are some alternatives?
NoiseTorch - Real-time microphone noise suppression on Linux.
tortoise-tts - A multi-voice TTS system trained with an emphasis on quality
noise-repellent - Lv2 suite of plugins for broadband noise reduction
TTS - 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
PiDTLN - Apply machine learning model DTLN for noise suppression and acoustic echo cancellation on Raspberry Pi
audiocraft_plus - Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
wenet - Production First and Production Ready End-to-End Speech Recognition Toolkit
bark - 🔊 Text-Prompted Generative Audio Model
rnnoise - Recurrent neural network for audio noise reduction
Retrieval-based-Voice-Conversion-WebUI - Easily train a good VC model with voice data <= 10 mins!
noise-suppression-for-voice - Noise suppression plugin based on Xiph's RNNoise
bark-voice-cloning-HuBERT-quantizer - The code for the bark-voicecloning model. Training and inference.