Metric | asteroid | mmf
---|---|---
Mentions | 2 | 2
Stars | 2,118 | 5,417
Growth | 2.0% | 0.1%
Activity | 5.5 | 5.5
Last commit | 28 days ago | 2 months ago
Language | Python | Python
License | MIT License | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
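As a rough illustration of how metrics like these could be derived, the Python sketch below computes month-over-month star growth and a percentile-based activity score. The exponential recency weighting, the 30-day half-life, and every function name are assumptions made for this example. The formula actually used for the tracked projects is not stated here.

```python
from datetime import date


def month_over_month_growth(stars_now, stars_month_ago):
    """Percentage change in stars relative to the count one month earlier."""
    if stars_month_ago <= 0:
        raise ValueError("stars_month_ago must be positive")
    return (stars_now - stars_month_ago) * 100.0 / stars_month_ago


def recency_weighted_commits(commit_dates, today, half_life_days=30.0):
    """Sum of commits where each commit's weight halves every half_life_days.

    Assumed weighting scheme: a commit made today counts 1.0, a commit made
    one half-life ago counts 0.5, and so on.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_score(raw, all_raw):
    """Map a raw activity value to a 0-10 score by percentile rank.

    A project that beats 90% of the tracked projects scores 9.0, which
    matches the "top 10%" reading of an activity of 9.0.
    """
    scores = list(all_raw)
    below = sum(1 for r in scores if r < raw)
    return round(10.0 * below / len(scores), 1)


if __name__ == "__main__":
    # Hypothetical numbers, not taken from any real project.
    print(month_over_month_growth(102, 100))        # 2.0
    print(activity_score(10, list(range(1, 11))))   # 9.0
```

Under these assumptions the score is relative: it depends only on how a project ranks against the others being tracked, not on its absolute commit count.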
asteroid
- Speech separation
- [D] Is it possible to extract certain sounds from a mixture of sounds and noise
  Here is a link to a tutorial that will teach you everything you need to know about source separation, from data, to losses, to commonly used model architectures. That tutorial is built around the nussl source separation library, but some other nice ones exist as well, such as asteroid.
mmf
- Context in first comment
  mmf, which is a multimodal PyTorch framework by Facebook Research, was released around 2-3 years ago and is now poorly maintained.
- [N] TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
  How is this different from mmf? https://github.com/facebookresearch/mmf
What are some alternatives?
nussl - A flexible source separation library in Python
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Conv-TasNet - A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).
smgeo - Geolocation Inference for Reddit
Wave-U-Net-for-Speech-Enhancement - A PyTorch implementation of Wave-U-Net, adapted to speech enhancement.
CapDec - CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
Speech-Separation-Paper-Tutorial - A must-read paper for speech separation based on neural networks
mayavoz - PyTorch-based speech enhancement toolkit.
voicefilter - Unofficial PyTorch implementation of Google AI's VoiceFilter system
multimodal - TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
img2dataset - Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.