mmf vs asteroid
Metric | mmf | asteroid
---|---|---
Mentions | 2 | 2
Stars | 5,417 | 2,111
Growth | 0.1% | 1.7%
Activity | 5.5 | 5.5
Last commit | 2 months ago | 24 days ago
Language | Python | Python
License | GNU General Public License v3.0 or later | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
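The definitions above can be made concrete with a small sketch. The exact formulas used to produce these numbers are not given here, so the code below is only one plausible reading: growth is taken as the percentage change in stars over one month, and the activity score as a percentile rank scaled to 0-10. The function names and the sample numbers are hypothetical.

```python
from typing import Sequence


def month_over_month_growth(stars_now: int, stars_month_ago: int) -> float:
    """Percentage change in stars relative to last month's count (assumed definition)."""
    if stars_month_ago <= 0:
        raise ValueError("stars_month_ago must be positive")
    return 100.0 * (stars_now - stars_month_ago) / stars_month_ago


def activity_score(raw: float, all_raw: Sequence[float]) -> float:
    """Scale a project's percentile rank to 0-10 (assumed definition).

    `raw` is some recency-weighted commit measure for one project and
    `all_raw` holds the same measure for every tracked project. A score
    of 9.0 means 90% of tracked projects have a lower raw value, i.e.
    the project is in the top 10%.
    """
    if not all_raw:
        raise ValueError("all_raw must not be empty")
    below = sum(1 for value in all_raw if value < raw)
    return 10.0 * below / len(all_raw)


if __name__ == "__main__":
    # Hypothetical project that went from 1,000 to 1,017 stars in a month.
    print(month_over_month_growth(1017, 1000))
    # Hypothetical pool of 100 projects with raw activity values 0..99.
    print(activity_score(90, list(range(100))))
```

Under this reading, two projects with the same activity score (such as 5.5 and 5.5) sit at roughly the same percentile of all tracked projects, regardless of their absolute commit counts.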
mmf
- Context in first comment
  mmf, which is a multimodal PyTorch framework by Facebook Research, was released around 2-3 years ago and is now poorly maintained.
- [N] TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
  How is this different from mmf? https://github.com/facebookresearch/mmf
asteroid
- Speech separation
- [D] Is it possible to extract certain sounds from a mixture of sounds and noise
  Here is a link to a tutorial that will teach you everything you need to know about source separation, from data, to losses, to commonly used model architectures. That tutorial is built around the nussl source separation library, but some other nice ones exist as well, such as asteroid.
What are some alternatives?
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
nussl - A flexible source separation library in Python
smgeo - Geolocation Inference for Reddit
Conv-TasNet - A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).
CapDec - CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
Wave-U-Net-for-Speech-Enhancement - Implement Wave-U-Net by PyTorch, and migrate it to the speech enhancement.
mayavoz - Pytorch based speech enhancement toolkit.
Speech-Separation-Paper-Tutorial - A must-read paper for speech separation based on neural networks
multimodal - TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
voicefilter - Unofficial PyTorch implementation of Google AI's VoiceFilter system
img2dataset - Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.