tango vs nuwa-pytorch
| | tango | nuwa-pytorch |
|---|---|---|
| Mentions | 2 | 1 |
| Stars | 923 | 533 |
| Growth | 6.4% | - |
| Activity | 8.7 | 0.0 |
| Last commit | 14 days ago | over 1 year ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
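The two metrics above can be illustrated with a rough Python sketch. The page does not publish its formulas, so everything here is an assumption made for illustration: the function names, the exponential decay with a 30-day half-life used to weight recent commits more heavily, and the mapping from percentile rank to a 0-10 score. The numbers in the usage note are made up and are not taken from either project.

```python
from datetime import date


def month_over_month_growth(stars_now, stars_month_ago):
    """Percentage change in star count compared with one month earlier."""
    if stars_month_ago <= 0:
        raise ValueError("need a positive baseline star count")
    return (stars_now - stars_month_ago) / stars_month_ago * 100.0


def weighted_commit_score(commit_dates, today, half_life_days=30.0):
    """Sum of per-commit weights; a commit's weight halves every half_life_days.

    The decay shape and the half-life are assumptions, chosen only to show
    how recent commits can count for more than older ones.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_from_rank(score, all_scores):
    """Map a raw score to 0-10 by the share of tracked projects scoring lower.

    Under this assumed mapping, 9.0 means the project scores higher than 90%
    of the tracked projects, i.e. it sits in the top 10%.
    """
    below = sum(1 for s in all_scores if s < score)
    return round(10.0 * below / len(all_scores), 1)
```

For instance, under these assumptions a project that went from 1000 to 1064 stars in a month would show a growth of about 6.4%, and a project whose raw score beats 90 out of 100 tracked projects would get an activity of 9.0.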
tango
[Research] [Project] Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
Found relevant code at https://github.com/declare-lab/tango
nuwa-pytorch
What are some alternatives?
audio-diffusion-pytorch - Audio generation using diffusion models, in PyTorch.
DALLE2-video - Direct application of DALLE-2 to video synthesis, using factored space-time Unet and Transformers
ai-text-to-audio-latent-diffusion - text-to-audio-latent-diffusion
PaLM-pytorch - Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
DALLE-pytorch - Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
word2wave - Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.
phenaki-pytorch - Implementation of Phenaki Video, which uses Mask GIT to produce text guided videos of up to 2 minutes in length, in Pytorch
flamingo-pytorch - Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch
ez-text2video - Easily run text-to-video diffusion with customized video length, fps, and dimensions on 4GB video cards or on CPU.