SpecVQGAN Alternatives
Similar projects and alternatives to SpecVQGAN
-
vid2cleantxt
Python API & command-line tool to easily transcribe speech-based video files into clean text
-
MoViNet-pytorch
MoViNets PyTorch implementation: Mobile Video Networks for Efficient Video Recognition
-
awesome-python-applications
💿 Free software that works great, and also happens to be open-source Python.
-
WOLOF-ASR-Wav2Vec2
Audio Preprocessing and finetuning of wav2vec2-large-xlsr model on AI4D Baamtu Datamation - Automatic Speech Recognition in WOLOF Data.
-
nn
🧑🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
-
dressing-in-order
(ICCV'21) Official code of "Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing" by Aiyu Cui, Daniel McKee and Svetlana Lazebnik
SpecVQGAN reviews and mentions
-
Text-to-Audio Generation Using Instruction Tuned LLM and Latent Diffusion Model
Excellent. Some of the theory here goes back to Oct/2021 and beyond [1].
The riffusion.com [2] guys made this practical. Also, my video of high-level overview and examples [3].
1. SpecVQGAN: https://github.com/v-iashin/SpecVQGAN
2. Riffusion: https://www.riffusion.com/
3. Riffusion high-level overview: https://youtu.be/olkLVGcvib8
- "Taming Visually Guided Sound Generation". Quickly generate audio matching a given video. Code includes a Google Colab.
Stats
v-iashin/SpecVQGAN is an open source project licensed under the MIT License, which is an OSI-approved license.
The primary programming language of SpecVQGAN is Jupyter Notebook.