Hifi-gan Alternatives
Similar projects and alternatives to hifi-gan
-
tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
-
tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
-
Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
-
diffwave
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
-
PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
-
vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
-
flowtron
Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
-
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
-
TensorFlowTTS
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
-
FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
-
mlp-singer
Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis (IEEE MLSP 2021)
-
tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
-
STYLER
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021 (by keonlee9420)
-
so-vits-svc-fork
so-vits-svc fork with realtime support, improved interface and more features.
-
Speech-Backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
hifi-gan reviews and mentions
- [D] What is the best open source text to speech model?
- I made Lisa-nee TTS (Imai Lisa)
- HiFi-GAN: Generative Adversarial Networks for Efficient and Hi-Fi Speech Synth
-
[2108.13320] Neural HMMs are all you need (for high-quality attention-free TTS)
It will be interesting to see if the artefacts you noticed persist once we've trained the model for longer and switch to a better vocoder such as HiFi-GAN. (The paper and audio examples use WaveGlow since that's the default of the repository we compared ourselves to.) That said, "choppiness" sounds to me like it might be related to the temporal evolution, in which case it's something that a non-causal, convolutional post-net might be able to smooth over.
-
The dangers of AI
Hey, as far as I know this paper is the current SoTA on public data that is open source. Github is here. If you are interested in really getting into speech synthesis, this page has everything (modern stuff on the bottom.)
Stats
jik876/hifi-gan is an open source project licensed under the MIT License, which is an OSI-approved license.
The primary programming language of hifi-gan is Python.