Top 4 Melgan Open-Source Projects
- TTS
  Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) (by mozilla)
- TensorFlowTTS
  TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
Project mention: Ask HN: Open-source, local Text-to-Speech (TTS) generators | news.ycombinator.com | 2024-05-07
I just noticed that https://coqui.ai/ is "Shutting down".
I'm building a web app (React / Django) which takes a list of affirmations & goals (in Markdown files), puts them into a database (SQLite), and uses voice synthesis to create voice audio files of the phrases. These are combined with a relaxed backing track (ffmpeg), made into playlists of 10-20 phrases (randomly sampled, or according to a theme: "mind", "body", "soul") and then played automatically in the morning & evening (cron). This allows you to persistently hear & vocalize your own goals & good vibes over time.
I had been planning to use Coqui TTS as the local text-to-speech engine, but with this cancellation, I'd love to hear from the community what is a great open-source, local text-to-speech engine?
Generally, I learn both the highest quality commercially available technology (example: ElevenLabs), and also the best open-source equivalent. Would love to hear suggestions & perspectives on this. What voice synth tools are you investing your time into learning & building with?
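The playlist step described in the post above (phrases stored in SQLite, then sampled randomly or by theme) could be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: the `phrases` table layout and the `build_db` and `sample_playlist` helpers are hypothetical names chosen for this example, and the voice synthesis, ffmpeg mixing, and cron scheduling steps are left out.

```python
import random
import sqlite3


def build_db(phrases):
    """Create an in-memory SQLite table from (theme, text) pairs.

    The schema is an assumption made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE phrases (id INTEGER PRIMARY KEY, theme TEXT, text TEXT)"
    )
    conn.executemany("INSERT INTO phrases (theme, text) VALUES (?, ?)", phrases)
    conn.commit()
    return conn


def sample_playlist(conn, size, theme=None, seed=None):
    """Return up to `size` phrase texts, optionally limited to one theme."""
    if theme is None:
        rows = conn.execute("SELECT text FROM phrases ORDER BY id").fetchall()
    else:
        rows = conn.execute(
            "SELECT text FROM phrases WHERE theme = ? ORDER BY id", (theme,)
        ).fetchall()
    texts = [row[0] for row in rows]
    # Sample without replacement; a fixed seed makes the choice repeatable.
    rng = random.Random(seed)
    return rng.sample(texts, min(size, len(texts)))


if __name__ == "__main__":
    conn = build_db(
        [
            ("mind", "I focus on one task at a time."),
            ("body", "I move every day."),
            ("soul", "I am grateful for today."),
            ("mind", "I learn something new each week."),
        ]
    )
    print(sample_playlist(conn, size=10, theme="mind"))
```

Each sampled phrase would then be handed to whichever local TTS engine is chosen, and the resulting audio files mixed with the backing track.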
Coqui-ai was a commercial continuation of Mozilla TTS and STT (https://github.com/mozilla/TTS).
At the time (2018-ish), it was really impressive for on-device voice synthesis (with a quality approaching the Google and Azure cloud-based voice synthesis options) and open source, so a lot of people in the FOSS community were hoping it could be used for a privacy-respecting home assistant, Linux speech synthesis that doesn't suck, etc.
After Mozilla abandoned the project, Coqui continued development and had some really impressive one-shot voice cloning, but pivoted to marketing speech synthesis for game developers. They were probably having trouble monetizing it, and it doesn't surprise me that they shut down.
An equivalent project that's still in active development and doing really well is Piper TTS (https://github.com/rhasspy/piper).
Hey HN, has anyone found a viable solution for doing this locally and offline on iOS? I'd like to offer a privacy-friendly text-to-speech feature in my app, and Apple's speech synthesis sounds awful compared to some newer models and TTS engines. The only thing I've found is an older TensorFlowTTS example here: https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/ios
Any pointers or tips appreciated.
Melgan related posts
- Ask HN: On-Device Text to Speech
- TTS mobile help
- A Working TTS feature has been found (No Google Services Required)
- Free library for text-to-speech
- Reviving the 1973 Unix text to voice translator
Index
What are some of the best open-source Melgan projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | TTS | 29,631 |
2 | TTS | 8,821 |
3 | TensorFlowTTS | 3,710 |
4 | SpecVQGAN | 318 |