I wanted to make a human-like reading feature for our language-learning software. Training a model isn't too hard using something like https://github.com/coqui-ai/TTS.
The weak link was the available free/open datasets. You needed a single speaker with a pleasant voice, 20hrs+ of material from varied sources, recorded in a good recording environment with a good mic, etc. For English, the default was LJSpeech, which doesn't fulfill all these requirements. I say 'was', as I haven't followed developments recently.
Last year we decided to make our own dataset with an Irish woman, Jenny. She has a soft Irish lilt.
Never got around to training the model, but I will upload the raw audio and prompts here in a few hours (need to pay my internet bill in town..):
https://github.com/dioco-group/jenny-tts-dataset/blob/main/R...
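A quick way to check a candidate dataset against the 20hrs+ rule of thumb above is to sum the clip durations. This is only a rough sketch: it assumes the recordings are uncompressed WAV files sitting under one directory, which may not match how any particular dataset is packaged, and `total_hours` is a hypothetical helper name.

```python
import wave
from pathlib import Path


def total_hours(wav_dir):
    """Sum the duration of every .wav file under wav_dir, in hours.

    Assumption: clips are plain PCM WAV files readable by the stdlib
    `wave` module; other formats (mp3, flac) would need another reader.
    """
    total_seconds = 0.0
    for path in sorted(Path(wav_dir).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # duration = number of frames / frames per second
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 3600.0
```

Calling `total_hours("path/to/wavs")` on a local copy gives a number to compare against the 20 hour mark; the path here is a placeholder.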
You'll need to install https://github.com/yt-dlp/yt-dlp#installation before you can use that. As you can see, the "script" just adds the options `-x` (extract audio) and `--audio-format mp3` (convert to mp3 at the end).
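For anyone who would rather call it from Python, here is a minimal sketch of the same idea. It is not the script referred to above; it only assumes the `yt-dlp` executable is on your PATH, and the function names `build_command` and `download_audio` are made up for illustration.

```python
import subprocess


def build_command(url, audio_format="mp3"):
    """Build the yt-dlp argument list: -x extracts the audio track,
    --audio-format converts it to the requested format afterwards."""
    return ["yt-dlp", "-x", "--audio-format", audio_format, url]


def download_audio(url, audio_format="mp3"):
    """Run yt-dlp for one URL; raises CalledProcessError if it fails.

    Assumption: yt-dlp is installed and reachable as `yt-dlp`.
    """
    subprocess.run(build_command(url, audio_format), check=True)
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with URLs that contain `&` or `?`.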
Related posts
- OpenAI deems its voice cloning tool too risky for general release
- What things are happening in ML that we can't hear over the din of LLMs?
- Doom Running on a Toothbrush
- Show HN: I create a free website for download YouTube transcript, subtitle
- Base TTS (Amazon): The largest text-to-speech model to-date