Espnet Alternatives

Similar projects and alternatives to espnet

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better espnet alternative or higher similarity.

espnet reviews and mentions

Posts with mentions or reviews of espnet. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-04-26.
  • [P] TorToiSe - a true zero-shot multi-voice TTS engine
    3 projects | reddit.com/r/MachineLearning | 26 Apr 2022
    CMU WavLab has ESPNet https://espnet.github.io/espnet/ which includes a number of high quality TTS models including VITS (which in my subjective experience is just as good as what is demonstrated here). Also the inference on various ESPNet pretrained TTS models is reasonable and sentences take on average 5 seconds per word to generate the waveform on my totally mid PC setup.
  • Help picking a good speech recognition library
    3 projects | reddit.com/r/learnpython | 1 Dec 2021
    https://github.com/espnet/espnet (kind of like a newer Kaldi, but also not beginner friendly)
  • speechbrain VS espnet - a user suggested alternative
    2 projects | 13 Oct 2021
    Both provide e2e ASR support, but espnet has more utilities, whereas speechbrain is clean.
  • Need help with training ASR model from scratch.
    3 projects | reddit.com/r/speechtech | 26 Mar 2021
    This is a relatively small amount of speech for training a model from scratch, but you can initialize training from another pre-trained model. There are a number of end-to-end ASR toolkits that can be used for this: https://github.com/NVIDIA/NeMo and https://github.com/espnet/espnet
    3 projects | reddit.com/r/speechtech | 26 Mar 2021
    You actually don't need phone-level alignment for your data. Both hybrid and end-to-end approaches can work with utterance-level alignment. For the hybrid approach, you would need a lexicon that maps each unique word in your training transcriptions to its phone sequence. You can obtain this with CMU's tool. For the end-to-end approach, you will need a byte pair encoder to tokenize the words in the transcriptions into subwords.
  • Is there a python based speaker diarization system you would recommend?
    2 projects | reddit.com/r/LanguageTechnology | 14 Mar 2021
    Have a look at this PR at ESPnet. It might be useful.
  • What are some good speech recognition papers I can implement?
    3 projects | reddit.com/r/MLQuestions | 1 Feb 2021
    espnet
  • Downpour: DRM Free Audiobooks
    4 projects | news.ycombinator.com | 17 Jan 2021
    I really like Google text to speech and use it for my own custom audiobooks. I've tried Google's, Microsoft's, IBM's, and a few other research ones. IBM's sounds slightly better but has a much more restricted free monthly tier; Google's and Microsoft's have 1 million free characters per month, which goes pretty far.

    Like others are saying, it's slightly robotic, but I've started to listen to a ton via TTS and you definitely get used to it (you even start to hear inflection in it, which is cool). I use the android smart audiobook app, and you can control the sound levels; turning down the high-pitched aspects also helps make it easier to listen to for longer periods of time.

    For HN folks, there are some pretty reasonable research projects, especially by nvidia (glownet), which you can run yourself. They sound relatively similar, but the training voices are much more restricted and not as good. If anyone knows of a github/etc with a nicer DIY TTS, I'd be interested. The best customizable one I've seen is https://github.com/espnet/espnet, but I had trouble getting it to work, and then getting it to sound OK.

    (For anyone else going DIY, I'll warn you that the failure modes for TTS are some unnerving, frankly creepy sounds. Google's TTS fails very well, even for strange words, and when it gets very confused it spells the word out. Some of the research ones go into haunting unrelated syllables, sometimes repeating for tens of seconds.)


Stats

Basic espnet repo stats
Mentions: 10
Stars: 5,486
Activity: 9.9
Last commit: 7 days ago

espnet/espnet is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.
