NeMo Alternatives

Similar projects and alternatives to NeMo

stable-diffusion

Mentions: 382 | Stars: 65,389 | Activity: 0.0 | Language: Jupyter Notebook

A latent text-to-image diffusion model

whisper

Mentions: 343 | Stars: 60,303 | Activity: 6.4 | Language: Python

Robust Speech Recognition via Large-Scale Weak Supervision
TTS

Mentions: 231 | Stars: 29,174 | Activity: 9.5 | Language: Python

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

whisper.cpp

Mentions: 187 | Stars: 30,942 | Activity: 9.8 | Language: C

Port of OpenAI's Whisper model in C/C++

tortoise-tts

Mentions: 144 | Stars: 11,755 | Activity: 8.2 | Language: Jupyter Notebook

A multi-voice TTS system trained with an emphasis on quality

DeepSpeech

Mentions: 67 | Stars: 24,278 | Activity: 0.0 | Language: C++

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

common-voice

Mentions: 66 | Stars: 3,247 | Activity: 10.0 | Language: TypeScript

Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
TTS

Mentions: 62 | Stars: 8,806 | Activity: 0.0 | Language: Jupyter Notebook

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) (by mozilla)

vosk-api

Mentions: 59 | Stars: 7,025 | Activity: 5.9 | Language: Jupyter Notebook

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

equinox

Mentions: 31 | Stars: 1,809 | Activity: 9.2 | Language: Python

Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/

tacotron2

Mentions: 28 | Stars: 4,890 | Activity: 0.0 | Language: Jupyter Notebook

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

speechbrain

Mentions: 26 | Stars: 7,869 | Activity: 9.8 | Language: Python

A PyTorch-based Speech Toolkit

common-voice-android

Mentions: 26 | Stars: 107 | Activity: 2.9 | Language: Kotlin

Repository of "CV Project" app. It's an unofficial app for Mozilla Common Voice, which permits you to contribute to this project via your device.

espnet

Mentions: 15 | Stars: 7,872 | Activity: 10.0 | Language: Python

End-to-End Speech Processing Toolkit

larynx

Mentions: 18 | Stars: 788 | Activity: 0.0 | Language: Python

Discontinued. End to end text to speech system using gruut and onnx

plaidml

Mentions: 14 | Stars: 4,574 | Activity: 5.4 | Language: C++

PlaidML is a framework for making deep learning work everywhere.

pyannote-audio

Mentions: 15 | Stars: 5,027 | Activity: 8.6 | Language: Jupyter Notebook

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

PaddleSpeech

Mentions: 6 | Stars: 10,120 | Activity: 7.6 | Language: Python

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Real-Time-Voice-Cloning

Mentions: 96 | Stars: 50,738 | Activity: 0.0 | Language: Python

Clone a voice in 5 seconds to generate arbitrary speech in real-time

STT

Mentions: 11 | Stars: 2,131 | Activity: 0.6 | Language: C++

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. A higher number therefore suggests a closer NeMo alternative or a more similar project.


NeMo reviews and mentions

Posts with mentions or reviews of NeMo. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-07-06.

[P] Making a TTS voice, HK-47 from Kotor using Tortoise (Ideally WaveRNN)
2 projects | /r/MachineLearning | 6 Jul 2023

I haven't tested WaveRNN, but of the open source options I know, the best is FastPitch. It's also easy to use; here is the tutorial for voice cloning.
[N] Huggingface/nvidia release open source GPT-2B trained on 1.1T tokens
1 project | /r/MachineLearning | 2 May 2023
[D] What is the best open source text to speech model?
15 projects | /r/MachineLearning | 13 Apr 2023
[D] JAX vs PyTorch in 2023
5 projects | /r/MachineLearning | 9 Mar 2023

Nowadays... bigger repos like https://github.com/NVIDIA/NeMo are all pytorch, lots of work also published by Meta and Microsoft is all torch. I check new work on GitHub all the time and I haven't seen a Tensorflow repo in years except one.
[D] What's stopping you from working on speech and voice?
7 projects | /r/MachineLearning | 30 Jan 2023

- https://github.com/NVIDIA/NeMo
Can I use PyTorch to build a fast capitalization recoverer?
1 project | /r/pytorch | 21 Nov 2022

Can’t you use the NeMo model and just strip the punctuation from the output again if you don’t want it? You can also fine-tune the model with capitalization only if you look at the examples: https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Punctuation_and_Capitalization.ipynb The capitalization and punctuation are annotated separately (U indicates that the word should be upper cased, and O indicates no capitalization). The model seems to be a token-level classifier, not seq-to-seq, so there should also be a way to get just the capitalization part, but you would have to look into the model as it’s not shown in the examples.
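The U/O labelling mentioned in that comment can be illustrated with a small sketch. It does not call NeMo at all: the function names `apply_capitalization_labels` and `strip_punctuation` are hypothetical, the labels are supplied by hand rather than predicted by a model, and it assumes that "U" means capitalizing the first letter of the word.

```python
import string


def apply_capitalization_labels(tokens, labels):
    """Apply one capitalization label per token.

    Assumption: "U" capitalizes the first letter, "O" leaves the token as is.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    restored = []
    for token, label in zip(tokens, labels):
        if label == "U":
            restored.append(token[:1].upper() + token[1:])
        elif label == "O":
            restored.append(token)
        else:
            raise ValueError(f"unknown capitalization label: {label!r}")
    return " ".join(restored)


def strip_punctuation(text):
    """Remove ASCII punctuation, e.g. from output that restored both."""
    return text.translate(str.maketrans("", "", string.punctuation))


if __name__ == "__main__":
    print(apply_capitalization_labels(["how", "are", "you"], ["U", "O", "O"]))
    print(strip_punctuation("Hello, world!"))
```

Because the labels are per token, the capitalization step is independent of any punctuation step, which is why keeping only one of the two outputs is plausible for a token-level classifier.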
I made a free transcription service powered by Whisper AI
8 projects | news.ycombinator.com | 18 Nov 2022

I think there's been talk to do speaker diarization with whisper-asr-webservice[0] which is also written in python and should be able to make use of goodies such as pyannote-audio, py-webrtcvad, etc.
Whisper is great but at the point we get to kludging various things together it starts to make more sense to use something like Nvidia NeMo[1] which was built with all of this in mind and more
[0] - https://github.com/ahmetoner/whisper-asr-webservice
[1] - https://github.com/NVIDIA/NeMo
Mozilla Common Voice - Korean Language is live - Help Build a Korean Corpus for Training AI/Navi/etc
2 projects | /r/Korean | 4 Nov 2022

[Common Voice email](mailto:[email protected]) || Common Voice || Korean Language Homepage || FAQs || Speaking Aloud and Reviewing Recordings || Sentence Collector || NVidia/NeMo
Whisper – open source speech recognition by OpenAI
22 projects | news.ycombinator.com | 21 Sep 2022
Using Edge Biometrics For Better AI Security System Development
3 projects | dev.to | 31 Aug 2022

The final security grain was added with speech-to-text anti-spoofing built on QuartzNet from the NeMo framework. This model provides a decent quality user experience and is suitable for real-time scenarios. Measuring how close what the person says is to what the system expects requires calculating the Levenshtein distance between them.
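As a rough illustration of that last step, the Levenshtein distance between a transcript and an expected phrase can be computed with the standard dynamic-programming recurrence. This is a minimal sketch, not the article's implementation: the helper `phrase_matches` and its `max_distance` threshold are hypothetical, and the transcript is assumed to be a plain string already produced by a speech-to-text model.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + substitution_cost,  # substitution
                )
            )
        previous = current
    return previous[-1]


def phrase_matches(spoken: str, expected: str, max_distance: int = 2) -> bool:
    """Hypothetical check: accept the transcript if it is within
    `max_distance` edits of the expected phrase (case-insensitive)."""
    return levenshtein(spoken.lower(), expected.lower()) <= max_distance


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(phrase_matches("open the dor", "Open the door"))
```

A small non-zero threshold tolerates minor transcription errors; the right value would depend on phrase length and the accuracy of the speech model.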

Stats

Basic NeMo repo stats

Mentions

Stars: 10,021

Activity: 9.8

Last Commit: 5 days ago

NVIDIA/NeMo is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.

The primary programming language of NeMo is Python.
