- https://github.com/kaldi-asr/kaldi
- https://github.com/espnet/espnet
- https://github.com/speechbrain/speechbrain
- https://github.com/NVIDIA/NeMo
- https://github.com/microsoft/UniSpeech
- https://github.com/BUTSpeechFIT/VBx
Have you tried https://github.com/scart97/thunder-speech? It's a smaller repo based on NeMo, but meant for more flexible experimentation, and it's compatible with Hugging Face's transformers library.