RetinaFace: Deep Face Detection Library for Python
For face detection, we used the RetinaFace model with a MobileNet backbone from the InsightFace project. For each face detected in an image, this model outputs four bounding-box coordinates as well as five facial landmarks. Images captured at different angles or with different optics can change the proportions of the face due to distortion, which may cause the model to struggle to identify the person.
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
The previous model, based on the Jasper architecture, was not able to verify recordings of the same person taken with different microphones. We solved this problem by switching to the ECAPA-TDNN architecture from the SpeechBrain framework, trained on the VoxCeleb2 dataset, which did a better job of verifying employees.
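The excerpt does not say how two recordings are compared once the model has produced speaker embeddings. A common approach is to score a pair of embeddings with cosine similarity and accept the pair above a threshold. The sketch below assumes that approach; the function names and the 0.25 threshold are hypothetical and would need tuning on real data.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_u * norm_v)


def same_speaker(enrolled: Sequence[float],
                 probe: Sequence[float],
                 threshold: float = 0.25) -> bool:
    """Accept the probe if it is similar enough to the enrolled embedding.

    The threshold is a hypothetical value, not one taken from the source.
    """
    return cosine_similarity(enrolled, probe) >= threshold
```

In practice the embeddings would come from the speaker model, and the threshold would be chosen to balance false accepts against false rejects.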
NeMo: a toolkit for conversational AI
The final security layer was added with speech-to-text anti-spoofing built on QuartzNet from the NeMo framework. This model provides a decent-quality user experience and is suitable for real-time scenarios. Measuring how close what the person says is to what the system expects requires calculating the Levenshtein distance between the two.
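To illustrate the Levenshtein check, here is a minimal sketch in plain Python. It computes the edit distance between the expected phrase and the transcript, then accepts the attempt if the distance is small relative to the phrase length. The `phrase_matches` helper, the lowercase and whitespace normalization, and the 0.2 ratio are assumptions made for illustration, not details from the source.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the working row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def phrase_matches(expected: str, transcript: str,
                   max_ratio: float = 0.2) -> bool:
    """Hypothetical acceptance rule: the edit distance, divided by the
    length of the expected phrase, must not exceed max_ratio."""
    expected_norm = " ".join(expected.lower().split())
    transcript_norm = " ".join(transcript.lower().split())
    if not expected_norm:
        return not transcript_norm
    distance = levenshtein(expected_norm, transcript_norm)
    return distance / len(expected_norm) <= max_ratio
```

Normalizing by phrase length lets one tolerance value work for both short and long prompts. A small tolerance also absorbs minor transcription errors from the speech-to-text model.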
Can I use PyTorch to build a fast capitalization recoverer?
1 project | reddit.com/r/pytorch | 21 Nov 2022
Some tips for uploading full episodes to YouTube
1 project | reddit.com/r/podcasting | 9 Aug 2022
[P] Yandex open sources 100b large language model weights (YaLM)
2 projects | reddit.com/r/MachineLearning | 23 Jun 2022
Yet Another Voice Activity Detection Engine
1 project | reddit.com/r/speechrecognition | 27 Oct 2021
AttributeError: module 'nemo.collections' has no attribute 'nlp'
1 project | reddit.com/r/speechrecognition | 18 Oct 2021