vid2cleantxt vs machinehearing
Metric | vid2cleantxt | machinehearing
---|---|---
Mentions | 1 | 2
Stars | 156 | 223
Growth | - | -
Activity | 0.0 | 6.8
Last commit | over 1 year ago | 1 day ago
Language | Jupyter Notebook | Jupyter Notebook
License | Apache License 2.0 | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
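The exact formula behind the activity number is not given here. As a rough illustration only, the sketch below assumes an exponential decay for commit weights and a simple percentile rank across tracked projects; the function names, the 30-day half-life, and the sample data are all hypothetical and are not the site's actual method.

```python
def recency_weighted_activity(commit_ages_days, half_life_days=30.0):
    """Raw score: each commit counts less the older it is.

    Assumption: a commit's weight halves every `half_life_days` days.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def relative_activity(raw_score, all_raw_scores):
    """Scale a raw score to 0-10 by the share of tracked projects below it.

    Under this assumed mapping, 9.0 means 90% of tracked projects score
    lower, i.e. the project sits in the top 10%.
    """
    below = sum(1 for s in all_raw_scores if s < raw_score)
    return round(10.0 * below / len(all_raw_scores), 1)


# Hypothetical data: commit ages in days for two made-up projects.
recent_project = recency_weighted_activity([1, 2, 3, 10])
stale_project = recency_weighted_activity([400, 450, 500, 600])
print(recent_project > stale_project)
```

With a mapping like this, a project whose last commit was over a year ago would land near 0.0 regardless of how many commits it once had, which matches the qualitative description above.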
vid2cleantxt
machinehearing
- Zimtohrli: A New Psychoacoustic Perceptual Metric for Audio Compression
  PEAQ/PESQ and ViSQOL are worth trying for that. In principle they operate as you suggest. I keep a short overview of audio quality methods/tools here: https://github.com/jonnor/machinehearing/blob/master/audio-q...
- [P] Mel Frequency Cepstral Coefficients Transformation
  I made a notebook that illustrates the distributions of MFCC values here: https://github.com/jonnor/machinehearing/blob/master/handson/quantized-mfcc/MFCC-Spectrogram-Shifts.ipynb
What are some alternatives?
SpecVQGAN - Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
hackmd - CodiMD - Realtime collaborative markdown notes on all platforms. [Moved to: https://github.com/hackmdio/codimd]
PipeWire-Guide - PipeWire Guide. Learn about how PipeWire gives your Linux system a Professional Audio/Video Processing workflow.
steerable-nafx - Steerable discovery of neural audio effects
distil-whisper - Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
AudioInsightsGenerator - Unlock AI power with AudioInsightsGenerator! From audio to summaries, emotion analysis, idea generation, narratives, and content filtering. Explore your audio's hidden dimensions!
web-whisper - OpenAI's Whisper Audio to text transcription right into your web browser! An open source AI subtitling suite.
SRMIST-B.Tech-ECE-Notes-2022-24 - Collection of all B.Tech ECE Notes for the academic year 2020-24.
WOLOF-ASR-Wav2Vec2 - Audio Preprocessing and finetuning of wav2vec2-large-xlsr model on AI4D Baamtu Datamation - Automatic Speech Recognition in WOLOF Data.
fibs-reporter - Automatically generate a pdf report containing feature importance, baseline modelling, spurious correlation detection, and more, from a single command line input for any given ML CSV file
silero-models - Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
cs231n - Note and Assignments for CS231n: Convolutional Neural Networks for Visual Recognition