Top 8 Jupyter Notebook Audio Projects

awesome-python-applications

3 16,200 6.4 Jupyter Notebook

💿 Free software that works great, and also happens to be open-source Python.

digital_video_introduction

8 15,095 6.2 Jupyter Notebook

A hands-on introduction to video technology: image, video, codec (av1, vp9, h265) and more (ffmpeg encoding). Translations: 🇺🇸 🇨🇳 🇯🇵 🇮🇹 🇰🇷 🇷🇺 🇧🇷 🇪🇸

Project mention: Breakdown of AV1 Video Codec | news.ycombinator.com | 2023-12-25

There's a great introduction to video tech, including codecs, at https://github.com/leandromoreira/digital_video_introduction

ast

1 995 2.3 Jupyter Notebook

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer". (by YuanGongND)

SpecVQGAN

2 318 2.2 Jupyter Notebook

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

Project mention: Text-to-Audio Generation Using Instruction Tuned LLM and Latent Diffusion Model | news.ycombinator.com | 2023-04-28

Excellent. Some of the theory here goes back to Oct/2021 and beyond [1].
The riffusion.com [2] guys made this practical. Also, my video of high-level overview and examples [3].
1. SpecVQGAN: https://github.com/v-iashin/SpecVQGAN
2. Riffusion: https://www.riffusion.com/
3. Riffusion high-level overview: https://youtu.be/olkLVGcvib8

sudo_rm_rf

1 298 0.0 Jupyter Notebook

Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features which enables a more efficient way of separating sources from mixtures.

BMT

1 220 2.9 Jupyter Notebook

Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)

vid2cleantxt

1 156 0.0 Jupyter Notebook

Python API & command-line tool to easily transcribe speech-based video files into clean text

WOLOF-ASR-Wav2Vec2

2 12 0.0 Jupyter Notebook

Audio Preprocessing and finetuning of wav2vec2-large-xlsr model on AI4D Baamtu Datamation - Automatic Speech Recognition in WOLOF Data.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).