| | SincNet | SALMONN |
|---|---|---|
| Mentions | 3 | 2 |
| Stars | 1,097 | 803 |
| Growth | - | 5.1% |
| Activity | 0.0 | 7.4 |
| Last commit | about 3 years ago | 24 days ago |
| Language | Python | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
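The growth and activity metrics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the site does not publish its formulas, so the exponential half-life weighting, the `half_life_days` parameter, and all function names below are hypothetical. Only the ideas that recent commits weigh more and that 9.0 corresponds to the top 10% come from the text.

```python
from bisect import bisect_left


def monthly_growth(stars_now, stars_month_ago):
    """Month-over-month growth in stars, as a percentage."""
    if stars_month_ago <= 0:
        raise ValueError("need a positive star count for the previous month")
    return 100.0 * (stars_now - stars_month_ago) / stars_month_ago


def weighted_commit_score(commit_ages_days, half_life_days=30.0):
    """Raw activity score: each commit counts less the older it is.

    Assumption: a commit's weight halves every `half_life_days`.
    The real weighting scheme is not documented in the source.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity(score, all_scores):
    """Map a raw score to a 0-10 scale by percentile rank.

    The result is 10 times the fraction of tracked projects whose raw
    score is strictly lower, so 9.0 means 90% of projects are less active.
    """
    ranked = sorted(all_scores)
    below = bisect_left(ranked, score)  # projects with a strictly lower score
    return round(10.0 * below / len(ranked), 1)
```

With ten tracked projects whose raw scores are 0 through 9, `activity(9, range(10))` gives 9.0, because nine of the ten projects score lower. A project with no commits at all gets a raw score of 0 under this sketch.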
SincNet
- Does this SincNet (neural architecture) contain a discriminator?
- TypeError: layer_norm(): argument 'input' (position 1) must be Tensor, not SincNet.
  The SincNet class is taken from here: https://github.com/mravanelli/SincNet/blob/master/dnn_models.py
- [R][P] Announcing audax, a audio ML/DL framework in Jax
  Code for https://arxiv.org/abs/1808.00158 found: https://github.com/mravanelli/SincNet
SALMONN
- Comparing Humans, GPT-4, and GPT-4V on Abstraction and Reasoning Tasks
  > In other words, if you express a problem in a more complicated space (e.g. a visual problem, or an abstract algebra problem), you will not be able to solve it in the smaller token space, there's not enough information

  You're aware multimodal transformers do exactly this?
  https://github.com/bytedance/SALMONN
- Salmonn: Speech Audio Language Music Open Neural Network
What are some alternatives?
pyannote-audio - Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
gensound - Pythonic audio processing and generation framework
speechbrain - A PyTorch-based Speech Toolkit
thunder-speech - A Hackable speech recognition library.
cnnimageretrieval-pytorch - CNN Image Retrieval in PyTorch: Training and evaluating CNNs for Image Retrieval in PyTorch
lhotse - Tools for handling speech data in machine learning projects.
Image-Forgery-Detection-CNN - Image forgery detection using convolutional neural networks. Group 10's final project for TU Delft's course CS4180 Deep Learning 2019.
ruptures - ruptures: change point detection in Python
UHV-OTS-Speech - A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
stereo-image-generation - This repository contains code to generate stereo (Side by side) image from a single image.
whisper-timestamped - Multilingual Automatic Speech Recognition with word-level timestamps and confidence