speechbrain vs imgaug
| | speechbrain | imgaug |
|---|---|---|
| Mentions | 26 | 7 |
| Stars | 7,836 | 14,117 |
| Growth | 6.8% | - |
| Activity | 9.8 | 0.0 |
| Last commit | 4 days ago | 13 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
speechbrain
- SpeechBrain 1.0: A free and open-source AI toolkit for all things speech
- FLaNK Stack Weekly 22 January 2024
- [D] Training ASR model using SpeechBrain
  You likely have a very broken sample in one of your batches. It looks like your training actually went through a few batches before it horked the error at you. A quick google shows a similar issue in the github repo: https://github.com/speechbrain/speechbrain/issues/649 .
- Whisper.cpp
  https://github.com/ggerganov/whisper.cpp https://speechbrain.github.io/
- [D] What is the best open source text to speech model?
  I don't know if it's the best, but Speechbrain is supposed to be state of the art.
- [D] What's stopping you from working on speech and voice?
  https://github.com/speechbrain/speechbrain
- Specific Voice recognition
- How to get high-quality, low-cost Speech-to-Text transcription?
- [D] Speech Enhancement SOTA
- Speaker diarization
imgaug
- How to label augmented images for training YOLO algorithm?
- Improve Your Deep Learning Models with Image Augmentation
  There are many good options when it comes to tools and libraries for implementing data augmentation into our deep learning pipeline. You could for instance do your own augmentations using NumPy or Pillow. Some of the most popular dedicated libraries for image augmentation include Albumentations, imgaug, and Augmentor. Both TensorFlow and PyTorch even come with their own packages dedicated to image augmentation.
- [N] Facebook AI Open Sources AugLy: A New Python Library For Data Augmentation To Develop Robust Machine Learning Models
  https://github.com/aleju/imgaug This one is way better for image.
- [UPDATE!] Recognize trinkets with Isaac Item Recognizer! And also a few useful features in my newest update.
  I have to improve my dataset with more backgrounds featuring obstacles. At the moment I'm working on creating a dataset with both items and trinkets, and I'm planning on using https://github.com/aleju/imgaug which will replace most of the stuff I'm doing with PIL.
- Support creation of tf.data.Dataset (data generator) and augmentation for image.
  Do you acknowledge that there is ImageDataGenerator and ImgAug?
- [P] Albumentations 1.0 is released (a Python library for image augmentation)
  Albumentations no longer uses the imgaug library by default. All previous imgaug augmentations in the library are reimplemented in Albumentations with the same API (but you can still install Albumentations with imgaug if you need the old augmentations).
- Bounding boxes do not completely wrap the objects with YOLOv4
  I would also recommend you to give a try to TensorFlow Object Detection Model - https://github.com/tensorflow/models/tree/master/research/object_detection with augmentation - https://github.com/aleju/imgaug pipeline. The same worked for me in a similar use case where I had to localise logo on documents.
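
Several of the mentions above touch on doing augmentations by hand with NumPy and on keeping labels in sync with augmented images for YOLO-style training. The following is a minimal sketch of that idea using only NumPy. It does not use imgaug or any other library named above, and the function names (`hflip_with_boxes`, `adjust_brightness`, `to_yolo`) are hypothetical helpers introduced here for illustration. It assumes `uint8` images shaped height x width x channels and boxes given as pixel corners `(x_min, y_min, x_max, y_max)`.

```python
import numpy as np


def hflip_with_boxes(image, boxes):
    """Flip an H x W x C image left-right and mirror its boxes.

    boxes: array-like of shape (N, 4) with (x_min, y_min, x_max, y_max) in pixels.
    """
    width = image.shape[1]
    flipped = image[:, ::-1].copy()  # reverse the column (x) axis
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    new_boxes = boxes.copy()
    # Mirroring maps x to (width - x), so the old right edge becomes the new left edge.
    new_boxes[:, 0] = width - boxes[:, 2]
    new_boxes[:, 2] = width - boxes[:, 0]
    return flipped, new_boxes


def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor`, clipping to the uint8 range."""
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)


def to_yolo(boxes, width, height):
    """Convert pixel corner boxes to normalized (x_center, y_center, w, h)."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    x_center = (boxes[:, 0] + boxes[:, 2]) / 2.0 / width
    y_center = (boxes[:, 1] + boxes[:, 3]) / 2.0 / height
    box_w = (boxes[:, 2] - boxes[:, 0]) / width
    box_h = (boxes[:, 3] - boxes[:, 1]) / height
    return np.stack([x_center, y_center, box_w, box_h], axis=1)


if __name__ == "__main__":
    image = np.zeros((4, 6, 3), dtype=np.uint8)
    image[:, 0] = 200  # mark the leftmost column so the flip is visible
    boxes = [[0, 1, 2, 3]]

    flipped, flipped_boxes = hflip_with_boxes(image, boxes)
    brighter = adjust_brightness(flipped, 1.5)
    print(flipped_boxes)
    print(to_yolo(flipped_boxes, width=6, height=4))
```

Only geometric transforms such as the flip change the labels. Photometric ones such as brightness leave the boxes untouched, which is why `adjust_brightness` takes no `boxes` argument. Dedicated libraries handle many more transforms and edge cases than this sketch does.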
What are some alternatives?
espnet - End-to-End Speech Processing Toolkit
albumentations - Fast image augmentation library and an easy-to-use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
pyannote-audio - Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
YOLO-Mosaic - Perform mosaic image augmentation on data for training a YOLO model
Resemblyzer - A python package to analyze and compare voices with deep learning
tensorflow - An Open Source Machine Learning Framework for Everyone
ukrainian-onnx-model - An ONNX model for speech recognition of the Ukrainian language
AugLy - A data augmentations library for audio, image, text, and video.
SincNet - SincNet is a neural architecture for efficiently processing raw audio samples.
tfaug - tensorflow easy image augmentation package
speech-to-text-benchmark - speech to text benchmark framework
autoalbument - AutoML for image augmentation. AutoAlbument uses the Faster AutoAugment algorithm to find optimal augmentation policies. Documentation - https://albumentations.ai/docs/autoalbument/