AugLy vs awkward

AugLy

A data augmentations library for audio, image, text, and video. (by facebookresearch)

ai.facebook.com

awkward

Manipulate JSON-like data with NumPy-like idioms. (by scikit-hep)

Topics: JSON, Numpy, Data Analysis, jagged-array, ragged-array, columnar-format, Pandas, numba, apache-arrow, cern-root, scikit-hep, Python, rdataframe

awkward-array.org

Project	AugLy	awkward
Mentions	14	4
Stars	4,898	792
Growth	0.5%	2.3%
Activity	6.0	9.6
Latest Commit	28 days ago	7 days ago
Language	Python	Python
License	GNU General Public License v3.0 or later	BSD 3-clause "New" or "Revised" License

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

AugLy

Posts with mentions or reviews of AugLy. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-12-21.

Meta's A.I. exodus: Top talent quits as lab tries to keep pace with rivals
1 project | news.ycombinator.com | 1 Apr 2022

Their recent effort to generate training data for spotting stuff that includes unsanctioned narratives comes to mind. https://github.com/facebookresearch/AugLy

Next steps for after classification
1 project | /r/LanguageTechnology | 1 Jan 2022

Data augmentation is usually helpful: https://github.com/facebookresearch/AugLy

The hand-picked selection of the best Python libraries released in 2021
12 projects | /r/Python | 21 Dec 2021

AugLy.

Prefer volume or quality for BERT-based Text classification model
2 projects | /r/LanguageTechnology | 13 Dec 2021

Augly - An augmentation library for audio, image, video, and text from facebook
1 project | /r/DataCentricAI | 6 Dec 2021

[D] What's the best method to generate synthetic data for an image with text? Small dataset
3 projects | /r/MachineLearning | 13 Aug 2021

AugLy is open source now.
1 project | /r/technews | 28 Jun 2021

Facebook is open-sourcing AugLy, a library that uses data augmentations to evaluate and improve ML models
1 project | /r/neuralnetworks | 23 Jun 2021

Integration test: Complexity of privacy-preserving bird call bio-sensor for distributed ecological monitoring?
5 projects | /r/SingularityNet | 23 Jun 2021

Some of the technologies which could be integrated include differential privacy, distributed online machine learning, misinformation resilience and multi-party computation, all within the context of smart contracts and bioinformatics.

[N] Facebook AI Open Sources AugLy: A New Python Library For Data Augmentation To Develop Robust Machine Learning Models
5 projects | /r/MachineLearning | 19 Jun 2021

Facebook Blog: https://ai.facebook.com/blog/augly-a-new-data-augmentation-library-to-help-build-more-robust-ai-models/

awkward

Posts with mentions or reviews of awkward. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-07-03.

Efficient Jagged Arrays
2 projects | news.ycombinator.com | 3 Jul 2023

There's a whole ecosystem in Python originally developed for high-energy physics data processing: https://github.com/scikit-hep/awkward, all because NumPy demands square N-dimensional arrays.
The same technique is used everywhere; here's a simple Julia package for the same thing: https://github.com/JuliaArrays/ArraysOfArrays.jl/blob/3a6f5b...
But Julia at least has the decency to support a ragged Vector{Vector} out of the box, and it's not that slow.
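
The "same technique" these comments refer to is commonly a flat content buffer paired with an offsets array that marks where each variable-length row starts and ends. The following is a minimal pure-Python sketch of that layout, assuming this is the representation meant. The function names (`to_offsets`, `row`, `row_lengths`) are hypothetical and are not part of Awkward Array or ArraysOfArrays.jl.

```python
from typing import List, Sequence, Tuple


def to_offsets(nested: Sequence[Sequence[float]]) -> Tuple[List[float], List[int]]:
    """Flatten a list of variable-length rows into (content, offsets).

    Hypothetical helper for illustration only; not a library API.
    """
    content: List[float] = []
    offsets: List[int] = [0]
    for r in nested:
        content.extend(r)
        # Each offset records where the next row begins in the flat buffer.
        offsets.append(len(content))
    return content, offsets


def row(content: List[float], offsets: List[int], i: int) -> List[float]:
    """Recover row i as a slice of the flat buffer."""
    return content[offsets[i]:offsets[i + 1]]


def row_lengths(offsets: List[int]) -> List[int]:
    """Row lengths are differences between consecutive offsets."""
    return [offsets[i + 1] - offsets[i] for i in range(len(offsets) - 1)]


content, offsets = to_offsets([[1.1, 2.2, 3.3], [], [4.4, 5.5]])
```

Here `content` holds all five values contiguously and `offsets` has one more entry than there are rows, so an empty row costs only a repeated offset rather than padding.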

The hand-picked selection of the best Python libraries released in 2021
12 projects | /r/Python | 21 Dec 2021

Awkward Array.

Awkward: Nested, jagged, differentiable, mixed type, GPU-enabled, JIT'd NumPy
5 projects | news.ycombinator.com | 16 Dec 2021

Numba's @vectorize decorator (https://numba.pydata.org/numba-doc/latest/user/vectorize.htm...) makes a ufunc, and Awkward Array knows how to implicitly map ufuncs. (It is necessary to specify the signature in the @vectorize argument; otherwise, it won't be a true ufunc and Awkward won't recognize it.)
When Numba's JIT encounters a ctypes function, it goes to the ABI source and inserts a function pointer in the LLVM IR that it's generating. Unfortunately, that means that there is function-pointer indirection on each call, and whether that matters depends on how long-running the function is. If you mean that your assembly function is 0.1 ns per call or something, then yes, that function-pointer indirection is going to be the bottleneck. If you mean that your assembly function is 1 μs per call and that's fast, given what it does, then I think it would be alright.
If you need to remove the function-pointer indirection and still run on Awkward Arrays, there are other things we can do, but they're more involved. Ping me in a GitHub Issue or Discussion on https://github.com/scikit-hep/awkward-1.0
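
To illustrate why an elementwise function can be mapped over a jagged array without caring about its nesting, here is a rough pure-Python sketch. It assumes a flat-content-plus-offsets layout and is not Awkward Array's or Numba's actual implementation. The names `map_elementwise` and `to_lists` are hypothetical.

```python
import math
from typing import Callable, List, Tuple

# (content, offsets): flat values plus row boundaries. Assumed layout for illustration.
Jagged = Tuple[List[float], List[int]]


def map_elementwise(func: Callable[[float], float], jagged: Jagged) -> Jagged:
    """Apply func to every value in the flat buffer and reuse the offsets.

    The nesting structure is untouched because an elementwise function
    never changes how many values each row has.
    """
    content, offsets = jagged
    return [func(x) for x in content], list(offsets)


def to_lists(jagged: Jagged) -> List[List[float]]:
    """Rebuild nested Python lists from (content, offsets)."""
    content, offsets = jagged
    return [content[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]


data: Jagged = ([1.0, 4.0, 9.0, 16.0], [0, 2, 2, 4])
roots = map_elementwise(math.sqrt, data)
```

In this sketch the per-element call is an ordinary Python call. The overhead discussed in the comment above concerns the analogous per-call cost inside JIT-compiled code, which this example does not measure.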

What are some alternatives?

When comparing AugLy and awkward you can also consider the following projects:

imgaug - Image augmentation for machine learning experiments.

sqlmodel - SQL databases in Python, designed for simplicity, compatibility, and robustness.

speechbrain - A PyTorch-based Speech Toolkit

DearPyGui - Dear PyGui: A fast and powerful Graphical User Interface Toolkit for Python with minimal dependencies

PySyft - Perform data science on data that remains in someone else's server

uproot5 - ROOT I/O in pure Python and NumPy.

BlenderProc - A procedural Blender pipeline for photorealistic training image generation

django-ninja - 💨 Fast, Async-ready, Openapi, type hints based framework for building APIs

Activeloop Hub - Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake]

numba-dpex - Data Parallel Extension for Numba

evidently - Evaluate and monitor ML models from validation to production. Join our Discord: https://discord.com/invite/xZjKRaNp8b

skweak - skweak: A software toolkit for weak supervision applied to NLP tasks
