|over 4 years ago||6 months ago|
|Apache License 2.0||GNU General Public License v3.0 or later|
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
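The exact formula behind the activity number is not given here, so the following is only a minimal sketch of how such a score could be computed. It assumes an exponential decay with a 90-day half-life for the commit weights and a percentile rank scaled to 0-10 for the relative part. The function names, the half-life, and the input shape are all hypothetical.

```python
from bisect import bisect_left


def weighted_commit_score(commit_ages_days, half_life_days=90.0):
    """Sum per-commit weights so that recent commits count more than old ones.

    Assumption: a commit loses half of its weight every `half_life_days`.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_scores(projects, half_life_days=90.0):
    """Turn raw weighted scores into a relative 0-10 activity number.

    `projects` maps a project name to a list of commit ages in days.
    A project's activity is 10 times the fraction of tracked projects
    whose raw score is strictly lower, so 9.0 means "top 10%".
    """
    raw = {
        name: weighted_commit_score(ages, half_life_days)
        for name, ages in projects.items()
    }
    ordered = sorted(raw.values())
    total = len(ordered)
    return {
        name: round(10.0 * bisect_left(ordered, value) / total, 1)
        for name, value in raw.items()
    }
```

Under these assumptions, with ten tracked projects the one with the highest weighted commit score gets an activity of 9.0, which matches the "top 10%" reading above.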
We haven't tracked posts mentioning dumbo yet.
Tracking mentions began in Dec 2020.
What are some alternatives?
mrjob - Run MapReduce jobs on Hadoop or Amazon Web Services
Apache Spark - A unified analytics engine for large-scale data processing
manjaro-linux - Shell scripts for setting up Manjaro Linux for Python programming and deep learning
BirdNET - Soundscape analysis with BirdNET.
streamparse - Run Python in Apache Storm topologies. Pythonic API, CLI tooling, and a topology DSL.
dpark - Python clone of Spark, a MapReduce-like framework in Python
luigi - Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask without any rewrites.
pandas_flavor - The easy way to write your own flavor of Pandas
PMapper - A tool for quickly evaluating IAM permissions in AWS.
einops - Deep learning operations reinvented (for pytorch, tensorflow, jax and others)
kmodes - Python implementations of the k-modes and k-prototypes clustering algorithms, for clustering categorical data