Top 23 Python Dask Projects
-
From what I've seen, there are roughly two paths. I'll give a well-known example of each.
1. Language-specific distributed task libraries
For example, in Python, Celery is a popular task queue. If you (the dev) are writing all the code and running the workflows yourself, it might work well for you. You write the core functions, and it handles processing and resource management with a little configuration.
* https://github.com/celery/celery
Or, lower level:
* https://github.com/dask/dask
2. DAG workflow systems
There are also whole systems for what you're describing. They've become especially popular in the MLOps and data engineering world. A common one is Airflow:
* https://github.com/apache/airflow
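To make the DAG idea concrete, here is a minimal sketch using only Python's standard library (`graphlib`). It is not the API of Celery, Dask, or Airflow. The task names (`extract`, `transform`, `load`), the `TASKS` mapping, and `run_pipeline` are all hypothetical, chosen just to show the core idea those systems build on: declare which tasks depend on which, then run each one after its upstream tasks have finished.

```python
from graphlib import TopologicalSorter

# Hypothetical three-step pipeline; task names and bodies are illustrative only.
def extract():
    return [1, 2, 3]

def transform(rows):
    return [r * 10 for r in rows]

def load(rows):
    return sum(rows)

# Each task name maps to (function, list of upstream task names).
TASKS = {
    "extract": (extract, []),
    "transform": (transform, ["extract"]),
    "load": (load, ["transform"]),
}

def run_pipeline(tasks):
    """Run every task once, after all of its upstream tasks have finished."""
    # TopologicalSorter expects {node: set of predecessors}.
    graph = {name: set(deps) for name, (_, deps) in tasks.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = tasks[name]
        # Pass each upstream result as a positional argument, in declared order.
        results[name] = func(*(results[d] for d in deps))
    return results

results = run_pipeline(TASKS)
```

A real workflow system adds scheduling, retries, distributed workers, and monitoring on top of this ordering step, which is most of the reason to adopt one rather than roll your own.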
-
Project mention: Stumpy: Python library to computing matrix profiles on timeseries | news.ycombinator.com | 2025-03-19
-
mars
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
-
swifter
A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner (by jmcarpenter2)
-
fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
-
Optimus
Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark (by ironmussa)
-
Project mention: Narwhals: Lightweight and extensible compatibility layer between dataframe libs | news.ycombinator.com | 2024-08-29
Python Dask related posts
-
Stumpy: Python library to computing matrix profiles on timeseries
-
State of Python 3.13 Performance: Free-Threading
-
Farewell Pandas, and thanks for all the fish
-
Powerful and scalable Python library for modern time series analysis
-
TDAmeritrade: Timeseries Analysis with Stumpy
-
Stumpy: Matrix profile time series analysis
-
Shuffling large data at constant memory in Dask
Index
What are some of the best open-source Dask projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Dask | 13,186 |
2 | stumpy | 3,912 |
3 | xarray | 3,788 |
4 | mars | 2,718 |
5 | swifter | 2,585 |
6 | fugue | 2,081 |
7 | distributed | 1,627 |
8 | Optimus | 1,511 |
9 | Eliot | 1,144 |
10 | mlforecast | 1,018 |
11 | narwhals | 984 |
12 | pystore | 577 |
13 | datacompy | 566 |
14 | dask-sql | 404 |
15 | nebari | 294 |
16 | stackstac | 256 |
17 | amazon-sagemaker-local-mode | 256 |
18 | aicsimageio | 213 |
19 | xgboost_ray | 148 |
20 | dask-awkward | 64 |
21 | bytehub | 60 |
22 | dask-memusage | 24 |
23 | steam-data-engineering | 24 |