Top 23 Python Dask Projects
-
PyTorch and JAX are used heavily in climate science on the ML side. For more general analytics, not so much. Many of our users like to use Xarray as a high-level API. There has been some work to integrate Xarray with PyTorch (https://github.com/pydata/xarray/issues/3232) but we're not there yet.
The Python Array API standard should help align these different back-ends: https://data-apis.org/array-api/latest/
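As a rough illustration of how the Array API standard lets one function serve several back-ends, here is a minimal sketch. It assumes NumPy is installed; `get_namespace` and `standardize` are hypothetical helper names, and falling back to plain NumPy for arrays that lack `__array_namespace__` is an assumption made for this example, not part of the standard.

```python
import numpy as np


def get_namespace(x):
    """Return the array namespace that x belongs to.

    Arrays that follow the Array API standard expose __array_namespace__().
    Falling back to NumPy is an assumption made for this sketch.
    """
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np


def standardize(x):
    """Rescale x to zero mean and unit standard deviation.

    Only functions looked up on the namespace are used, so the same code
    could run on any back-end that provides mean() and std().
    """
    xp = get_namespace(x)
    return (x - xp.mean(x)) / xp.std(x)


data = np.asarray([1.0, 2.0, 3.0, 4.0])
result = standardize(data)
```

The function never imports a specific back-end itself; it asks the input array which namespace to use.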
-
Ibis could also be a target. It compiles queries written in Python to multiple dataframe libraries and SQL back-ends.
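The idea of compiling one Python expression to several targets can be sketched with a toy example. This is not Ibis's API: `GreaterThan`, `to_sql`, and `to_python` are hypothetical names used only to show the pattern, and the sample rows reuse star counts from the index on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GreaterThan:
    """A tiny query expression: keep rows where column > value."""
    column: str
    value: float


def to_sql(table: str, cond: GreaterThan) -> str:
    """Compile the expression to a SQL string (toy target 1)."""
    return f"SELECT * FROM {table} WHERE {cond.column} > {cond.value}"


def to_python(rows, cond: GreaterThan):
    """Evaluate the same expression on in-memory rows (toy target 2)."""
    return [row for row in rows if row[cond.column] > cond.value]


cond = GreaterThan("stars", 1000)
sql = to_sql("projects", cond)
rows = [{"name": "fugue", "stars": 1723}, {"name": "nebari", "stars": 226}]
kept = to_python(rows, cond)
```

The expression is built once and only the final compile step differs per target.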
-
mars
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
-
swifter
A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner (by jmcarpenter2)
-
fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
Project mention: Daft: A High-Performance Distributed Dataframe Library for Multimodal Data | news.ycombinator.com | 2023-06-07
Please integrate it with Fugue.
-
Thanks. If you give it a try, you can share your experience in this GitHub discussion, where developers are collecting information for further improvements: https://github.com/dask/distributed/discussions/7509
-
Optimus
Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark (by ironmussa)
-
Maybe something like Eliot could work for you.
-
MLForecast
-
Have you seen Nebari?
-
Project mention: Can you replace Geoserver with COG and MVT from a bucket? | /r/geospatial | 2023-03-12
Like they're doing here to access sentinel 2 images https://github.com/gjoseph92/stackstac
-
Project mention: Feedback for my project about Steam games data, featuring Terraform, Airflow, dbt, spark, dataproc, Bigquery, S3, etc | /r/dataengineering | 2022-09-30
Here is the GH repo: https://github.com/VicenteYago/steam-data-engineering with more detailed info.
-
Python Dask related posts
- Shuffling large data at constant memory in Dask
- Fugue: A unified interface for distributed computing
- [Discussion] Open Source beats Google's AutoML for Time series
- File format for large data with many columns
- Time Series Analysis for air pollution data not aligned [R] [P]
- What is the best way to save a csv.file in number only ? PC hangs when my file is more than 2GB
- [D] STUMPY v1.11.0 Released for Modern Time Series Analysis
A note from our sponsor - Sonar
www.sonarsource.com | 26 Sep 2023
Index
What are some of the best open-source Dask projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Dask | 11,398 |
2 | xarray | 3,155 |
3 | ibis | 3,110 |
4 | stumpy | 2,781 |
5 | mars | 2,642 |
6 | swifter | 2,343 |
7 | fugue | 1,723 |
8 | distributed | 1,489 |
9 | Optimus | 1,406 |
10 | Eliot | 1,046 |
11 | mlforecast | 511 |
12 | pystore | 497 |
13 | dask-sql | 326 |
14 | nebari | 226 |
15 | stackstac | 195 |
16 | aicsimageio | 168 |
17 | xgboost_ray | 116 |
18 | bytehub | 56 |
19 | dask-awkward | 48 |
20 | dask-memusage | 24 |
21 | pangeo-binder | 18 |
22 | steam-data-engineering | 15 |
23 | pythonic | 9 |