Python Dask

Open-source Python projects categorized as Dask

Top 23 Python Dask Projects

  • Dask

    Parallel computing with task scheduling

    Project mention: The Distributed Tensor Algebra Compiler (2022) | news.ycombinator.com | 2023-06-15

  • ibis

    the portable Python dataframe library

    Project mention: Show HN: Hashquery, a Python library for defining reusable analysis | news.ycombinator.com | 2024-04-23

    I really don't understand the appeal of dbt vs a proper programming language. The templating approach leads to massive spaghetti. I look forward to trying out something like Ibis [0]

    0: https://ibis-project.org/

  • xarray

    N-D labeled arrays and datasets in Python

  • stumpy

    STUMPY is a powerful and scalable Python library for modern time series analysis

    Project mention: Stumpy: Matrix profile time series analysis | news.ycombinator.com | 2024-03-03

  • mars

    Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.

  • swifter

    A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner (by jmcarpenter2)

  • fugue

    A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.

    Project mention: FLaNK Stack Weekly 22 January 2024 | dev.to | 2024-01-22

  • distributed

    A distributed task scheduler for Dask

  • Optimus

    🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark (by ironmussa)

  • Eliot

    Eliot: the logging system that tells you *why* it happened

  • mlforecast

    Scalable machine 🤖 learning for time series forecasting.

    Project mention: Sales forecast for next two years | /r/datascience | 2023-06-25

    MLForecast

  • pystore

    Fast data store for Pandas time-series data

  • dask-sql

    Distributed SQL Engine in Python using Dask

    Project mention: FLaNK Stack Weekly for 20 June 2023 | dev.to | 2023-06-20

  • nebari

    🪴 Nebari - your open source data science platform (by nebari-dev)

  • amazon-sagemaker-local-mode

    Amazon SageMaker Local Mode Examples

    Project mention: Debugging Python Code in Amazon SageMaker Locally Using Visual Studio Code and PyCharm: A Step-by-Step Guide | dev.to | 2023-11-15

    git clone https://github.com/aws-samples/amazon-sagemaker-local-mode/
    cd amazon-sagemaker-local-mode/general_pipeline_local_debug
    python3 -m venv .venv
    source .venv/bin/activate
    pip install jupyter
    jupyter lab

  • stackstac

    Turn a STAC catalog into a dask-based xarray

  • aicsimageio

    Image Reading, Metadata Conversion, and Image Writing for Microscopy Images in Python

  • xgboost_ray

    Distributed XGBoost on Ray

  • bytehub

    ByteHub: making feature stores simple

  • dask-awkward

    Native Dask collection for awkward arrays, and the library to use it.

  • dask-memusage

    A low-impact profiler to figure out how much memory each task in Dask is using

  • steam-data-engineering

    A data engineering project with Airflow, dbt, Terraform, GCP and much more!

  • pangeo-binder

    Pangeo + Binder (dev repo for a binder/pangeo fusion concept)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Dask projects in Python? This list will help you:

 #  Project                       Stars
 1  Dask                         11,982
 2  ibis                          4,074
 3  xarray                        3,404
 4  stumpy                        2,984
 5  mars                          2,677
 6  swifter                       2,459
 7  fugue                         1,876
 8  distributed                   1,541
 9  Optimus                       1,446
10  Eliot                         1,083
11  mlforecast                      713
12  pystore                         539
13  dask-sql                        363
14  nebari                          256
15  amazon-sagemaker-local-mode     228
16  stackstac                       222
17  aicsimageio                     192
18  xgboost_ray                     131
19  bytehub                          57
20  dask-awkward                     56
21  dask-memusage                    24
22  steam-data-engineering           20
23  pangeo-binder                    18
