Shuffling large data at constant memory in Dask

This page summarizes the projects mentioned and recommended in the original post on /r/Python.

  • distributed

    A distributed task scheduler for Dask

  • Thanks. If you give it a try, you can share your experience in this GitHub discussion, where developers are collecting feedback for further improvements: https://github.com/dask/distributed/discussions/7509

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives, so a higher number indicates a more popular project.

Related posts

  • Great forward progress on squashing cluster deadlocks

    1 project | /r/dask | 15 Dec 2021
  • Ask HN: What piece of code/codebase blew your mind when you saw it?

    17 projects | news.ycombinator.com | 31 Oct 2022
  • Is Numpy always more efficient than Pandas? And how much should we rely on Python anyway?

    1 project | /r/datascience | 10 Dec 2021
  • Ask HN: Is PySPark a Dead-End?

    1 project | news.ycombinator.com | 5 Dec 2021
  • How to load 85.6 GB of XML data into a dataframe

    1 project | /r/pythontips | 1 Dec 2021