Python Distributed Computing

Open-source Python projects categorized as Distributed Computing

Top 23 Python Distributed Computing Projects

  1. ColossalAI

    Making large AI models cheaper, faster and more accessible

  2. catalyst

    Accelerated deep learning R&D (by catalyst-team)

  3. rl

    A modular, primitive-first, Python-first PyTorch library for Reinforcement Learning. (by pytorch)

  4. fugue

    A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.

  5. distributed

    A distributed task scheduler for Dask

  6. vizier

    Python-based research interface for blackbox and hyperparameter optimization, based on the internal Google Vizier Service.

  7. AI-Horde

    A crowdsourced distributed cluster for AI art and text generation

  8. couler

    Unified interface for constructing and managing workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.

  9. bagua

    Bagua Speeds up PyTorch

  10. openfederatedlearning

    An Open Framework for Federated Learning.

  11. tdigest

    t-Digest data structure in Python. Useful for percentiles and quantiles, including distributed environments like PySpark (by CamDavidsonPilon)

  12. machinaris

    An easy-to-use WebUI for crypto plotting and farming. Offers Bladebit, Gigahorse, MadMax, Chiadog and Plotman in a Docker container. Supports Chia, MMX, Chives, Flax, and HDDCoin among others.

  13. sparktorch

    Train and run PyTorch models on Apache Spark.

  14. arkouda

    Arkouda (αρκούδα): Interactive Data Analytics at Supercomputing Scale

  15. stable-diffusion-webui-distributed

    Chains stable-diffusion-webui instances together to facilitate faster image generation.

  16. wrapyfi

    Robotics MOM and RPC middleware wrapper with deep-learning framework integration

  17. mlToolKits

    learningOrchestra is a distributed Machine Learning integration tool that facilitates and streamlines iterative processes in a Data Science project.

  18. redis-dict

    Python dictionary with Redis as backend, built for large datasets. Simplifies Redis operations for large-scale and distributed systems. Supports various data types, namespacing, pipelining, and expiration.

    Project mention: Show HN: RedisDict | news.ycombinator.com | 2024-11-05

    It handles types without Pickle since remote pickled data is unsafe. Built for working with large datasets, it implements the full dictionary interface with extensive test coverage.

    GitHub: https://github.com/Attumm/redis-dict

  19. tune

    An abstraction layer for parameter tuning (by fugue-project)

  20. FindTheMag2

    A tool to determine optimal projects for Gridcoin & BOINC crunchers. Maximize your magnitude!

  21. rxray

    Ray distributed computing integration for RxPY

  22. py-inventa

    A Python library for microservice registry and executing RPC (Remote Procedure Call) over Redis.

  23. hulse-py

    The Python client for the Hulse platform

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
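
The redis-dict entry in the list above mentions handling types without Pickle, because unpickling remote data is unsafe. As a rough illustration of that idea only, here is a minimal sketch that stores each value as a type-tagged string and decodes it through a fixed whitelist. Every name in it (`encode`, `decode`, `TypedStore`) is hypothetical, a plain in-memory dict stands in for the remote string store, and none of it is redis-dict's actual API or implementation.

```python
import json

# Hypothetical whitelist of encoders: only plain data types are supported,
# so decoding never executes arbitrary code (unlike pickle.loads).
_ENCODERS = {
    str: lambda v: v,
    int: str,
    float: repr,
    bool: lambda v: "1" if v else "0",
    list: json.dumps,
    dict: json.dumps,
}

# Matching decoders, keyed by the type name stored in front of the payload.
_DECODERS = {
    "str": lambda s: s,
    "int": int,
    "float": float,
    "bool": lambda s: s == "1",
    "list": json.loads,
    "dict": json.loads,
}


def encode(value):
    """Return 'typename:payload' for a whitelisted type."""
    encoder = _ENCODERS.get(type(value))
    if encoder is None:
        raise TypeError(f"unsupported type: {type(value).__name__}")
    return f"{type(value).__name__}:{encoder(value)}"


def decode(raw):
    """Rebuild a value from 'typename:payload', rejecting unknown type names."""
    type_name, _, payload = raw.partition(":")
    decoder = _DECODERS.get(type_name)
    if decoder is None:
        raise ValueError(f"unknown type tag: {type_name}")
    return decoder(payload)


class TypedStore:
    """Dict-like wrapper; self._backend is an assumed stand-in for a remote store."""

    def __init__(self):
        self._backend = {}  # maps str keys to encoded str values

    def __setitem__(self, key, value):
        self._backend[key] = encode(value)

    def __getitem__(self, key):
        return decode(self._backend[key])

    def __contains__(self, key):
        return key in self._backend

    def __len__(self):
        return len(self._backend)


store = TypedStore()
store["count"] = 3
store["tags"] = ["a", "b"]
print(store["count"] + 1, store["tags"])
```

One limitation of this sketch: JSON-encoded dicts come back with string keys, so a dict with integer keys would not round-trip exactly.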

Python Distributed Computing related posts

  • Show HN: RedisDict

    1 project | news.ycombinator.com | 5 Nov 2024
  • Show HN: Interactive Graph by LLM (GPT-4o)

    5 projects | news.ycombinator.com | 19 May 2024
  • Daft: A High-Performance Distributed Dataframe Library for Multimodal Data

    4 projects | news.ycombinator.com | 7 Jun 2023
  • about making a game.

    2 projects | /r/ArtificialInteligence | 6 Jun 2023
  • TIL : about the game "Foldit", a puzzle game about protein folding. In 2011, its gamers helped decipher a protein of a HIV-like virus, solving a scientific problem that went unsolved for 15 years in as little as 10 days.

    5 projects | /r/todayilearned | 22 May 2023
  • Alternatives to Kaggle and Collab?

    1 project | /r/StableDiffusion | 25 Apr 2023
  • Shuffling large data at constant memory in Dask

    1 project | /r/Python | 17 Apr 2023

Index

What are some of the best open-source Distributed Computing projects in Python? This list will help you:

#   Project                             Stars
1   ColossalAI                          41,121
2   catalyst                            3,354
3   rl                                  3,025
4   fugue                               2,105
5   distributed                         1,648
6   vizier                              1,592
7   AI-Horde                            1,272
8   couler                              940
9   bagua                               882
10  openfederatedlearning               800
11  tdigest                             398
12  machinaris                          344
13  sparktorch                          339
14  arkouda                             270
15  stable-diffusion-webui-distributed  183
16  wrapyfi                             77
17  mlToolKits                          76
18  redis-dict                          74
19  tune                                35
20  FindTheMag2                         33
21  rxray                               12
22  py-inventa                          9
23  hulse-py                            6