Python Distributed

Open-source Python projects categorized as Distributed

Top 21 Python Distributed Projects

  • GitHub repo Ray

    An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

    Project mention: Is it normal to have a negative and near-zero explained variance in PPO? | | 2021-12-25

    I guess I did, as I directly use the PPO agent provided by the RLlib.

  • GitHub repo nni

    An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

    Project mention: Automated Machine Learning (AutoML) - 9 Different Ways with Microsoft AI | | 2021-10-04

    For a complete tutorial, navigate to this Jupyter Notebook:

  • GitHub repo modin

    Modin: Speed up your Pandas workflows by changing a single line of code

    Project mention: TIL about modin.pandas which significantly speeds up pandas if you import modin.pandas instead of pandas. | | 2021-06-30


  • GitHub repo optuna

    A hyperparameter optimization framework

    Project mention: Trading Algos - 5 Key Metrics and How to Implement Them in Python | | 2022-01-08

    Nothing can beat iteration and rapid optimization. Try running things like grid experiments, batch optimizations, and parameter searches. Take a look at packages like hyperopt or optuna, which might be able to help you here!

  • GitHub repo Gerapy

    Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

    Project mention: The Complete Guide To ScrapydWeb, Get Setup In 3 Minutes! | | 2022-01-13

    There are many different Scrapyd dashboard and admin tools available, from ScrapeOps (Live Demo) to SpiderKeeper, and Gerapy.

  • GitHub repo lingvo


  • GitHub repo arq

    Fast job queuing and RPC in python with asyncio and redis.

    Project mention: I made a simple async queueing framework called SAQ! It includes a built in web UI to manage jobs. | | 2022-01-06

    I need to process a lot of long running IO heavy jobs with background workers. I've been using ARQ for a while but decided to take a crack at writing my own distributed queue.

  • GitHub repo code2vec

    TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"

    Project mention: [D] Security feature labeled dataset for code2vec | | 2021-10-09

    I am looking for a dataset that would contain code snippets (or vector representing it) and labels that are security specific features such as authentication, encryption, logging etc. I need to apply techniques like code2vec but with security-specific labels. Any leads where can I find this kind of dataset?

  • GitHub repo bagua

    Bagua Speeds up PyTorch

    Project mention: Bagua: Speed up and Scale PyTorch (r/MachineLearning) | | 2021-10-16

  • GitHub repo pottery

    Redis for humans. 🌎🌍🌏

    Project mention: Worth wrapping pottery functions for compliance with async? | | 2021-08-01

    I have a question about pottery. It provides a nice Pythonic API by wrapping Redis constructs with Redis-backed Python data structures (Dict, Deque, etc.). I am using it in a FastAPI microservice project, which is obviously async.

  • GitHub repo Pyrlang

    Erlang node implemented in Python 3.5+ (Asyncio-based)

    Project mention: Ask HN: Is Elixir Still Relevant? | | 2021-04-10

    - Python:

  • GitHub repo fugue

    A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask without any rewrites.

    Project mention: Pyspark now provides a native Pandas API | | 2022-01-02

    There's dask-sql, but I think it is being abandoned for fugue-project. I'm actually excited for this project as it is trying to provide a backend agnostic solution, which would seem like a difficult, lofty goal. I wish them luck.

  • GitHub repo PySR

    Simple, fast, and parallelized symbolic regression in Python/Julia via regularized evolution and simulated annealing

    Project mention: [D] Inferring general physical laws from observations in 300 lines of code | | 2021-08-02

    This is really neat! Since you're interested in this subject, you may also appreciate PySR and the corresponding paper which uses Graph Neural Networks to perform symbolic regression.

  • GitHub repo machin

    Reinforcement learning library (framework) designed for PyTorch, implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...

    Project mention: Best PyTorch RL library for doing research | | 2021-04-30

    Machin is really nice, it is very easy to use and to try different things, although it’s developed by one person and maybe not appropriately tested yet.

  • GitHub repo malib

    A parallel framework for population-based multi-agent reinforcement learning.

    Project mention: MALib: A parallel framework for population-based multi-agent reinforcement learning | | 2021-07-23

    Code for found:

  • GitHub repo lithops

    An open source framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud.

    Project mention: [D] For those of you who don't own a GPU, how do you run your experiments or train your models? | | 2021-12-19

    At work for non-ML/non-GPU stuff I've been using Lithops for running code on dynamically-provisioned cloud resources (serverless or VM). It pickles your code & runtime variables, sends them to cloud storage, runs the code & downloads the results, all relatively transparently. You're just calling Python functions with Python objects on your local computer and not having to worry about deploying your code, packaging your data, etc. Better still, you can scale up for things like hyperparameter sweeps by just dispatching more calls in parallel, and it will provision more resources.

  • GitHub repo Ignareo-ISML-auto-voter

    Ignareo the Carillon, a web crawler/spider template of ultimate high concurrency built for leprechauns. Carillons as the best web spiders; Long live the golden years of leprechauns!

    Project mention: Web crawler/spider of ultimate concurrency packed as microservice nodes | | 2021-10-22

  • GitHub repo optuna-examples

    Examples for

    Project mention: Data Scientists are dying out | | 2022-01-18

    That's still regular ML because you are in charge of the features. Optuna might make your life easier though:

  • GitHub repo hazelcast-python-client

    Hazelcast Python Client

    Project mention: Contribution to Hazelcast | | 2021-07-05

    More code samples here:

  • GitHub repo lethean-vpn

    This WONT work today, January it will.

    Project mention: Lethean - VPN on Monero base | | 2021-02-21

    GitHub of the VPN software itself:

  • GitHub repo python-ksuid

    A pure-Python KSUID implementation

    Project mention: Show HN: Hookdeck, an Infrastructure to Consume Webhooks | | 2021-08-04

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-01-18.
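
The Lithops mention in the list above describes a workflow: arguments are pickled, shipped to storage, run remotely, and the results are downloaded, with scale-out achieved by dispatching more calls in parallel. The sketch below imitates that flow using only the Python standard library. It does not use the Lithops API. The names `storage`, `train`, `remote_call`, and `sweep` are hypothetical, an in-memory dict stands in for cloud storage, and a thread pool stands in for dynamically provisioned workers.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for cloud object storage (an assumption).
storage = {}


def train(learning_rate, steps):
    """Toy experiment: gradient descent on f(x) = x**2, starting at x = 1.0."""
    x = 1.0
    for _ in range(steps):
        x -= learning_rate * 2 * x  # gradient of x**2 is 2x
    return x * x  # final loss


def remote_call(job_id, func, kwargs):
    """Imitate one remote invocation: serialize inputs, run, serialize the output."""
    storage[f"{job_id}/input"] = pickle.dumps(kwargs)            # caller "uploads" arguments
    loaded = pickle.loads(storage[f"{job_id}/input"])            # worker "downloads" them
    storage[f"{job_id}/output"] = pickle.dumps(func(**loaded))   # worker "uploads" the result
    return pickle.loads(storage[f"{job_id}/output"])             # caller "downloads" the result


def sweep(func, grid, max_workers=4):
    """Dispatch one call per parameter set in parallel; results keep the grid's order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(remote_call, i, func, kw) for i, kw in enumerate(grid)]
        return [f.result() for f in futures]


# A small hyperparameter sweep over the learning rate.
grid = [{"learning_rate": lr, "steps": 20} for lr in (0.01, 0.1, 0.5)]
losses = sweep(train, grid)
best = grid[losses.index(min(losses))]
print(best, min(losses))
```

Scaling the sweep up means lengthening `grid`; the calling code stays the same, which is the property the mention highlights. A real deployment would replace the dict and thread pool with actual object storage and remote workers.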

What are some of the best open-source Distributed projects in Python? This list will help you:

Rank  Project                  Stars
1     Ray                      18,887
2     nni                      10,884
3     modin                    6,704
4     optuna                   5,808
5     Gerapy                   2,635
6     lingvo                   2,371
7     arq                      1,072
8     code2vec                 825
9     bagua                    634
10    pottery                  610
11    Pyrlang                  489
12    fugue                    475
13    PySR                     467
14    machin                   262
15    malib                    250
16    lithops                  181
17    Ignareo-ISML-auto-voter  165
18    optuna-examples          134
19    hazelcast-python-client  97
20    lethean-vpn              31
21    python-ksuid             31