Top 18 Python Distributed Projects
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Project mention: JORLDY: OpenSource Reinforcement Learning Framework | reddit.com/r/reinforcementlearning | 2021-11-08
Distributed RL algorithms are provided using Ray.
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyperparameter tuning.
Project mention: Automated Machine Learning (AutoML) - 9 Different Ways with Microsoft AI | dev.to | 2021-10-04
For a complete tutorial, navigate to this Jupyter Notebook: https://github.com/microsoft/nni/blob/master/examples/notebooks/tabular_data_classification_in_AML.ipynb
Modin: Speed up your Pandas workflows by changing a single line of code
Project mention: TIL about modin.pandas which significantly speeds up pandas if you import modin.pandas instead of pandas. | reddit.com/r/u_pygsm | 2021-06-30
A hyperparameter optimization framework
Project mention: [P] optimization of Hugging Face Transformer models to get Inference < 1 Millisecond Latency + deployment on production ready inference server | reddit.com/r/MachineLearning | 2021-11-05
There are plenty of different options to do that in OSS, the most well known being optuna (https://github.com/optuna/optuna).
Dataset format for AI. Easily build and manage datasets for machine and deep learning. Stream data real-time & version-control it. https://activeloop.ai (by activeloopai)
Project mention: TileDB VS Activeloop hub - a user suggested alternative | libhunt.com/r/TileDB | 2021-10-20
TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"
Project mention: [D] Security feature labeled dataset for code2vec | reddit.com/r/MachineLearning | 2021-10-09
I am looking for a dataset that contains code snippets (or vectors representing them) and labels for security-specific features such as authentication, encryption, logging, etc. I need to apply techniques like code2vec https://github.com/tech-srl/code2vec but with security-specific labels. Any leads on where I can find this kind of dataset?
Bagua Speeds up PyTorch
Project mention: Bagua: Speed up and Scale PyTorch (r/MachineLearning) | reddit.com/r/datascienceproject | 2021-10-16
Redis for humans. 🌎🌍🌏
Project mention: Worth wrapping pottery functions for compliance with async? | reddit.com/r/Python | 2021-08-01
I have a question about https://github.com/brainix/pottery. It provides a nice Pythonic API by wrapping Redis constructs with Python Redis-backed data structures (Dict, Deque, etc.). I am using it in a FastAPI microservice project, which is obviously async.
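One common way to approach the question above is to push each blocking call onto a worker thread so the event loop stays free. The following is a minimal sketch of that idea, not pottery's API: `BlockingStore` and `AsyncStore` are hypothetical names, the store is a plain in-memory dict with a simulated delay, and it assumes the wrapped object is safe to call from a worker thread, which should be checked for any real client.

```python
import asyncio
import time


class BlockingStore:
    """Hypothetical stand-in for a synchronous, network-backed container."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        time.sleep(0.01)  # simulate a blocking network round trip
        self._data[key] = value

    def get(self, key):
        time.sleep(0.01)  # simulate a blocking network round trip
        return self._data.get(key)


class AsyncStore:
    """Hypothetical async facade that runs each blocking call in a thread."""

    def __init__(self, store):
        self._store = store

    async def set(self, key, value):
        # asyncio.to_thread (Python 3.9+) keeps the event loop responsive
        # while the blocking call runs in the default thread pool.
        await asyncio.to_thread(self._store.set, key, value)

    async def get(self, key):
        return await asyncio.to_thread(self._store.get, key)


async def demo():
    store = AsyncStore(BlockingStore())
    await store.set("user:1", "alice")
    return await store.get("user:1")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Whether such a wrapper is worth it depends on how many blocking calls sit on the request path. A thread hop per call adds overhead, so for heavy use a natively async client may be the better fit.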
Erlang node implemented in Python 3.5+ (Asyncio-based)
Project mention: Ask HN: Is Elixir Still Relevant? | news.ycombinator.com | 2021-04-10
- Python: https://github.com/Pyrlang/Pyrlang
Simple, fast, and parallelized symbolic regression in Python/Julia via regularized evolution and simulated annealing
Project mention: [D] Inferring general physical laws from observations in 300 lines of code | reddit.com/r/MachineLearning | 2021-08-02
This is really neat! Since you're interested in this subject, you may also appreciate PySR and the corresponding paper which uses Graph Neural Networks to perform symbolic regression.
A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask without any rewrites. (by fugue-project)
Project mention: FugueSQL: SQL-ish for pandas, dask, spark | news.ycombinator.com | 2021-10-11
Hey, I am the author of Fugue.
Fugue is a higher level abstraction compared to Ray. It provides unified and non-invasive interfaces for people to use Spark, Dask and Pandas. Ray/Modin is also on our roadmap.
It provides both a Python interface (not pandas-like) and Fugue SQL (standard SQL + extra features). Users can choose the one they are most comfortable with as the semantic layer for distributed computing; the two are equivalent.
With Fugue, most of your logic will be in simple Python/SQL that is framework and scale agnostic. From the mindset to the code, Fugue minimizes your dependency on any specific computing framework, including Fugue itself.
Please let me know if you want to learn more. Our Slack is in the README of the Fugue repo.
Fugue repo: https://github.com/fugue-project/fugue
Reinforcement learning library (framework) designed for PyTorch; implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...
Project mention: Best PyTorch RL library for doing research | reddit.com/r/reinforcementlearning | 2021-04-30
Machin is really nice, it is very easy to use and to try different things, although it’s developed by one person and maybe not appropriately tested yet.
A parallel framework for population-based multi-agent reinforcement learning.
Project mention: MALib: A parallel framework for population-based multi-agent reinforcement learning | reddit.com/r/reinforcementlearning | 2021-07-23
Code for https://arxiv.org/abs/2106.07551 found: https://github.com/sjtu-marl/malib
Ignareo the Carillon, a web crawler/spider template of ultimate high concurrency built for leprechauns. Carillons as the best web spiders; Long live the golden years of leprechauns!
Project mention: Web crawler/spider of ultimate concurrency packed as microservice nodes | news.ycombinator.com | 2021-10-22
Hazelcast IMDG Python Client
Project mention: Contribution to Hazelcast | reddit.com/r/Python | 2021-07-05
More code samples here: https://github.com/hazelcast/hazelcast-python-client/tree/master/examples
Lethean Virtual Private Network (VPN)
Project mention: Lethean - VPN on Monero base | reddit.com/r/Monero | 2021-02-21
GitHub of the VPN software itself: https://github.com/LetheanMovement/lethean-vpn
A pure-Python KSUID implementation
Project mention: Show HN: Hookdeck, an Infrastructure to Consume Webhooks | news.ycombinator.com | 2021-08-04
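For readers unfamiliar with the format, a KSUID is a 20-byte identifier: a 4-byte big-endian timestamp (seconds counted from a custom epoch of 1400000000) followed by 16 random bytes, usually rendered as a 27-character base62 string. The sketch below illustrates that layout using only the standard library. It is not the listed project's API, and every function name in it is hypothetical.

```python
import os
import time

KSUID_EPOCH = 1_400_000_000  # KSUID timestamps count seconds from this offset
BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
ENCODED_LENGTH = 27  # a 20-byte value always fits in 27 base62 digits


def make_ksuid_bytes(timestamp=None, payload=None):
    """Build the raw 20-byte form: 4-byte timestamp + 16-byte payload."""
    if timestamp is None:
        timestamp = int(time.time())
    if payload is None:
        payload = os.urandom(16)
    if len(payload) != 16:
        raise ValueError("payload must be exactly 16 bytes")
    return (timestamp - KSUID_EPOCH).to_bytes(4, "big") + payload


def encode_base62(raw):
    """Render raw bytes as a zero-padded, fixed-width base62 string."""
    number = int.from_bytes(raw, "big")
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(BASE62[remainder])
    return "".join(reversed(digits)).rjust(ENCODED_LENGTH, "0")


def decode_timestamp(raw):
    """Recover the Unix timestamp stored in the first 4 bytes."""
    return int.from_bytes(raw[:4], "big") + KSUID_EPOCH


if __name__ == "__main__":
    raw = make_ksuid_bytes()
    print(encode_base62(raw), decode_timestamp(raw))
```

Because the timestamp occupies the most significant bytes and the base62 alphabet is in ASCII order, identifiers created in later seconds sort after earlier ones as plain strings.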
Python Distributed related posts
JORLDY: OpenSource Reinforcement Learning Framework
2 projects | reddit.com/r/reinforcementlearning | 8 Nov 2021
[P] optimization of Hugging Face Transformer models to get Inference < 1 Millisecond Latency + deployment on production ready inference server
3 projects | reddit.com/r/MachineLearning | 5 Nov 2021
Web crawler/spider of ultimate concurrency packed as microservice nodes
1 project | news.ycombinator.com | 22 Oct 2021
Bagua: Speed up and Scale PyTorch (r/MachineLearning)
1 project | reddit.com/r/datascienceproject | 16 Oct 2021
How to deploy a rllib-trained model?
3 projects | reddit.com/r/reinforcementlearning | 16 Oct 2021
Bagua: Speed up and Scale PyTorch with Rust
1 project | reddit.com/r/rust | 16 Oct 2021
[P] Bagua: Speed up and Scale PyTorch
1 project | reddit.com/r/MachineLearning | 16 Oct 2021
What are some of the best open-source Distributed projects in Python? This list will help you: