Python Distributed

Open-source Python projects categorized as Distributed

Top 23 Python Distributed Projects

  • Ray

    Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Project mention: Open Source Advent Fun Wraps Up! | dev.to | 2024-01-05

    22. Ray | Github | tutorial

  • nni

    An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

  • optuna

    A hyperparameter optimization framework

    Project mention: Optuna – A Hyperparameter Optimization Framework | news.ycombinator.com | 2024-04-06

    I didn’t even know WandB did hyperparameter optimization, I figured it was a neural network visualizer based on 2 minute papers. Didn’t seem like many alternatives out there to Optuna with TPE + persistence in conditional continuous & discrete spaces.

    Anyway, it’s doable to make a multi objective decide_to_prune function with Optuna, here’s an example https://github.com/optuna/optuna/issues/3450#issuecomment-19...

  • modin

    Modin: Scale your Pandas workflows by changing a single line of code

    Project mention: The Distributed Tensor Algebra Compiler (2022) | news.ycombinator.com | 2023-06-15

  • scrapy-redis

    Redis-based components for Scrapy.

    Project mention: How to make scrapy run multiple times on the same URLs? | /r/scrapy | 2023-06-26

  • Gerapy

    Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

  • lingvo

    Lingvo

  • hatchet

    A distributed, fault-tolerant task queue

    Project mention: Ask HN: Who is hiring? (April 2024) | news.ycombinator.com | 2024-04-01

    Hatchet (https://hatchet.run) | New York City | Full-time

    We're hiring a founding engineer to help us with development on our open-source, distributed task queue: https://github.com/hatchet-dev/hatchet.

    We recently launched on HN, you can check out our launch here: https://news.ycombinator.com/item?id=39643136. We're two second-time YC founders in this for the long haul and we are just wrapping up the YC W24 batch.

    As a founding engineer, you'll be responsible for contributing across the entire codebase. We'll compensate accordingly and with high equity. It's currently just the two founders + a part-time contractor. We're all technical and contribute code.

    Stack: Typescript/React, Go and PostgreSQL.

    To apply, email alexander [at] hatchet [dot] run, and include the following:

    1. Tell us about something impressive you've built.

    2. Ask a question or write a comment about the state of the project. For example: a file that stood out to you in the codebase, a Github issue or discussion that piqued your interest, a general comment on distributed systems/task queues, or why our code is bad and how you could improve it.

  • arq

    Fast job queuing and RPC in python with asyncio and redis.

    Project mention: Future Plan for Arq | news.ycombinator.com | 2024-03-18

  • fugue

    A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.

    Project mention: FLaNK Stack Weekly 22 January 2024 | dev.to | 2024-01-22

  • PySR

    High-Performance Symbolic Regression in Python and Julia

    Project mention: Potential of the Julia programming language for high energy physics computing | news.ycombinator.com | 2023-12-04

    > Yes, julia can be called from other languages rather easily

    This seems false to me. StaticCompiler.jl [1] puts in their limitations that "GC-tracked allocations and global variables do not work with compile_executable or compile_shlib. This has some interesting consequences, including that all functions within the function you want to compile must either be inlined or return only native types (otherwise Julia would have to allocate a place to put the results, which will fail)." PackageCompiler.jl [2] has the same limitations if I'm not mistaken. So then you have to fall back to distributing the Julia "binary" with a full Julia runtime, which is pretty heavy. There are some packages which do this. For example, PySR [3] does this.

    There is some word going around though that there is an even better static compiler in the making, but as long as that one is not publicly available I'd say that Julia cannot easily be called from other languages.

    [1]: https://github.com/tshort/StaticCompiler.jl

    [2]: https://github.com/JuliaLang/PackageCompiler.jl

    [3]: https://github.com/MilesCranmer/PySR

  • MLBox

    MLBox is a powerful Automated Machine Learning python library.

  • quokka

    Making data lake work for time series (by marsupialtail)

    Project mention: How Query Engines Work | news.ycombinator.com | 2023-09-08

    An awesome read!

    Something related that I found out about from HN a few months back is another engine called quokka. It's particularly interesting and applicable how quokka schedules distributed queries to outperform Spark https://github.com/marsupialtail/quokka/blob/master/blog/why...

  • code2vec

    TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"

    Project mention: Word2vec | news.ycombinator.com | 2023-10-09

  • pottery

    Redis for humans. 🌎🌍🌏

    Project mention: Is Redis om production ready? Or will it be production ready anytime soon? | /r/redis | 2023-05-12

    However, as an alternative, consider my library, Pottery. Pottery offers some similar functionality to Redis OM, and Pottery is production ready.

  • evotorch

    Advanced evolutionary computation library built directly on top of PyTorch, created at NNAISENSE.

  • bagua

    Bagua Speeds up PyTorch

  • runhouse

    The fastest way to iterate and deploy AI workloads on your own infra. Unobtrusive, debuggable, PyTorch-like APIs.

    Project mention: Better GPU Cluster Scheduling with Runhouse | dev.to | 2024-03-15

    With Runhouse, it’s easy to send code to your compute no matter where it lives, and efficiently utilize your resources across multiple callers scheduling jobs (e.g. researchers, pipelines, inference services, etc). We believe less is more when it comes to AI DevOps, so we don’t make any assumptions about the structure of your code or the infrastructure to which you’re sending it.

  • optuna-examples

    Examples for https://github.com/optuna/optuna

  • Pyrlang

    Erlang node implemented in Python 3.5+ (Asyncio-based)

  • wakaq

    Background task queue for Python backed by Redis, a super minimal Celery

    Project mention: Show HN: Hatchet – Open-source distributed task queue | news.ycombinator.com | 2024-03-08

  • AgileRL

    Streamlining reinforcement learning with RLOps. State-of-the-art RL algorithms and tools.

    Project mention: [P] Introducing PPO and Rainbow DQN to our super fast evolutionary HPO reinforcement learning framework | /r/MachineLearning | 2023-10-15

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-04-06.

Index

What are some of the best open-source Distributed projects in Python? This list will help you:

#   Project          Stars
1   Ray              30,879
2   nni              13,708
3   optuna           9,615
4   modin            9,465
5   scrapy-redis     5,447
6   Gerapy           3,205
7   lingvo           2,781
8   hatchet          2,683
9   arq              1,902
10  fugue            1,869
11  PySR             1,850
12  MLBox            1,474
13  quokka           1,081
14  code2vec         1,072
15  pottery          1,002
16  evotorch         967
17  bagua            865
18  runhouse         702
19  optuna-examples  587
20  Pyrlang          586
21  wakaq            563
22  modal-examples   545
23  AgileRL          488