Open-source projects categorized as Distributed

Top 23 Distributed Open-Source Projects

  • GitHub repo tensorflow

    An Open Source Machine Learning Framework for Everyone

    Project mention: Google: Quietly Killing It in 2021 (YTD performance vs social media chatter) | reddit.com/r/investing | 2021-06-16
  • GitHub repo CNTK

    Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit (by microsoft)

  • GitHub repo Phoenix

    Peace of mind from prototype to production

    Project mention: My Journey Into Elixir | dev.to | 2021-06-03

    That's how my journey with Elixir began. It has been a slow journey, but I am currently learning the Phoenix framework to build web applications for a project I am planning to undertake.

  • GitHub repo Redisson

    Redisson - Redis Java client with features of In-Memory Data Grid. Over 50 Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Publish / Subscribe, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, MyBatis, RPC, local cache ...

  • GitHub repo Ray

    An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

    Project mention: Ray 1.4.0 | news.ycombinator.com | 2021-06-08
  • GitHub repo dgraph

    Native GraphQL Database with graph backend

    Project mention: Need help in choosing a database - Postgres or BadgerDB | reddit.com/r/Database | 2021-05-08

    Dgraph is a highly scalable, hyper-fast graph database that is distributed and built on top of Badger. For consensus, it uses the Raft protocol. (Git repo https://github.com/dgraph-io/dgraph)

  • GitHub repo Nextcloud

    ☁️ Nextcloud server, a safe home for all your data

    Project mention: Safest way to have my notes accessible anywhere? | reddit.com/r/privacy | 2021-06-17

    Maybe Nextcloud? It has built-in Notes functions, which you can use via the app or browser.

  • GitHub repo diaspora*

    A privacy-aware, distributed, open source social network.

    Project mention: What decent alternatives to Facebook are there on the social media market? | reddit.com/r/facebook | 2021-06-13

    There's the diaspora* project which is decentralized and focuses more on user freedom and privacy. Having no central server means there's no single entity to shut it down or to be bought out.

  • GitHub repo LightGBM

    A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

    Project mention: Is it possible to clean memory after using a package that has a memory leak in my python script? | reddit.com/r/Python | 2021-04-29

    I'm working on an AutoML Python package (GitHub repo). The package uses many different algorithms, one of which is LightGBM. After training, LightGBM doesn't release its memory, even if del is called followed by gc.collect(). I created an issue on the LightGBM GitHub -> link. Because of this leak, memory consumption grows very quickly during training.

  • GitHub repo nni

    An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

    Project mention: [D] Efficient ways of choosing number of layers/neurons in a neural network | reddit.com/r/statistics | 2021-04-20

    optuna, hyperopt, nni, plenty of less-known tools too.

  • GitHub repo modin

    Modin: Speed up your Pandas workflows by changing a single line of code

    Project mention: How to Speed Up Pandas with 1 Line of Code | reddit.com/r/Python | 2021-03-03
  • GitHub repo orbit-db

    Peer-to-Peer Databases for the Decentralized Web

    Project mention: How do I store mutualable data like Blogs and Comments on IPFS? | reddit.com/r/ipfs | 2021-06-13

    FYI: https://github.com/orbitdb/orbit-db

  • GitHub repo ipfs

    IPFS implementation in JavaScript

    Project mention: Uploading an image to IPFS from an Android phone? | reddit.com/r/ipfs | 2021-03-29

    I think the easiest approach would be to run a NodeJS API and use js-ipfs. Then simply upload the image to your API, and have the API upload it to the IPFS network with their JS library.

  • GitHub repo H2O

    H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • GitHub repo optuna

    A hyperparameter optimization framework

    Project mention: Do you often find hyperparam tuning does very little? | reddit.com/r/datascience | 2021-04-23

    As for doing a full gridsearch, I recommend using a better strategy, e.g. bayesian optimization. Optuna is great for this.

  • GitHub repo Hazelcast

    Open Source Streaming Data Platform

    Project mention: My Awesome Collections of 200+ github repo | dev.to | 2021-06-03
  • GitHub repo qTox

    qTox is a chat, voice, video, and file transfer IM client using the encrypted peer-to-peer Tox protocol.

    Project mention: [Filter] Block "New! Messenger App for Windows" on Facebook Messenger | reddit.com/r/uBlockOrigin | 2021-05-11

    Facebook zucks and I rate it 0/10, do not use if at all possible. Unfortunately, non-tech-savvy people don't tend to use alternatives. Best I can recommend at the moment is DeltaChat, since everyone's got email, or qTox for video and audio.

  • GitHub repo Hub

    Fastest unstructured dataset management for TensorFlow/PyTorch. Stream data real-time & version-control it. http://activeloop.ai (by activeloopai)

    Project mention: [N] Access Google Objectron (~1.92 TBs) in less than 5 seconds with Activeloop Hub | reddit.com/r/MachineLearning | 2021-05-04

    Install Hub, the open-source package that converts computer vision datasets into cloud-native NumPy-like arrays and enables a few nifty features like streaming to PyTorch and TensorFlow, dataset version-control, collaboration, etc.

  • GitHub repo Zoneminder

    ZoneMinder is a free, open source Closed-circuit television software application developed for Linux which supports IP, USB and Analog cameras.

    Project mention: What do you use your VMs for? | reddit.com/r/unRAID | 2021-04-29
  • GitHub repo Crate

    CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time.

    Project mention: Querying time series data with SQL: examples | dev.to | 2021-03-01

    PS: If you liked this post... we'd really appreciate a ⭐️ on GitHub!

  • GitHub repo FluidFramework

    Library for building distributed, real-time collaborative web applications

    Project mention: The Lost Apps of the 80s | news.ycombinator.com | 2021-04-04

    Within the context of the Microsoft-verse, Fluid Framework (https://fluidframework.com) is supposed to be solving similar problems in web apps, although I haven't personally played with it.

  • GitHub repo PowerJob

    Enterprise job scheduling middleware with distributed computing ability.

    Project mention: PowerJob V3.4.3 has been released. Check to see the work. Suggestions are welcomed. | reddit.com/r/java | 2021-01-17

    Oh yes! You can see the registered users in Known users. They are companies in China as we didn't promote to foreign friends. Cisco, Jd.com, OPPO are all big companies there in China.

  • GitHub repo lingvo


NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2021-06-17.


What are some of the best open-source Distributed projects? This list will help you:

Rank Project Stars
1 tensorflow 156,752
2 CNTK 17,030
3 Phoenix 16,776
4 Redisson 16,678
5 Ray 16,286
6 dgraph 16,235
7 Nextcloud 14,385
8 diaspora* 12,741
9 LightGBM 12,608
10 nni 9,769
11 modin 6,120
12 orbit-db 5,766
13 ipfs 5,464
14 H2O 5,388
15 optuna 4,690
16 Hazelcast 4,393
17 qTox 3,607
18 Hub 3,285
19 Zoneminder 3,234
20 Crate 3,107
21 FluidFramework 3,067
22 PowerJob 2,831
23 lingvo 2,245