Distributed

Open-source projects categorized as Distributed

Top 23 Distributed Open-Source Projects

  • tensorflow

    An Open Source Machine Learning Framework for Everyone

    Project mention: [D] Google quietly moving its products from Tensorflow to JAX | reddit.com/r/MachineLearning | 2022-06-19

    Most likely due to this issue. :)

  • handson-ml

    A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in python using Scikit-Learn and TensorFlow.

    Project mention: need a book recommendation for machine learning on python | reddit.com/r/learnpython | 2022-05-25

    Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow is often recommended. You can check out the GitHub repo first: https://github.com/ageron/handson-ml

  • Ray

    An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

    Project mention: preprocessing millions of records - how to speed up the processing | reddit.com/r/datascience | 2022-06-03

    Dask, Ray (ray.io), or PySpark (if you have a cluster)

  • Nextcloud

    ☁️ Nextcloud server, a safe home for all your data

    Project mention: ¿What Cloud Storage can I use? | reddit.com/r/PrivacyGuides | 2022-06-25

    pCloud is good, but if you want more control in case the company or server goes down, then I would self-host Nextcloud and keep good backups with the 3-2-1 backup method. This would be a "centralized server", but it's controlled by you, which makes it more private. If you are absolutely set on using distributed storage (which I really advise against doing), then look into Tahoe-LAFS.

  • Redisson

    Redisson - Redis Java client with features of In-Memory Data Grid. Over 50 Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Publish / Subscribe, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, MyBatis, RPC, local cache ...

    Project mention: Am I overlooking any potential issues that could arise from my implementation? | reddit.com/r/javahelp | 2022-05-28

    I came up empty-handed in my search for an alternative to Quartz for scheduling crons in a clustered/distributed environment. Redisson has a scheduler, but it came with its own issues:
    - https://github.com/redisson/redisson/issues/4020
    - https://github.com/redisson/redisson/issues/3991
    - https://github.com/redisson/redisson/issues/4321

  • dgraph

    Native GraphQL Database with graph backend

    Project mention: Open Source Databases in Go | reddit.com/r/golang | 2022-06-08

    dgraph - Scalable, Distributed, Low Latency, High Throughput Graph Database.

  • Phoenix

    Peace of mind from prototype to production

    Project mention: Front end web development without node/npm | reddit.com/r/webdev | 2022-06-23

    I've not used it, but it sounds like you're looking for something like Phoenix?

  • CNTK

    Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

    Project mention: Worldwide building footprints derived from satellite imagery from Microsoft | reddit.com/r/gis | 2022-05-20

  • Bit

    A tool for composable software development.

    Project mention: Online component builder | reddit.com/r/reactjs | 2022-06-25

  • LightGBM

    A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

    Project mention: Search YouTube from the terminal written in python | reddit.com/r/Python | 2022-02-28

    Microsoft lightGBM. https://github.com/microsoft/LightGBM

  • diaspora*

    A privacy-aware, distributed, open source social network.

    Project mention: Diaspora: The online social world where you are in control | news.ycombinator.com | 2022-06-19

  • nni

    An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

    Project mention: Automated Machine Learning (AutoML) - 9 Different Ways with Microsoft AI | dev.to | 2021-10-04

    For a complete tutorial, navigate to this Jupyter Notebook: https://github.com/microsoft/nni/blob/master/examples/notebooks/tabular_data_classification_in_AML.ipynb

  • micro

    API first development platform (by micro)

    Project mention: Anyone needs a (long-term) contributor for their open source project written in Go? | reddit.com/r/golang | 2022-05-30

    There is one project I know of that might fit your interests: https://github.com/micro/micro

  • nebula

    A distributed, fast open-source graph database featuring horizontal scalability and high availability (by vesoft-inc)

    Project mention: Nebula Graph v3.0.0 Release Note | dev.to | 2022-02-24

    Support backup and restore. https://github.com/vesoft-inc/nebula/pull/3469 https://github.com/vesoft-inc/nebula-agent/pull/1 https://github.com/vesoft-inc/nebula-br/pull/22

  • modin

    Modin: Scale your Pandas workflows by changing a single line of code

    Project mention: Modern Python Performance Considerations | news.ycombinator.com | 2022-05-05

  • orbit-db

    Peer-to-Peer Databases for the Decentralized Web

    Project mention: Ask HN: Is there a descentralized DB with a simple social conflict resolution? | news.ycombinator.com | 2022-05-17

    I've been thinking it might be practical to build a simple decentralized database, where agents just know each other, so conflict resolution does not need to be so strong and can rely on the social layer.

    I think this applies to most databases, but I'm particularly thinking of internal enterprise databases, some social networks, any federated database system, and different devices of a single user.

    I'm thinking of these features:

    1- Append-only (?), with a full history of operations. Deletes and edits do not remove data; they only modify the "active state"

    2- Agents are public keys or similar (DIDs?)

    3- Operations are signed, and receivers verify that the operation is valid and that the sender is allowed to make it

    4- Operations form a Merkle DAG (similar to git, they link to the tips of the current "active state", like a commit/merge in git)

    So far I think I've basically described [OrbitDB](https://github.com/orbitdb/orbit-db)

    Consensus is where things get really hard. [OrbitDB seems to use a last-write-wins CRDT](https://news.ycombinator.com/item?id=22920204), and although I don't know the details of OrbitDB, I think that for many simple use cases conflicts can just be resolved on the social layer. But I think we need to provide agents with good tools to resolve conflicts.

    I'll try my best here with some ideas:

    - When merging, we can order operations by their timestamps. If operations conflict, raise the conflict to the conflicting agents, or to someone with permission to resolve it.

    - If an agent publishes an operation that forks its own history, mark the agent as malicious or compromised and alert the other agents. This needs resolution on the social layer: you have proof of misconduct, because the agent has signed diverging operations.

    - Any operation becomes fully settled once you have proof that all agents in your system have referenced it, directly or indirectly, through newer operations.

    - Timestamps can be upgraded by using @opentimestamps to get proof that an operation existed at time X (this prevents creating operations in hindsight), though this does not prove the operation has been made public.
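
The ideas in the post above can be sketched in a few lines of Python. This is only an illustration and not how OrbitDB works: the `Op` and `Log` names are hypothetical, signatures are left out entirely (a real system would sign and verify every operation), and fork detection assumes that each agent's new operation descends from that agent's previous one.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One append-only operation (hypothetical structure, unsigned)."""
    agent: str            # stand-in for a public key or DID
    timestamp: float      # claimed by the agent, not proven
    payload: str
    parents: tuple = ()   # ids of the tips this operation was built on

    @property
    def id(self) -> str:
        # Content address: the hash covers the parents, so ops form a Merkle DAG.
        body = json.dumps([self.agent, self.timestamp, self.payload, sorted(self.parents)])
        return hashlib.sha256(body.encode()).hexdigest()


class Log:
    def __init__(self):
        self.ops = {}

    def add(self, op: Op) -> str:
        # Reject operations whose parents we have not seen yet.
        if any(p not in self.ops for p in op.parents):
            raise ValueError("unknown parent")
        self.ops[op.id] = op
        return op.id

    def ancestors(self, op_id: str) -> set:
        # All operations reachable by following parent links.
        seen, stack = set(), list(self.ops[op_id].parents)
        while stack:
            cur = stack.pop()
            if cur not in seen:
                seen.add(cur)
                stack.extend(self.ops[cur].parents)
        return seen

    def merged_order(self) -> list:
        # Order by claimed timestamp; the id breaks ties deterministically.
        return sorted(self.ops.values(), key=lambda o: (o.timestamp, o.id))

    def forking_agents(self) -> set:
        # An agent forked its own history if two of its ops are unrelated,
        # i.e. neither one is an ancestor of the other.
        by_agent = {}
        for op in self.ops.values():
            by_agent.setdefault(op.agent, []).append(op)
        forked = set()
        for agent, ops in by_agent.items():
            for i, a in enumerate(ops):
                for b in ops[i + 1:]:
                    if a.id not in self.ancestors(b.id) and b.id not in self.ancestors(a.id):
                        forked.add(agent)
        return forked

    def is_settled(self, op_id: str, agents) -> bool:
        # Settled once every agent has authored the op itself or a descendant of it.
        witnesses = {o.agent for o in self.ops.values()
                     if o.id == op_id or op_id in self.ancestors(o.id)}
        return set(agents) <= witnesses
```

A log built this way can report which agents have signed diverging histories (`forking_agents`) and which operations every participant has acknowledged (`is_settled`); actual conflict resolution is still left to the social layer, as the post suggests.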

  • ipfs

    IPFS implementation in JavaScript

    Project mention: Public CDNs Are Useless and Dangerous | news.ycombinator.com | 2022-06-09

    You could include js-ipfs[0] and fetch all your resources from IPFS without going through a gateway. However, this approach would make the site fully dependent on JavaScript.

    A PWA with a Service Worker could perhaps implement its own client-side "gateway", translating public gateway URLs into direct IPFS access. Without the Service Worker (or without JS) it would fall back to using the gateway.

    [0] https://js.ipfs.io/
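
The URL translation such a client-side "gateway" would need can be sketched as follows. This only shows the mapping logic, written in Python for illustration (a real Service Worker would be JavaScript); the function name `gateway_to_ipfs_uri` is hypothetical, and `bafyexample` in the usage note is a placeholder, not a real CID. It assumes the two common gateway URL shapes: path-style (`https://<gateway>/ipfs/<cid>/...`) and subdomain-style (`https://<cid>.ipfs.<gateway>/...`).

```python
from typing import Optional
from urllib.parse import urlsplit


def gateway_to_ipfs_uri(url: str) -> Optional[str]:
    """Map a public gateway URL to an ipfs:// or ipns:// URI, or None if it is not one."""
    parts = urlsplit(url)
    segments = parts.path.split("/")  # "/ipfs/x/y" -> ["", "ipfs", "x", "y"]

    if len(segments) >= 3 and segments[1] in ("ipfs", "ipns") and segments[2]:
        # Path-style gateway: the namespace and root are the first two path segments.
        scheme, root, rest = segments[1], segments[2], segments[3:]
    else:
        # Subdomain-style gateway: the root is the first host label.
        labels = (parts.hostname or "").split(".")
        if len(labels) >= 3 and labels[1] in ("ipfs", "ipns"):
            scheme, root, rest = labels[1], labels[0], segments[1:]
        else:
            return None  # not a gateway URL; the caller should fetch it normally

    uri = f"{scheme}://{root}"
    tail = "/".join(rest)
    if tail:
        uri += "/" + tail
    if parts.query:
        uri += "?" + parts.query
    return uri
```

For example, `gateway_to_ipfs_uri("https://ipfs.io/ipfs/bafyexample/css/site.css")` yields `ipfs://bafyexample/css/site.css`, while an ordinary URL yields `None`, which corresponds to the fallback case described in the post.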

  • optuna

    A hyperparameter optimization framework

    Project mention: Optuna: An open source hyperparameter optimization framework to automate hyperparameter search | reddit.com/r/mltraders | 2022-06-21

  • H2O

    H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

    Project mention: A Tiny Grammar of Graphics | news.ycombinator.com | 2022-06-14

  • scrapy-redis

    Redis-based components for Scrapy.

    Project mention: How can I clone a github project to offline machine ? | reddit.com/r/learnpython | 2022-04-02

    git clone https://github.com/darkrho/scrapy-redis.git
    cd scrapy-redis
    python setup.py install

  • Hazelcast

    Open-source distributed computation and storage platform

    Project mention: Show HN: Hazelcast 5 BETA – streaming+storage in one | news.ycombinator.com | 2021-07-16

  • oceanbase

    OceanBase is an enterprise distributed relational database with high availability, high performance, horizontal scalability, and compatibility with SQL standards.

    Project mention: OceanBase | reddit.com/r/devopspro | 2021-12-19

  • qTox

    qTox is a chat, voice, video, and file transfer IM client using the encrypted peer-to-peer Tox protocol.

    Project mention: Chat control: Leaked Commission paper EU mass surveillance plans | reddit.com/r/europe | 2022-05-11

    Tox (anonymous too, if routed through Tor) [clients like qTox might not be the "sexiest", but they are small and can be used almost anywhere]

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-06-25.

Index

What are some of the best open-source Distributed projects? This list will help you:

Rank Project Stars
1 tensorflow 165,908
2 handson-ml 24,521
3 Ray 21,011
4 Nextcloud 19,457
5 Redisson 19,213
6 dgraph 18,144
7 Phoenix 18,098
8 CNTK 17,168
9 Bit 15,331
10 LightGBM 13,906
11 diaspora* 13,076
12 nni 11,620
13 micro 11,182
14 nebula 7,561
15 modin 7,518
16 orbit-db 6,989
17 ipfs 6,819
18 optuna 6,549
19 H2O 5,856
20 scrapy-redis 5,109
21 Hazelcast 4,895
22 oceanbase 4,377
23 qTox 4,098