Distributed

Top 23 Distributed Open-Source Projects

  • tensorflow

    An Open Source Machine Learning Framework for Everyone

    Project mention: TensorFlow-metal on Apple Mac is junk for training | news.ycombinator.com | 2024-01-16

  • Ray

    Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Project mention: Open Source Advent Fun Wraps Up! | dev.to | 2024-01-05

    22. Ray | Github | tutorial

  • Milvus

    A cloud-native vector database, storage for next generation AI applications

    Project mention: Ask HN: Who is hiring? (April 2024) | news.ycombinator.com | 2024-04-01

    Zilliz (zilliz.com) | Hybrid/ONSITE (SF, NYC) | Full-time

    I am part of the hiring team for DevRel

    NYC - https://boards.greenhouse.io/zilliz/jobs/4307910005

    SF - https://boards.greenhouse.io/zilliz/jobs/4317590005

    Zilliz is the company behind Milvus (https://github.com/milvus-io/milvus), the most starred vector database on GitHub. Milvus is a distributed vector database that shines in 1B+ vector use cases. Examples include autonomous driving, e-commerce, and drug discovery. (and, of course, RAG)

    We are also hiring for other roles whose hiring process I am not personally involved in, such as product managers, software engineers, and recruiters.

  • Nextcloud

    ☁️ Nextcloud server, a safe home for all your data

    Project mention: Happy 20th Anniversary, Gmail. I'm Sorry I'm Leaving You | news.ycombinator.com | 2024-04-15

    It really is hard to leave Gmail when all of your data has been conveniently stored therein. This is one of Google's retention strategies and it is indeed brilliant.

    That said, there's a vast number of self-hosted alternatives like Stalwart Mail (email) [1], Immich (images) [2], NextCloud (Google Docs) [3], etc.

    [1] https://stalwa.rt

    [2] https://immich.app

    [3] https://nextcloud.com/

  • handson-ml

    ⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.

  • surrealdb

    A scalable, distributed, collaborative, document-graph database, for the realtime web

    Project mention: Task tracker application using NextJS and SurrealDB | dev.to | 2024-01-21

    In this article, I have shared how I have built a simple task-tracking full-stack application using NextJS and SurrealDB.

  • TDengine

    TDengine is an open source, high-performance, cloud native time-series database optimized for Internet of Things (IoT), Connected Cars, Industrial IoT and DevOps.

    Project mention: TDengine: NEW Data - star count:22190.0 | /r/algoprojects | 2023-11-14

  • Redisson

    Redisson - Easy Redis Java client with features of In-Memory Data Grid. Sync/Async/RxJava/Reactive API. Over 50 Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, RPC, local cache ...

  • Phoenix

    Peace of mind from prototype to production

    Project mention: Idempotent seeds in Elixir | dev.to | 2024-03-14

    A standard Phoenix app contains a priv/repo/seeds.exs script file, which populates a database when it is run, so that developers can work with a conveniently prepared environment.

  • dgraph

    The high-performance database for modern applications

    Project mention: DGraph – GraphQL Database | news.ycombinator.com | 2024-03-12

  • Bit

    A build system for development of composable software.

    Project mention: Theming using CSS Variables? Turn Them into VS Code Snippets for Faster, Error-Free Coding | dev.to | 2024-04-14

    Our demo solution was built using Bit, which allows us to create shareable components, render component “previews,” generate component docs, and so on.

  • CNTK

    Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

  • LightGBM

    A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

    Project mention: SIRUS.jl: Interpretable Machine Learning via Rule Extraction | /r/Julia | 2023-06-29

    SIRUS.jl is a pure Julia implementation of the SIRUS algorithm by Bénard et al. (2021). The algorithm is a rule-based machine learning model, meaning that it is fully interpretable. It works by first fitting a random forest and then converting this forest to rules. Furthermore, the algorithm is stable and achieves a predictive performance that is comparable to LightGBM, a state-of-the-art gradient boosting model created by Microsoft. Interpretability, stability, and predictive performance are described in more detail below.

  • nni

    An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

  • diaspora*

    A privacy-aware, distributed, open source social network.

    Project mention: Diaspora is a decentralized, federated alternative to Facebook that anyone can join and contribute to | /r/InnerNet | 2023-12-07

  • NebulaGraph Database

    A distributed, fast open-source graph database featuring horizontal scalability and high availability (by vesoft-inc)

  • optuna

    A hyperparameter optimization framework

    Project mention: Optuna – A Hyperparameter Optimization Framework | news.ycombinator.com | 2024-04-06

    I didn’t even know WandB did hyperparameter optimization, I figured it was a neural network visualizer based on 2 minute papers. Didn’t seem like many alternatives out there to Optuna with TPE + persistence in conditional continuous & discrete spaces.

    Anyway, it’s doable to make a multi objective decide_to_prune function with Optuna, here’s an example https://github.com/optuna/optuna/issues/3450#issuecomment-19...

  • modin

    Modin: Scale your Pandas workflows by changing a single line of code

    Project mention: The Distributed Tensor Algebra Compiler (2022) | news.ycombinator.com | 2023-06-15

  • orbitdb

    Peer-to-Peer Databases for the Decentralized Web

    Project mention: OrbitDB reaches version 1.0 after 8 years of development | news.ycombinator.com | 2023-09-19

  • oceanbase

    OceanBase is an enterprise distributed relational database with high availability, high performance, horizontal scalability, and compatibility with SQL standards.

    Project mention: Show HN: OceanBase – An open-source distributed SQL database written in C++ | news.ycombinator.com | 2023-05-23

  • H2O

    H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

    Project mention: Really struggling with open source models | /r/LocalLLaMA | 2023-07-12

    I would use H2O if I were you. You can try out LLMs with a nice GUI. Unless you have some familiarity with the tools needed to run these projects, it can be frustrating. https://h2o.ai/

  • Apache Storm

    Apache Storm

  • PowerJob

    Enterprise job scheduling middleware with distributed computing ability.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-04-15.

Index

What are some of the best open-source Distributed projects? This list will help you:

#    Project                Stars
1    tensorflow             182,173
2    Ray                    30,879
3    Milvus                 26,490
4    Nextcloud              25,448
5    handson-ml             25,085
6    surrealdb              25,081
7    TDengine               22,764
8    Redisson               22,677
9    Phoenix                20,545
10   dgraph                 20,030
11   Bit                    17,528
12   CNTK                   17,435
13   LightGBM               16,025
14   nni                    13,708
15   diaspora*              13,340
16   NebulaGraph Database   10,088
17   optuna                 9,583
18   modin                  9,453
19   orbitdb                8,103
20   oceanbase              7,302
21   H2O                    6,705
22   Apache Storm           6,526
23   PowerJob               6,446