Top 23 Distributed Open-Source Projects
-
Ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
-
TDengine
TDengine is an open source, high-performance, cloud native time-series database optimized for Internet of Things (IoT), Connected Cars, Industrial IoT and DevOps.
-
Redisson
Redisson - Easy Redis Java client and Real-Time Data Platform. Sync/Async/RxJava/Reactive API. Over 50 Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, RPC, local cache ...
-
LightGBM
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
-
nni
An open-source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
-
NebulaGraph Database
A distributed, fast open-source graph database featuring horizontal scalability and high availability (by vesoft-inc)
-
oceanbase
OceanBase is an enterprise distributed relational database with high availability, high performance, horizontal scalability, and compatibility with SQL standards.
-
H2O
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Project mention: TensorFlow-metal on Apple Mac is junk for training | news.ycombinator.com | 2024-01-16
22. Ray | Github | tutorial
Zilliz (zilliz.com) | Hybrid/ONSITE (SF, NYC) | Full-time
I am part of the hiring team for DevRel
NYC - https://boards.greenhouse.io/zilliz/jobs/4307910005
SF - https://boards.greenhouse.io/zilliz/jobs/4317590005
Zilliz is the company behind Milvus (https://github.com/milvus-io/milvus), the most starred vector database on GitHub. Milvus is a distributed vector database that shines in 1B+ vector use cases. Examples include autonomous driving, e-commerce, drug discovery, and, of course, RAG.
We are also hiring for other roles, such as product managers, software engineers, and recruiters, though I am not personally involved in the hiring process for those.
Project mention: Happy 20th Anniversary, Gmail. I'm Sorry I'm Leaving You | news.ycombinator.com | 2024-04-15
It really is hard to leave Gmail when all of your data has been conveniently stored therein. This is one of Google's retention strategies and it is indeed brilliant.
That said, there's a vast number of self-hosted alternatives like Stalwart Mail (email) [1], Immich (images) [2], NextCloud (Google Docs) [3], etc.
[1] https://stalwa.rt
[2] https://immich.app
[3] https://nextcloud.com/
In this article, I have shared how I have built a simple task-tracking full-stack application using NextJS and SurrealDB.
A standard Phoenix app contains a priv/repo/seeds.exs script file, which populates a database when it is run, so that developers can work with a conveniently prepared environment.
Project mention: Theming using CSS Variables? Turn Them into VS Code Snippets for Faster, Error-Free Coding | dev.to | 2024-04-14
Our demo solution was built using Bit, which allows us to create shareable components, render component “previews,” generate component docs, and so on.
Project mention: SIRUS.jl: Interpretable Machine Learning via Rule Extraction | /r/Julia | 2023-06-29
SIRUS.jl is a pure Julia implementation of the SIRUS algorithm by Bénard et al. (2021). The algorithm is a rule-based machine learning model, meaning that it is fully interpretable. It achieves this by first fitting a random forest and then converting this forest to rules. Furthermore, the algorithm is stable and achieves a predictive performance that is comparable to LightGBM, a state-of-the-art gradient boosting model created by Microsoft. Interpretability, stability, and predictive performance are described in more detail below.
Project mention: Diaspora is a decentralized, federated alternative to Facebook that anyone can join and contribute to | /r/InnerNet | 2023-12-07
Project mention: Optuna – A Hyperparameter Optimization Framework | news.ycombinator.com | 2024-04-06
I didn’t even know WandB did hyperparameter optimization, I figured it was a neural network visualizer based on 2 minute papers. Didn’t seem like many alternatives out there to Optuna with TPE + persistence in conditional continuous & discrete spaces.
Anyway, it’s doable to make a multi objective decide_to_prune function with Optuna, here’s an example https://github.com/optuna/optuna/issues/3450#issuecomment-19...
Project mention: OrbitDB reaches version 1.0 after 8 years of development | news.ycombinator.com | 2023-09-19
Project mention: Show HN: OceanBase – An open-source distributed SQL database written in C++ | news.ycombinator.com | 2023-05-23
I would use H2O if I were you. You can try out LLMs with a nice GUI. Unless you have some familiarity with the tools needed to run these projects, it can be frustrating. https://h2o.ai/
Distributed related posts
- Wasmcloud 1.0 Release Notes
- CNCF WasmCloud 1.0
- Turso + PHP - The LibSQL Client for PHP
- Distributed SQLite: Paradigm shift or hype?
- GreptimeDB: A fast and cost-effective alternative to InfluxDB
- Optuna – A Hyperparameter Optimization Framework
- Embeddable, Distributed In-Memory datastore compatible with Redis clients
Index
What are some of the best open-source Distributed projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | tensorflow | 182,323 |
2 | Ray | 30,988 |
3 | Milvus | 26,645 |
4 | Nextcloud | 25,494 |
5 | handson-ml | 25,090 |
6 | surrealdb | 25,126 |
7 | TDengine | 22,789 |
8 | Redisson | 22,706 |
9 | Phoenix | 20,558 |
10 | dgraph | 20,046 |
11 | Bit | 17,546 |
12 | CNTK | 17,435 |
13 | LightGBM | 16,043 |
14 | nni | 13,726 |
15 | diaspora* | 13,341 |
16 | NebulaGraph Database | 10,114 |
17 | optuna | 9,615 |
18 | modin | 9,465 |
19 | orbitdb | 8,114 |
20 | oceanbase | 7,340 |
21 | H2O | 6,721 |
22 | Apache Storm | 6,532 |
23 | PowerJob | 6,457 |