InfluxDB 3 OSS is now GA. Transform, enrich, and act on time series data directly in the database. Automate critical tasks and eliminate the need to move data externally. Download now. Learn more →
Top 23 Distributed Open-Source Projects
-
Project mention: None of the top 10 projects in GitHub is actually a software project 🤯 | dev.to | 2025-05-10
We see an addition to the AI community with AutoGPT. Along with Tensorflow they represent the AI community in the software category, which is getting relevant (2 out of 8). We can expect in the future to have new AI projects in the top 25 such as Transformers or Ollama (currently top 34 and 36, respectively).
-
Project mention: Show HN: Hacker News historic upvote and score data | news.ycombinator.com | 2025-06-03
-
Ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Project mention: My personal favorite MCP server which has became part of my life | dev.to | 2025-05-27
GitHub: github.com/ray-project/ray (Ray Serve is part of Ray)
-
Milvus
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
Project mention: Milvus and Late Chunking: What I Learned About Context-Aware Embedding in RAG | dev.to | 2025-06-11
After embedding, I stored the chunk vectors in Milvus. I compared its native ANN search against a brute-force cosine similarity scan. Both approaches returned identical top-3 matches for queries like "What are new features in milvus 2.4.13". This gave me high confidence in Milvus’s indexing fidelity.
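The brute-force side of that comparison can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the post author's code and not the Milvus API; the function names, the toy two-dimensional vectors, and the `same_top_k` helper are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def brute_force_top_k(query, vectors, k=3):
    """Score every stored vector against the query and return the k best indices."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    # Highest similarity first; ties broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


def same_top_k(exact_ids, ann_ids):
    """True when an ANN result contains the same ids as the exact scan."""
    return set(exact_ids) == set(ann_ids)


# Hypothetical chunk embeddings (real embeddings have hundreds of dimensions).
chunks = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
exact = brute_force_top_k([1.0, 0.0], chunks, k=3)
```

An exact scan like this costs one similarity computation per stored vector, which is why it works as a ground-truth check on small collections while an ANN index is used for the real queries.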
-
LocalAI
The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference
Project mention: Nvidia on NixOS WSL – Ollama up 24/7 on your gaming PC | news.ycombinator.com | 2025-04-10
If you're going to run Ollama in Windows anyway, why not use the native build? And if you want to use WSL, then I'd suggest using something like LocalAI which gives you a lot more control and support for additional formats (GGML, GGUF, GPTQ, ONNX, etc).
https://github.com/mudler/LocalAI
-
Indie hackers also leverage collaboration tools like Nextcloud for file sharing and team projects, and Mattermost or Rocket.Chat as self-hosted alternatives to Slack. These tools empower remote teams and foster efficient communication across diverse development projects.
-
Project mention: SurrealDB 2.2: Benchmarking, graph path algorithms and foreign key constraints | dev.to | 2025-03-17
To make this better, we've created a language testing suite similar to the ECMAscript conformance testing suite test262.
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
-
1️⃣1️⃣ Practical Machine Learning 📈 📌 https://github.com/ageron/handson-ml Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow.
-
TDengine
High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios
Project mention: Why SSDLC needs static analysis: a case study of 190 bugs in TDengine | dev.to | 2025-05-12
We'll continue examining the TDengine project, which we've covered in three small notes on code refactoring:
-
Redisson
Redisson - Valkey & Redis Java client. Real-Time Data Platform. Sync/Async/RxJava/Reactive API. Over 50 Valkey and Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Bloom filter, Spring, Tomcat, Scheduler, JCache API, Hibernate, RPC, local cache..
Project mention: Feature Comparison: Reliable Queue vs. Valkey and Redis Stream | dev.to | 2025-05-15
In the final verdict, Reliable Queue is the more durable and feature-rich option. Standard Valkey/Redis streams will suffice for smaller applications, but Reliable Queue provides the enterprise-grade reliability that businesses depend on. To learn more, visit the Redisson PRO website today.
-
Invisible Threads is built with Elixir, Phoenix, and most importantly, Postmark. Data lives on disk instead of a traditional database to keep the demo light. Authentication uses Postmark API tokens, mapping each application user directly to a Postmark server. The whole thing is deployed to Fly.io. A minimal setup let me focus on Postmark's offerings.
-
Project mention: Automatically Generate REST and GraphQL APIs From Your Database | dev.to | 2024-12-19
Dgraph
-
Bit
AI-powered development workspaces with reusable components, architectural clarity and zero overhead.
As part of my job, recently I'm working on integrating Vite (also Vitest) into a dev tool called Bit, which originally uses webpack in most of the cases. Basically, Bit is a component-driven development tool for various frontend frameworks and Node.js. In Bit, everything is a component and eventually consumed as an npm package. So technically, you would deal with all kinds of components as packages in your node_modules folder, whatever they are in CJS or ESM, need to be further transformed or not.
-
GitHub Source Code: CNTK
-
LightGBM
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
-
Project mention: Ask HN: Organize local communities without Facebook? | news.ycombinator.com | 2025-01-21
* Look into Diaspora. (https://diasporafoundation.org/). Upside: It's basically a self-hosted facebook. Really cool project. Downside: Unlike facebook, there's no fake/pushed content so it tended to feel stale.
* Look into hosting a forum (e.g. phpBB). Forums are excellent because they don't lose old information like facebook does. When someone says "Hey what's the policy on dogs?" three years later I can search "dogs" and find the answer. Downside: They're not pretty, not full of pictures and no infinite scrollingz. sadge alfababies.
* IRC chat. I hosted an IRC group for several years at work and it worked great. We only killed it when we decided to move to an enterprise communication app.
-
Project mention: Intro to Machine Learning: A Practical Guide for Curious Coders | dev.to | 2025-05-12
Hyper‑parameter tuning: GridSearchCV / RandomizedSearchCV or advanced tools like Optuna.
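The idea behind grid search can be sketched without any ML library. The snippet below is a plain exhaustive search written with the standard library; it is not scikit-learn's `GridSearchCV` or Optuna's API, and the `toy_score` objective and parameter names are hypothetical stand-ins for a real cross-validated model score.

```python
import itertools


def grid_search(objective, param_grid):
    """Evaluate every parameter combination and keep the highest-scoring one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, max_depth):
    """Hypothetical objective that peaks at learning_rate=0.1, max_depth=4."""
    return -((learning_rate - 0.1) ** 2) - (max_depth - 4) ** 2


grid = {"learning_rate": [0.01, 0.1, 1.0], "max_depth": [2, 4, 8]}
best_params, best_score = grid_search(toy_score, grid)
```

The number of evaluations is the product of the list lengths (nine here), which is the cost that randomized search and samplers such as Optuna's are meant to reduce.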
-
NebulaGraph Database
A distributed, fast open-source graph database featuring horizontal scalability and high availability (by vesoft-inc)
-
Project mention: Every System is a Log: Avoiding coordination in distributed applications | news.ycombinator.com | 2025-01-24
There’s also OrbitDB https://github.com/orbitdb/orbitdb which to my understanding has been a pioneer for p2p logs, databases and CRDTs.
-
H2O
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Distributed discussion
Distributed related posts
-
Invisible Threads: Group email without the exposure
-
Hatchet
-
How to Build a Streaming Deduplication Pipeline with Kafka, GlassFlow, and ClickHouse
-
Optimizing ML Training with Metagradient Descent
-
Show HN: SuperMassive – Distributed scalable key-value database in 100% GO
-
SurrealDB 2.2: Benchmarking, graph path algorithms and foreign key constraints
-
Show HN: Sample NCSA Log Generator
-
A note from our sponsor - InfluxDB
www.influxdata.com | 14 Jun 2025
Index
What are some of the best open-source Distributed projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | tensorflow | 190,341 |
2 | ClickHouse | 41,164 |
3 | Ray | 37,508 |
4 | Milvus | 35,322 |
5 | LocalAI | 33,172 |
6 | Nextcloud | 29,788 |
7 | surrealdb | 29,384 |
8 | handson-ml | 25,329 |
9 | TDengine | 23,948 |
10 | Redisson | 23,843 |
11 | Phoenix | 22,159 |
12 | dgraph | 20,923 |
13 | Bit | 18,104 |
14 | CNTK | 17,579 |
15 | LightGBM | 17,286 |
16 | diaspora* | 13,541 |
17 | optuna | 12,106 |
18 | NebulaGraph Database | 11,391 |
19 | modin | 10,190 |
20 | oneflow | 8,906 |
21 | orbitdb | 8,538 |
22 | PowerJob | 7,489 |
23 | H2O | 7,191 |