Top 23 Distributed Open-Source Projects
-
Project mention: Las 10 Mejores Herramientas de Inteligencia Artificial de Código Abierto (The 10 Best Open-Source Artificial Intelligence Tools) | dev.to | 2024-08-21
-
Ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Project mention: Amazon's Exabyte-Scale Migration from Apache Spark to Ray on Amazon EC2 | news.ycombinator.com | 2024-07-29
Yeah, mmap, I think this is the relevant line [1].
Fun fact, very early on, we used to create one mmapped file per serialized object, but that very quickly broke down.
Then we switched to mmapping one large file at the start and storing all of the serialized objects in that file. But then as objects get allocated and deallocated, you need to manage the memory inside of that mmapped file, and we just repurposed a malloc implementation to handle that.
[1] https://github.com/ray-project/ray/blob/21202f6ddc3ceaf74fbc...
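The comment above describes mapping one large region up front and then running an allocator inside it. A minimal Python sketch of that idea follows. It assumes an anonymous mapping and a toy first-fit free list, not the malloc implementation Ray actually repurposed; the `Arena` class and its method names are hypothetical and exist only for illustration.

```python
import mmap


class Arena:
    """Toy first-fit allocator over a single mmap region (illustrative only)."""

    def __init__(self, size: int) -> None:
        self.buf = mmap.mmap(-1, size)  # one large anonymous mapping, created once
        self.free = [(0, size)]         # sorted list of (offset, length) holes
        self.used = {}                  # offset -> length of each live object

    def put(self, payload: bytes) -> int:
        """Copy payload into the first hole that fits and return its offset."""
        n = len(payload)
        if n == 0:
            raise ValueError("empty payloads are not supported in this sketch")
        for i, (off, length) in enumerate(self.free):
            if length >= n:
                if length == n:
                    del self.free[i]                      # hole consumed entirely
                else:
                    self.free[i] = (off + n, length - n)  # shrink hole from the front
                self.buf[off:off + n] = payload
                self.used[off] = n
                return off
        raise MemoryError("no hole large enough")

    def get(self, off: int) -> bytes:
        """Read back the object stored at the given offset."""
        return bytes(self.buf[off:off + self.used[off]])

    def delete(self, off: int) -> None:
        """Release an object and merge its space with adjacent holes."""
        n = self.used.pop(off)
        self.free.append((off, n))
        self.free.sort()
        merged = []
        for o, length in self.free:
            if merged and merged[-1][0] + merged[-1][1] == o:
                merged[-1] = (merged[-1][0], merged[-1][1] + length)
            else:
                merged.append((o, length))
        self.free = merged
```

Freed space is reused by later `put` calls, which is the bookkeeping that one file per object avoided but that a single shared mapping requires.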
-
-
Project mention: Ask HN: Lesser-known/underrated cool new web-oriented tech? | news.ycombinator.com | 2024-07-23
I've been surveying the space lately and I re/discovered some really powerful new-ish tech which woke up my tech taste buds and am now looking for more such "tasty" tech (sorry I guess I'm due for a meal soon :P)
Example as starters:
- Qwik and resumable web apps (https://qwik.dev/)
- SurrealDB, maximally flexible multi-model DB (https://surrealdb.com/)
There are others, but I'm trying to keep to the starkest examples and not to influence the discussion too much.
I do think this is the best place to ask such questions - I'm explicitly interested in cutting-edge tech, but the edge doesn't have to be excessively sharp ;).
-
Project mention: Ask HN: Is Nextcloud a Great Alternative to Dropbox/Google Drive for Startups? | news.ycombinator.com | 2024-09-22
In my opinion it’s not a good alternative if you or your team members expect exactly the same quality of service. When you switch to Nextcloud you’ll have to expect more bugs, less reliability, lower performance and obviously more maintenance (since it's typically self-hosted) compared to Google Drive, Dropbox or OneDrive. So you'll have to go into this with a different kind of mindset. What you gain is independence and extensibility thanks to a rather big platform ecosystem.
Here are some specific things you'll have to deal with, in no particular order, all of which I've run into recently.
- You'll have to educate people in your group that there are several different ways to share files with each other and that they can all coexist in parallel (Individual Shares vs. Group Shares vs. Group folders vs. Circles/Teams) (I wrote a German blog post on this: https://bitbetter.de/blog/nextcloud-freigabe-chaos/)
- Handling of file/folder names with special characters is a mess, e.g. if you have Windows and Linux clients there will almost certainly be conflicts. (Luckily this has been addressed recently by the `forbidden_filename_characters` config option, which is not yet enforced via the Web UI.) See https://github.com/nextcloud/ios/issues/2802
- Creating Nextcloud users with spaces in their names will break CalDAV on iOS devices (https://github.com/nextcloud/server/issues/15641)
- Nextcloud (aka Collabora) Office is very slow if you want to actually work collaboratively with it (no matter the power of your Collabora server) – unfortunately it's no match for Google Docs or Office 365
-
-
TDengine
High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios
Project mention: TDengine: Open-Source, High-Performance Time-Series DB for IoT and Cloud | news.ycombinator.com | 2024-08-14
-
Redisson
Redisson - Easy Redis Java client and Real-Time Data Platform. Valkey compatible. Sync/Async/RxJava/Reactive API. Over 50 Redis or Valkey based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Bloom filter, Spring, Tomcat, Scheduler, JCache API, Hibernate, RPC, local cache...
-
You've miraculously managed to install elixir, erlang, and friends on your Windows machine and you're ready to try out Phoenix. At some point in your tutorial you will be asked to run this command:
-
Dgraph — Distributed, fast graph database.
-
Project mention: Tools and libraries widely used in micro frontend architectures! | dev.to | 2024-08-09
-
GitHub Source Code: CNTK
-
LightGBM
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
-
Project mention: Diaspora is a decentralized, federated alternative to Facebook that anyone can join and contribute to | /r/InnerNet | 2023-12-07
-
NebulaGraph Database
A distributed, fast open-source graph database featuring horizontal scalability and high availability (by vesoft-inc)
-
Project mention: Optuna – A Hyperparameter Optimization Framework | news.ycombinator.com | 2024-04-06
I didn’t even know WandB did hyperparameter optimization; I figured it was a neural network visualizer, based on Two Minute Papers. There didn’t seem to be many alternatives to Optuna offering TPE plus persistence in conditional continuous and discrete spaces.
Anyway, it’s doable to write a multi-objective decide_to_prune function with Optuna; here’s an example: https://github.com/optuna/optuna/issues/3450#issuecomment-19...
-
-
-
H2O
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
-
-
-
-
Hazelcast
Hazelcast is a unified real-time data platform combining stream processing with a fast data store, allowing customers to act instantly on data-in-motion for real-time insights.
Distributed related posts
-
Multimodal Madness! Create a Product Recommender for Smart Shopping
-
Genetically synthesized supergain broadband wire-bundle antenna
-
Unified time series database for metrics, logs, and events written in Rust
-
EchoVault: Embeddable Redis Alternative in Go
-
Go Embeddable Redis Alternative
-
Enhancing the SQL Interval syntax: A story of Open Source contribution
-
OpenAI Acquires Rockset
-
Index
What are some of the best open-source Distributed projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | tensorflow | 185,741 |
2 | Ray | 33,194 |
3 | Milvus | 29,607 |
4 | surrealdb | 27,001 |
5 | Nextcloud | 26,844 |
6 | handson-ml | 25,166 |
7 | TDengine | 23,262 |
8 | Redisson | 23,238 |
9 | Phoenix | 21,255 |
10 | dgraph | 20,338 |
11 | Bit | 17,839 |
12 | CNTK | 17,500 |
13 | LightGBM | 16,562 |
14 | diaspora* | 13,389 |
15 | NebulaGraph Database | 10,675 |
16 | optuna | 10,633 |
17 | modin | 9,766 |
18 | orbitdb | 8,278 |
19 | H2O | 6,871 |
20 | PowerJob | 6,782 |
21 | Apache Storm | 6,589 |
22 | toydb | 6,122 |
23 | Hazelcast | 6,102 |