Top 23 Distributed Open-Source Projects
An Open Source Machine Learning Framework for Everyone
Project mention: Google: Quietly Killing It in 2021 (YTD performance vs social media chatter) | reddit.com/r/investing | 2021-06-16
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit (by microsoft)
Peace of mind from prototype to production
Project mention: My Journey Into Elixir | dev.to | 2021-06-03
That's how my journey with Elixir began. It has been a slow journey, but I am currently learning the Phoenix framework for building web applications, for a project I am planning to undertake.
Redisson - Redis Java client with features of In-Memory Data Grid. Over 50 Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Publish / Subscribe, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, MyBatis, RPC, local cache ...
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Project mention: Ray 1.4.0 | news.ycombinator.com | 2021-06-08
Native GraphQL Database with graph backend
Project mention: Need help in choosing a database - Postgres or BadgerDB | reddit.com/r/Database | 2021-05-08
Dgraph is a highly scalable, very fast distributed graph database built on top of Badger. For consensus, it uses the Raft protocol. (Git repo: https://github.com/dgraph-io/dgraph)
☁️ Nextcloud server, a safe home for all your data
Project mention: Safest way to have my notes accessible anywhere? | reddit.com/r/privacy | 2021-06-17
Maybe Nextcloud? It has built-in Notes functions which you can use via the app or browser.
A privacy-aware, distributed, open source social network.
Project mention: What decent alternatives to Facebook are there on the social media market? | reddit.com/r/facebook | 2021-06-13
There's the diaspora* project which is decentralized and focuses more on user freedom and privacy. Having no central server means there's no single entity to shut it down or to be bought out.
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Project mention: Is it possible to clean memory after using a package that has a memory leak in my python script? | reddit.com/r/Python | 2021-04-29
I'm working on the AutoML Python package (GitHub repo). The package uses many different algorithms, and one of them is LightGBM. After training, the algorithm doesn't release its memory, even if del is called followed by gc.collect(). I opened an issue on the LightGBM GitHub repository. Because of this leak, memory consumption grows very quickly during training.
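One general workaround for a leak that del and gc.collect() cannot clear is to run the leaky step in a short-lived child process, because the operating system reclaims all of a process's memory when it exits. The sketch below uses only the Python standard library. The worker script and the run_isolated helper are hypothetical stand-ins for illustration; this is not the poster's solution and not a LightGBM fix.

```python
import json
import subprocess
import sys

# Hypothetical worker: stands in for a training step that leaks memory.
# It reads one argument, does some work, and prints a JSON result.
WORKER = """
import json, sys
n = int(sys.argv[1])
data = [float(i) for i in range(n)]
print(json.dumps({"mean": sum(data) / len(data)}))
"""


def run_isolated(n_rows):
    """Run the worker in a fresh Python process and return its JSON result.

    Whatever the worker allocates is released when that process exits,
    regardless of what the worker itself forgot to free.
    """
    proc = subprocess.run(
        [sys.executable, "-c", WORKER, str(n_rows)],
        capture_output=True,
        text=True,
        check=True,  # raise if the worker fails
    )
    return json.loads(proc.stdout)


if __name__ == "__main__":
    # The parent process stays small across repeated calls.
    for _ in range(3):
        print(run_isolated(1000))
```

The trade-off is process start-up time and the need to pass inputs and results across the process boundary (here as a command-line argument and JSON on stdout).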
An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Project mention: [D] Efficient ways of choosing number of layers/neurons in a neural network | reddit.com/r/statistics | 2021-04-20
optuna, hyperopt, nni, plenty of less-known tools too.
Modin: Speed up your Pandas workflows by changing a single line of code
Project mention: How to Speed Up Pandas with 1 Line of Code | reddit.com/r/Python | 2021-03-03
Peer-to-Peer Databases for the Decentralized Web
Project mention: How do I store mutualable data like Blogs and Comments on IPFS? | reddit.com/r/ipfs | 2021-06-13
I think the easiest would be to run a NodeJS API and use js-ipfs. Then simply upload the image to your API, and have the API upload it to the IPFS network with their JS library.
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
A hyperparameter optimization framework
Project mention: Do you often find hyperparam tuning does very little? | reddit.com/r/datascience | 2021-04-23
As for doing a full gridsearch, I recommend using a better strategy, e.g. bayesian optimization. Optuna is great for this.
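To illustrate why a full grid search is costly, the sketch below compares it with plain random search under a fixed trial budget. Note the substitution: random search is a simpler stand-in, not Bayesian optimization and not Optuna's API. The objective function and the parameter names lr and depth are hypothetical, chosen only for illustration.

```python
import itertools
import random


def objective(params):
    # Hypothetical validation loss with its minimum at lr=0.1, depth=6.
    return (params["lr"] - 0.1) ** 2 + 0.01 * (params["depth"] - 6) ** 2


def grid_search(lrs, depths):
    """Evaluate every combination; cost is len(lrs) * len(depths)."""
    trials = [{"lr": lr, "depth": d} for lr, d in itertools.product(lrs, depths)]
    return min(trials, key=objective), len(trials)


def random_search(n_trials, seed=0):
    """Evaluate a fixed budget of randomly sampled configurations."""
    rng = random.Random(seed)  # seeded for reproducibility
    trials = [
        {"lr": rng.uniform(0.001, 1.0), "depth": rng.randint(2, 12)}
        for _ in range(n_trials)
    ]
    return min(trials, key=objective), len(trials)


if __name__ == "__main__":
    best_grid, n_grid = grid_search([0.001, 0.01, 0.1, 1.0], [2, 4, 6, 8])
    best_rand, n_rand = random_search(20)
    print("grid:", n_grid, "evaluations, best", best_grid)
    print("random:", n_rand, "evaluations, best", best_rand)
```

The grid's cost multiplies with every added hyperparameter, while the sampled budget stays whatever you set. Bayesian optimizers go one step further by using the results of earlier trials to choose the next one.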
Open Source Streaming Data Platform
Project mention: My Awesome Collections of 200+ github repo | dev.to | 2021-06-03
qTox is a chat, voice, video, and file transfer IM client using the encrypted peer-to-peer Tox protocol.
Project mention: [Filter] Block "New! Messenger App for Windows" on Facebook Messenger | reddit.com/r/uBlockOrigin | 2021-05-11
Facebook zucks and I rate it 0/10, do not use if at all possible. Unfortunately, non-tech-savvy people don't tend to use alternatives. Best I can recommend at the moment is DeltaChat, since everyone's got email, or qTox for video and audio.
Fastest unstructured dataset management for TensorFlow/PyTorch. Stream data real-time & version-control it. http://activeloop.ai (by activeloopai)
Project mention: [N] Access Google Objectron (~1.92 TBs) in less than 5 seconds with Activeloop Hub | reddit.com/r/MachineLearning | 2021-05-04
Install Hub, the open-source package that converts computer vision datasets into cloud-native NumPy-like arrays and enables a few nifty features like streaming to PyTorch and TensorFlow, dataset version-control, collaboration, etc.
ZoneMinder is a free, open source Closed-circuit television software application developed for Linux which supports IP, USB and Analog cameras.
Project mention: What do you use your VMs for? | reddit.com/r/unRAID | 2021-04-29
CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time.
Project mention: Querying time series data with SQL: examples | dev.to | 2021-03-01
PS: If you liked this post... we'd really appreciate a ⭐️ on GitHub!
Library for building distributed, real-time collaborative web applications
Enterprise job scheduling middleware with distributed computing ability.
Project mention: PowerJob V3.4.3 has been released. Check to see the work. Suggestions are welcomed. | reddit.com/r/java | 2021-01-17
Oh yes! You can see the registered users in Known users. They are companies in China as we didn't promote to foreign friends. Cisco, Jd.com, OPPO are all big companies there in China.
What are some of the best open-source Distributed projects? This list will help you: