Top 10 C++ Distributed System Projects
-
xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
Project mention: PSA: You don't need fancy stuff to do good work. | reddit.com/r/datascience | 2023-05-09
Finally, when it comes to building models and making predictions, Python and R have a plethora of options available. Libraries like scikit-learn, statsmodels, and TensorFlow in Python, or caret, randomForest, and xgboost in R, provide powerful machine learning algorithms and statistical models that can be applied to a wide range of problems. What's more, these libraries are open-source and have extensive documentation and community support, making it easy to learn and apply new techniques without needing specialized training or expensive software licenses.
-
NebulaGraph Database
A distributed, fast open-source graph database featuring horizontal scalability and high availability (by vesoft-inc)
A NoSQL graph database is a type of non-relational, distributed database which employs a graph model. NoSQL stands for “Not only SQL” and refers to a new breed of databases that differ from traditional relational databases in their data model and performance. Graph databases are especially useful for data associated with relationships, covering everything from friendships on social networks to equipment supply chains or business processes. They can quickly traverse vast amounts of linked data points to discover insights and hidden connections between entities, making them ideal for network analysis (such as financial fraud detection, recommendation engines, and many other use cases), all while performing at scale.
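To make the idea of traversing linked data points concrete, here is a minimal sketch of a breadth-first search that counts the hops between two entities. It is not NebulaGraph's API or query language; the `Graph` alias and the `hops_between` function are hypothetical names chosen for illustration, and the graph lives in a single process's memory rather than in a distributed store.

```cpp
#include <queue>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical in-memory graph: each entity maps to the entities it links to.
using Graph = std::unordered_map<std::string, std::vector<std::string>>;

// Returns the number of hops on a shortest path from `start` to `target`,
// or -1 if `target` cannot be reached. Plain breadth-first search.
int hops_between(const Graph& graph, const std::string& start,
                 const std::string& target) {
    if (start == target) return 0;

    std::unordered_set<std::string> visited;
    visited.insert(start);

    std::queue<std::pair<std::string, int>> frontier;
    frontier.push(std::make_pair(start, 0));

    while (!frontier.empty()) {
        std::pair<std::string, int> current = frontier.front();
        frontier.pop();

        Graph::const_iterator it = graph.find(current.first);
        if (it == graph.end()) continue;  // entity has no outgoing links

        for (const std::string& next : it->second) {
            if (next == target) return current.second + 1;
            // insert() reports whether `next` was newly added to the set.
            if (visited.insert(next).second) {
                frontier.push(std::make_pair(next, current.second + 1));
            }
        }
    }
    return -1;
}
```

In a fraud-detection setting, a small hop count between two accounts could serve as one signal of a hidden connection. A real graph database would run such a traversal across partitions and with indexes, which this sketch does not attempt.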
-
pixie
Project mention: Lens Dashboard for monitoring multiple AKS/EKS/... clusters | reddit.com/r/kubernetes | 2023-05-25
Plenty of paid monitoring solutions out there. Instana is pretty slick. NewRelic has a new open source tool, https://github.com/pixie-io/pixie
-
service-fabric
Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
The first on the list is the entry project, .sfproj. It uses a non-SDK-style project template with a bunch of .xml files for configuration but no C# code, and it requires the Fabric.MSBuild NuGet package to build and package Service Fabric apps. Unfortunately, the dotnet add package command won't update its dependencies, since the non-SDK-style project template uses the packages.config file to manage dependencies.
-
curve
Curve is a sandbox project hosted by the CNCF. It's cloud-native, high-performance, and easy to operate. Curve is an open-source distributed storage system for block and shared file storage. (by opencurve)
-
lizardfs
-
v6d
Brief Overview for any interested: Vineyard (v6d) is an in-memory immutable data manager that provides out-of-the-box high-level abstraction and zero-copy in-memory sharing for distributed data in big data tasks, such as graph analytics (e.g., GraphScope), numerical computing (e.g., Mars), and machine learning.
-
nebula
-
ScaleStore
This is the source code for our paper (Tobias Ziegler, Carsten Binnig, and Viktor Leis) published at SIGMOD '22: ScaleStore: A Fast and Cost-Efficient Storage Engine using DRAM, NVMe, and RDMA.
Project mention: The end of a myth: Distributed transactions can scale | news.ycombinator.com | 2023-04-10
The linked blog post at the top of this article - https://muratbuffalo.blogspot.com/2023/01/is-scalable-oltp-i... - provides graphics that give extremely useful context. And here's the repo for the paper that discusses: https://github.com/DataManagementLab/ScaleStore
The idea that one of many writer-compute-nodes can literally reach into a memory buffer that is shared across machines, atomically flip some lock bits and propagate some cache-coherence messages, and use that to build a multi-writer distributed database without needing to partition (and where any writer-compute-node can handle any message, so you can just round-robin a firehose of messages at them)... and that there's a chance (though not yet implemented) that one could implement ACID on top of this? It's absolute madness, and wildly exciting.
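As a rough, single-machine analogy for "atomically flip some lock bits", the sketch below sets and clears a lock bit in a shared word with a compare-and-swap. It is not ScaleStore's code and involves no RDMA; `kLockedBit`, `try_lock`, and `unlock` are hypothetical names, and the assumption that bit 0 holds the lock flag is made purely for illustration.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical latch layout: bit 0 of the word is the "locked" flag.
constexpr std::uint64_t kLockedBit = 1;

// Tries to set the lock bit with a single compare-and-swap.
// Returns true only if this caller acquired the latch.
bool try_lock(std::atomic<std::uint64_t>& word) {
    std::uint64_t expected = word.load(std::memory_order_relaxed);
    if (expected & kLockedBit) return false;  // someone else holds it
    return word.compare_exchange_strong(expected, expected | kLockedBit,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed);
}

// Clears the lock bit; only the current holder should call this.
void unlock(std::atomic<std::uint64_t>& word) {
    word.fetch_and(~kLockedBit, std::memory_order_release);
}
```

In the system the comment describes, the word would sit in memory reachable from other machines and the swap would be issued remotely; here the same compare-and-swap pattern runs on an ordinary `std::atomic` so that the logic can be read in isolation.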
-
C++ Middleware Writer
The repo contains library code to support messaging and serialization. There are also two programs in the repo that are needed to use the CMW.
in one of my programs. I'm thinking about changing it to:
C++ Distributed Systems related posts
- Has anyone here had experience using Vineyard?
- NAS on a cluster
- I made an app that lets you search up (almost) anything Destiny has ever said in his streams
- XGBoost Save and Load Error
- Is there an S3 to Cloud Storage (Google drive/Dropbox etc) solution?
- For XGBoost (in Amazon SageMaker), one of the hyper parameters is num_round, for number of rounds to train. Does this mean cross validation?
- Manticore: a faster alternative to Elasticsearch in C++ with a 21-year history
-
A note from our sponsor - Sonar
www.sonarsource.com | 31 May 2023
Index
What are some of the best open-source Distributed System projects in C++? This list will help you:
# | Project | Stars |
---|---|---|
1 | xgboost | 24,180 |
2 | NebulaGraph Database | 9,099 |
3 | pixie | 4,596 |
4 | service-fabric | 2,968 |
5 | curve | 1,892 |
6 | lizardfs | 918 |
7 | v6d | 718 |
8 | nebula | 133 |
9 | ScaleStore | 61 |
10 | C++ Middleware Writer | 48 |