C++ Distributed Computing

Open-source C++ projects categorized as Distributed Computing

Top 4 C++ Distributed Computing Projects

  • GitHub repo lizardfs

    LizardFS is an Open Source Distributed File System licensed under GPLv3.

    Project mention: cloud storage "merged" on multiple VPSes | reddit.com/r/linuxquestions | 2021-08-24

    Have a look at https://github.com/lizardfs/lizardfs; perhaps it is what you want.

  • GitHub repo nebula

    A distributed block-based data storage and compute engine (by varchar-io)

    Project mention: Streaming multi-file SQL and CSV/TSV/etc., native/WASM and fastest CSV parser | news.ycombinator.com | 2022-01-14

    Cool. I also hand-crafted a CSV parser following RFC 4180 a while ago. Do you have a repeatable way to benchmark the performance difference?

    https://github.com/varchar-io/nebula/blob/master/src/storage...

  • GitHub repo frovedis

    Framework of vectorized and distributed data analytics

    Project mention: NEC’s Forgotten FPUs | news.ycombinator.com | 2021-09-03

    All good questions.

    1) It is a custom instruction set; you can read the ISA guide at https://www.hpc.nec/documentation

    2) The main difference, in simple terms, is that AVX instructions have a fixed vector length (4, 8, 16, etc.). With the SX, the vector length is flexible, so it can be 10, 4, or anything up to max_vlen (up to 256 on the latest models). Essentially, the idea is that a single instruction can replace a whole for loop. Without a good compiler, though, that means you have to rewrite your nested loops.

    3) There are currently two options when it comes to the compiler: you can use the proprietary NCC or NEC's open source LLVM fork. NCC is less compatible than GCC/Clang (modern C++17 in particular is problematic) but has a lot of advanced algorithms for rewriting your loops and vectorizing them automatically. The LLVM fork currently supports assembly instruction intrinsics, but they are still working on contributing better loop auto-vectorization to LLVM.

    4) Porting software so that it works is not terribly difficult, but getting it to perform very well can be quite a bit harder, depending on the type of workload. Since the scalar core is pretty standard, you can almost always take regular CPU code and get it running (unlike GPU code in general). If you don't leverage the vector processor, though, the performance you get will be nothing special, especially at 1.6 GHz. Most of the software made for it starts off as CPU code and is then modified with pragmas or some refactoring to get it running with good performance on the VE. In almost all cases the resulting code still runs on a CPU just fine. One example of a project that supports both in a single codebase is the Frovedis framework [1].

    I think the chip deserves a little more interest than it gets. It's one of the few accelerators that 1) you can buy today, right now, 2) has open source drivers [2], and 3) can run TensorFlow [3]. The lack of fp16 support really hurt it for deep learning, but it's like having a 1080 with 48 GB of RAM: there are still lots of interesting things you can do with that.

    [1]: https://github.com/frovedis/frovedis
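
The contrast in point 2 between fixed-width and flexible-length vector instructions can be sketched in plain C++. This is only an illustration under stated assumptions: it uses no AVX or NEC intrinsics, each inner loop stands in for a single vector instruction, and the names add_fixed, add_flexible, kFixedWidth, and kMaxVlen are hypothetical. kFixedWidth = 8 corresponds to eight 32-bit floats in a 256-bit AVX register, and kMaxVlen = 256 is the upper bound quoted in the comment above. Each function returns the number of simulated instructions it issued.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constants, for illustration only.
constexpr std::size_t kFixedWidth = 8;   // eight floats per 256-bit register
constexpr std::size_t kMaxVlen = 256;    // maximum flexible vector length

// Fixed-width style: process full chunks of kFixedWidth elements,
// then finish the leftover elements one at a time.
std::size_t add_fixed(const std::vector<float>& a,
                      const std::vector<float>& b,
                      std::vector<float>& out) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    out.resize(n);
    std::size_t ops = 0;
    std::size_t i = 0;
    for (; i + kFixedWidth <= n; i += kFixedWidth) {
        // Stands in for one fixed-width vector add.
        for (std::size_t k = 0; k < kFixedWidth; ++k) {
            out[i + k] = a[i + k] + b[i + k];
        }
        ++ops;
    }
    for (; i < n; ++i) {  // scalar tail
        out[i] = a[i] + b[i];
        ++ops;
    }
    return ops;
}

// Flexible-length style: each step picks a vector length of up to
// kMaxVlen elements, so the final partial chunk needs no scalar tail.
std::size_t add_flexible(const std::vector<float>& a,
                         const std::vector<float>& b,
                         std::vector<float>& out) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    out.resize(n);
    std::size_t ops = 0;
    for (std::size_t i = 0; i < n;) {
        const std::size_t vl = std::min(kMaxVlen, n - i);
        // Stands in for one vector add executed with length vl.
        for (std::size_t k = 0; k < vl; ++k) {
            out[i + k] = a[i + k] + b[i + k];
        }
        ++ops;
        i += vl;
    }
    return ops;
}
```

For 300 elements, the fixed-width version issues 37 full chunks plus 4 scalar tail steps, while the flexible version issues two operations of lengths 256 and 44.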

  • GitHub repo OSStreams

    Open-source, Cloud-native Streams

    Project mention: Lock-free, allocation-free, efficient thread pool | news.ycombinator.com | 2021-09-13

    Elastic Scheduling for Streaming Runtimes", https://www.scott-a-s.com/files/pldi2017_lf_elastic_scheduli.... The source code for the product implementation is now open source. Most is in https://github.com/IBMStreams/OSStreams/blob/main/src/cpp/SP... and https://github.com/IBMStreams/OSStreams/blob/main/src/cpp/SP....

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-01-14.

Index

What are some of the best open-source Distributed Computing projects in C++? This list will help you:

#  Project    Stars
1  lizardfs     843
2  nebula        98
3  frovedis      58
4  OSStreams     11