dmtcp VS frovedis

Compare dmtcp and frovedis and see how they differ.


DMTCP: Distributed MultiThreaded CheckPointing (by dmtcp)


Framework of vectorized and distributed data analytics (by frovedis)
              dmtcp                                      frovedis
Mentions      3                                          1
Stars         316                                        64
Growth        0.3%                                       -
Activity      7.9                                        9.0
Last commit   2 months ago                               19 days ago
Language      C++                                        C++
License       GNU General Public License v3.0 or later   BSD 2-clause "Simplified" License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.


Posts with mentions or reviews of dmtcp. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-01-06.


Posts with mentions or reviews of frovedis. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-09-03.
  • NEC’s Forgotten FPUs
    3 projects | 3 Sep 2021
    All good questions.

    1) It is a custom instruction set; you can read the ISA guide over at

    2) In simple terms, the main difference is that AVX instructions have a fixed vector length (4, 8, 16, etc.). With the SX the vector length is flexible, so it can be 10, 4, or anything up to max_vlen (up to 256 on the latest ones). Essentially, the idea is that a single instruction can replace a whole for loop. Without a good compiler, though, that means you have to rewrite your nested loops yourself.

    3) There are currently two options when it comes to the compiler: you can use the proprietary NCC or the open source LLVM fork that NEC has. NCC is less compatible than GCC/Clang (modern C++17 is particularly problematic) but has a lot of advanced algorithms for rewriting your loops and vectorizing them automatically. The LLVM fork currently supports assembly instruction intrinsics, but they are still working on contributing better loop auto-vectorization into LLVM.

    4) Getting ported software to run is not terribly difficult, but getting it to perform very well is quite a bit harder, depending on the type of workload. Since the scalar core is pretty standard, you can almost always take regular CPU code and get it running (unlike GPU code in general). If you don't leverage the vector processor, though, the performance you get will be nothing special, especially at 1.6 GHz. Most of the software made for it starts off as CPU code and is then modified with pragmas or some refactoring to get it running with good performance on the VE. In almost all cases the resulting code still runs on a CPU just fine. One example of a project that supports both in a single code base is the Frovedis framework [1].

    I think the chip deserves a little more interest than it gets. It's one of the few accelerators that 1) you can buy today, right now, 2) has open source drivers [2], and 3) can run TensorFlow [3]. The lack of fp16 support really hurt it for deep learning, but it's like having a 1080 with 48 GB of RAM: there are still lots of interesting things you can do with that.
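Point 2 of the comment above contrasts fixed-width vector instructions with a flexible vector length. The following is a rough sketch of that difference in plain C++, not real AVX or SX-Aurora code. The helper `vector_add` is a hypothetical stand-in for one vector instruction (emulated here with a scalar loop), and `add_fixed_width` and `add_variable_length` are illustrative names. Each function returns how many such "vector instructions" it issued.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ONE vector instruction that processes `vl`
// elements. Real hardware would do this in a single instruction; this
// emulation just loops.
static void vector_add(const float* a, const float* b, float* out, std::size_t vl) {
    for (std::size_t i = 0; i < vl; ++i) out[i] = a[i] + b[i];
}

// Fixed-width style: only full chunks of exactly `width` elements are
// vectorized; leftover elements fall through to a scalar tail loop.
std::size_t add_fixed_width(const std::vector<float>& a, const std::vector<float>& b,
                            std::vector<float>& out, std::size_t width) {
    assert(width > 0 && a.size() == b.size() && out.size() == a.size());
    const std::size_t n = a.size();
    std::size_t i = 0, issued = 0;
    for (; i + width <= n; i += width) {
        vector_add(a.data() + i, b.data() + i, out.data() + i, width);
        ++issued;
    }
    for (; i < n; ++i) out[i] = a[i] + b[i];  // scalar remainder
    return issued;
}

// Variable-length style: each step sets the vector length to whatever is
// left, capped at `max_vlen`, so no separate tail loop is needed.
std::size_t add_variable_length(const std::vector<float>& a, const std::vector<float>& b,
                                std::vector<float>& out, std::size_t max_vlen) {
    assert(max_vlen > 0 && a.size() == b.size() && out.size() == a.size());
    const std::size_t n = a.size();
    std::size_t issued = 0;
    for (std::size_t i = 0; i < n;) {
        const std::size_t vl = std::min(max_vlen, n - i);
        vector_add(a.data() + i, b.data() + i, out.data() + i, vl);
        ++issued;
        i += vl;
    }
    return issued;
}
```

With 10 elements, a width of 8 leaves two elements for the scalar tail, whereas a maximum vector length of 256 (the figure quoted in the comment) covers the whole loop in a single step with the vector length set to 10.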


What are some alternatives?

When comparing dmtcp and frovedis you can also consider the following projects:

h5cpp - C++17 templates between [stl::vector | armadillo | eigen3 | ublas | blitz++] and HDF5 datasets

geni - A Clojure dataframe library that runs on Spark

ve_drv-kmod - SX-Aurora TSUBASA Vector Engine device driver kernel module

data-science-ipython-notebooks - Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

interpret - Fit interpretable models. Explain blackbox machine learning.

libgrape-lite - 🍇 A C++ library for parallel graph processing (GRAPE) 🍇

faasm - High-performance stateful serverless runtime based on WebAssembly

amgcl - C++ library for solving large sparse linear systems with algebraic multigrid method

Kratos - Kratos Multiphysics (A.K.A Kratos) is a framework for building parallel multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos has BSD license and is written in C++ with extensive Python interface.

ravel - Ravel MPI trace visualization tool

tensorflow - TensorFlow for SX-Aurora TSUBASA forked from

timemory - Modular C++ Toolkit for Performance Analysis and Logging. Profiling API and Tools for C, C++, CUDA, Fortran, and Python. The C++ template API is essentially a framework to creating tools: it is designed to provide a unifying interface for recording various performance measurements alongside data logging and interfaces to other tools.