kokkos
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction (by kokkos)
kokkos-kernels
Kokkos C++ Performance Portability Programming Ecosystem: Math Kernels - Provides BLAS, Sparse BLAS and Graph Kernels (by kokkos)
Metric | kokkos | kokkos-kernels |
---|---|---|
Mentions | 4 | 1 |
Stars | 1,723 | 276 |
Growth | 3.0% | 3.3% |
Activity | 9.8 | 9.1 |
Last commit | 1 day ago | 4 days ago |
Language | C++ | C++ |
License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later |
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kokkos
Posts with mentions or reviews of kokkos.
We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-01-06.
- Requesting suggestions for languages, libraries, and architectures for parallel (and sometimes non parallel) numerical and scientific computations
I’m a novice user of Kokkos. It lets you write code once for OpenMP, CUDA, and other parallel execution backends. It was designed with scientific computing applications in mind. Some numerical tools are implemented in “Kokkos Kernels”; most of the BLAS operations are included, iirc.
- My first non-trivial project in C++ and MPI/OpenMP
I would suggest using a C++ abstraction around thread parallelism. This will make your code easier to read and more concise, and will also make it easier to switch between different thread-parallel programming models. Kokkos is a lovely example of such an abstraction, but there are others. Modern C++ even has thread-parallel standard algorithms. Bryce Adelstein Lelbach's CppCon 2021 talk describes these.
- Is there an OOP-wrapper library for cublas?
It’s a work in progress, but Kokkos and the associated Kokkos Kernels are probably the closest thing to what you’re asking for.
- pykokkos-base available in PyPI (numpy and cupy array interoperability)
Kokkos implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. It provides abstractions for both parallel execution of code and data management with a variety of backends including, but not limited to: CUDA, HIP, OpenMP, HPX, and Pthreads, with backends for OpenMPTarget and SYCL currently under development.
kokkos-kernels
Posts with mentions or reviews of kokkos-kernels.
We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-08-09.
- Is there an OOP-wrapper library for cublas?
It’s a work in progress, but Kokkos and the associated Kokkos Kernels are probably the closest thing to what you’re asking for.
What are some alternatives?
When comparing kokkos and kokkos-kernels, you can also consider the following projects:
RAJA - RAJA Performance Portability Layer (C++)
oneMKL - oneAPI Math Kernel Library (oneMKL) Interfaces
pykokkos - Performance portable parallel programming in Python.
mdspan - Reference implementation of mdspan targeting C++23
rocBLAS - Next generation BLAS implementation for ROCm platform
Taskflow - A General-purpose Parallel and Heterogeneous Task Programming System
kronmult993 - CPU and GPU implementations of kronmult.
kokkos-python - Python bindings for data interoperability with Kokkos (View, DynRankView)
cu - package cu provides an idiomatic interface to the CUDA Driver API.
stdBLAS - Reference Implementation for stdBLAS