xsimd vs DOKSparse
Metric | xsimd | DOKSparse |
---|---|---|
Mentions | 3 | 2 |
Stars | 2,043 | 2 |
Growth | 1.4% | - |
Activity | 8.7 | 4.2 |
Last commit | 1 day ago | 10 months ago |
Language | C++ | Cuda |
License | BSD 3-clause "New" or "Revised" License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
xsimd
- GDlog: A GPU-Accelerated Deductive Engine
  https://github.com/xtensor-stack/xsimd
  GH topics > HashMap:
- SIMD intrinsics and the possibility of a standard library solution
  xsimd - 1.6K GH stars
- SPO600 project part 1
  I've decided to switch to something better, and after a few hours of searching, I found these repositories: NSIMD (https://github.com/agenium-scale/nsimd), FastDifferentialCoding (https://github.com/lemire/FastDifferentialCoding), Vc (https://github.com/VcDevel/Vc), and XSIMD (https://github.com/xtensor-stack/xsimd).
DOKSparse
- GDlog: A GPU-Accelerated Deductive Engine
- tensor.to_sparse() Memory Allocation
  If using sparse tensors is a must, you can look into the DOK (dictionary of keys) sparse format, which is supported for 2D matrices in SciPy. It more or less lets you access any element of the sparse tensor in constant time, which makes it possible to create your tensor directly in sparse format, skipping the need to create a dense NumPy array first. In case you need a GPU version of this, I have a library that implements a sparse DOK tensor in PyTorch and CUDA. Currently it's GPU only.
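To make the DOK idea from the quoted answer concrete, here is a minimal pure-Python sketch of a dictionary-of-keys matrix. The `DokMatrix` class and its `nnz` property are hypothetical names invented for illustration; they are not the API of DOKSparse or of SciPy (SciPy's own implementation is `scipy.sparse.dok_matrix`). The sketch assumes a plain `dict` keyed by `(row, col)`, which gives average-case constant-time reads and writes and never allocates a dense array.

```python
class DokMatrix:
    """Hypothetical minimal dictionary-of-keys (DOK) 2D matrix, for illustration only."""

    def __init__(self, shape):
        self.shape = shape
        self._data = {}  # maps (row, col) -> nonzero value

    def _check(self, key):
        row, col = key
        if not (0 <= row < self.shape[0] and 0 <= col < self.shape[1]):
            raise IndexError(f"index {key} out of bounds for shape {self.shape}")

    def __setitem__(self, key, value):
        self._check(key)
        if value == 0:
            self._data.pop(key, None)  # never store explicit zeros
        else:
            self._data[key] = value

    def __getitem__(self, key):
        self._check(key)
        return self._data.get(key, 0)  # missing entries read as zero

    @property
    def nnz(self):
        """Number of stored (nonzero) entries."""
        return len(self._data)


# Build a very large matrix directly in sparse form: only two entries are stored.
m = DokMatrix((100_000, 100_000))
m[3, 7] = 1.5
m[99_999, 0] = -2.0
print(m[3, 7], m[0, 0], m.nnz)
```

Memory use here scales with the number of stored entries rather than with the matrix shape, which is the property the answer relies on when it suggests skipping the dense intermediate array.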
What are some alternatives?
highway - Performance-portable, length-agnostic SIMD with runtime dispatch
cub - [ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
Vc - SIMD Vector Classes for C++
MegBA - MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment
libsimdpp - Portable header-only C++ low level SIMD library
CUDA-Guide - CUDA Guide
nsimd - Agenium Scale vectorization library for CPUs and GPUs
cuhnsw - CUDA implementation of Hierarchical Navigable Small World Graph algorithm
FastDifferentialCoding - Fast differential coding functions (using SIMD instructions)
TorchPQ - Approximate nearest neighbor search with product quantization on GPU in pytorch and cuda
optuna - A hyperparameter optimization framework
instant-ngp - Instant neural graphics primitives: lightning fast NeRF and more