-
And here's an example of how to add two floats using Rust-CUDA: https://github.com/Rust-GPU/Rust-CUDA/blob/master/examples/cuda/gpu/add_gpu/src/lib.rs
-
If you just want to do a matrix multiplication with CUDA (and not from inside some CUDA code), you should use cuBLAS rather than CUTLASS. It is a fairly straightforward BLAS replacement, although it can be a pain to install (but that is life with C++/NVIDIA). Here is some wrapper code I wrote and the corresponding helper functions, in case your difficulty is using the library rather than linking or building it.
-
jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
If you just want to do some numerical code that requires linear algebra and GPU, your best bet would be Julia or Python+JAX.
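To give a rough idea of what the Python+JAX route looks like, here is a minimal sketch that solves a small least-squares problem through the normal equations. The function name `solve_normal_equations`, the matrix sizes, and the random test data are all made up for illustration; JAX places the arrays on a GPU when it finds one and otherwise falls back to the CPU, so the same code should run either way.

```python
import jax
import jax.numpy as jnp


@jax.jit  # compile the function with XLA for whichever backend JAX selected
def solve_normal_equations(a, b):
    """Least-squares solution of a @ x ~= b via (a^T a) x = a^T b."""
    return jnp.linalg.solve(a.T @ a, a.T @ b)


# Illustrative data only: a tall random matrix and a known solution vector.
key_a, key_x = jax.random.split(jax.random.PRNGKey(0))
a = jax.random.normal(key_a, (512, 64))
x_true = jax.random.normal(key_x, (64,))
b = a @ x_true

x = solve_normal_equations(a, b)

# Shows which devices JAX is using (GPU entries appear only if one is set up).
print(jax.devices())
```

The normal equations are used here only to keep the example short; for badly conditioned problems `jnp.linalg.lstsq` is likely the safer choice.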
-
spack
A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
Trilinos is a pain to install and get working; I recommend using Spack or a similar tool to deal with it.
-
If you do not need a GPU, then I would recommend looking into Eigen in C++ or nalgebra in Rust (with a BLAS in both cases for improved performance), or one of the above options (Julia / Python+JAX).