matrixmultiply
rust-ndarray
| | matrixmultiply | rust-ndarray |
|---|---|---|
| Mentions | 4 | 20 |
| Stars | 202 | 3,307 |
| Growth | - | 2.9% |
| Activity | 6.0 | 8.1 |
| Latest commit | about 1 month ago | 8 days ago |
| Language | Rust | Rust |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
matrixmultiply
-
Help understanding the state of ndarrays and linalg in Rust.
The matrixmultiply crate from the ndarray author (https://github.com/bluss/matrixmultiply) is one such implementation. It uses the same algorithm as the BLIS project (https://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf) to partition the problem and exploit the cache hierarchy. It isn't as well tuned as, e.g., Intel MKL or BLIS, but the results are very respectable.
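To make the partitioning idea concrete, here is a minimal sketch of a cache-blocked multiply in plain Rust. It is not the matrixmultiply crate's code: the function name `matmul_blocked` and the tile size `BLOCK` are illustrative assumptions, and the real BLIS approach additionally packs panels into contiguous buffers and uses a register-level microkernel, both omitted here.

```rust
/// Illustrative tile size (an assumption, not a tuned value).
const BLOCK: usize = 64;

/// Cache-blocked matrix multiply: C += A * B.
/// All matrices are row-major: A is m x k, B is k x n, C is m x n.
pub fn matmul_blocked(m: usize, k: usize, n: usize, a: &[f64], b: &[f64], c: &mut [f64]) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);

    // Walk the problem in tiles so that each tile of A, B and C is
    // reused while it is still likely to be resident in cache.
    for i0 in (0..m).step_by(BLOCK) {
        let i_end = (i0 + BLOCK).min(m);
        for p0 in (0..k).step_by(BLOCK) {
            let p_end = (p0 + BLOCK).min(k);
            for j0 in (0..n).step_by(BLOCK) {
                let j_end = (j0 + BLOCK).min(n);
                // Plain triple loop inside one tile.
                for i in i0..i_end {
                    for p in p0..p_end {
                        let a_ip = a[i * k + p];
                        for j in j0..j_end {
                            c[i * n + j] += a_ip * b[p * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

Calling `matmul_blocked(2, 3, 2, &a, &b, &mut c)` with a zeroed `c` computes the ordinary product; the tiling changes only the order in which memory is touched, not the result for small exact inputs.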
-
faer 0.8.0 release
Do you plan to support integers as native types? I know there is an issue open for the matrixmultiply crate about that; it seems it can be problematic because of overflow.
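The overflow concern can be illustrated with a dot product, the inner operation of a matrix multiply. This is only a sketch with hypothetical helper names (`dot_checked`, `dot_wrapping`), not an API of faer or matrixmultiply: one version reports overflow, the other wraps silently.

```rust
/// Dot product of i32 values that returns None on overflow instead of wrapping.
pub fn dot_checked(a: &[i32], b: &[i32]) -> Option<i32> {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).try_fold(0i32, |acc, (&x, &y)| {
        // Both the product and the running sum can overflow.
        x.checked_mul(y).and_then(|p| acc.checked_add(p))
    })
}

/// Same dot product with wrapping arithmetic: no checks, but the
/// result is silently wrong once any intermediate value overflows.
pub fn dot_wrapping(a: &[i32], b: &[i32]) -> i32 {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .fold(0i32, |acc, (&x, &y)| acc.wrapping_add(x.wrapping_mul(y)))
}
```

A library that supports integers has to pick one of these behaviours (or widen the accumulator type), which is a design decision floating-point kernels do not face in the same way.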
-
Faster `matrixmultiply` ?
There's a famous crate [matrixmultiply](https://github.com/bluss/matrixmultiply) for matrix-matrix multiplication in Rust. But it's a bit slow for me.
-
Nim vs Rust Benchmarks
In my benchmarks, Nim is faster than Rust:
- Multithreading runtime (i.e. Rayon vs. Weave, https://github.com/mratsim/weave)
- Cryptography: https://hackmd.io/@gnark/eccbench#Pairing
- Scientific computing / matrix multiplication: https://github.com/bluss/matrixmultiply/issues/34#issuecomme...
There is no inherent reason why a Nim program would be slower than Rust.
rust-ndarray
-
Some Reasons to Avoid Cython
I would love some examples of how to do non-trivial data interop between Rust and Python. My experience is that PyO3/Maturin is excellent when converting between simple datatypes, but conversions get difficult when there are non-standard types, e.g. Python NumPy arrays, Rust ndarrays, or whatever other custom thing.
Polars seems to have a good model: it uses the Arrow in-memory format, which has implementations in Python and Rust and makes a lot of the ndarray stuff easier. However, if the Rust libraries are not written with Arrow first, they become quite hard to work with. For example, many libraries are written with https://github.com/rust-ndarray/ndarray, which is challenging to interop with NumPy.
(I am not an expert at all, please correct me if my characterizations are wrong!)
-
Helper crate for working with image data of varying type?
Thanks for sharing. I read this issue on why ndarray does not have a dynamically typed array: https://github.com/rust-ndarray/ndarray/issues/651
-
What is the most efficient way to study Rust for scientific computing applications?
You can get involved with the ndarray project
-
faer 0.8.0 release
Sadly, ndarray does look a little abandoned to me: https://github.com/rust-ndarray/ndarray
-
Status and Future of ndarray?
The last commit to [ndarray](https://github.com/rust-ndarray/ndarray) is 6 months old, while many recent issues are open and untouched.
-
How does explicit unrolling differ from iterating through elements one-by-one? (ndarray example)
While looking through ndarray's source, I came across a set of functions that explicitly unroll 8 variables on each iteration of a loop, with the comment "eightfold unrolled so that floating point can be vectorized (even with strict floating point accuracy semantics)". I don't understand why floats would be affected by unrolling, and in general I'm confused as to how explicit unrolling differs from iterating through each element one by one. I assumed this would be a scenario where the compiler would optimize best anyway, which seems to be confirmed (at least in the context of using iter() rather than for) here. Could anyone give a little context on what this, or any explicit unrolling, achieves?
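A rough sketch of the difference, assuming a simple sum over `f32` (the function names are hypothetical and this is not ndarray's actual code). Floating-point addition is not associative, so with a single accumulator the compiler generally may not reorder the additions into SIMD lanes without changing the result. Writing eight independent accumulators makes that reordering part of the program itself, so vectorizing it no longer alters the semantics.

```rust
/// Sum with one accumulator: every addition depends on the previous one.
pub fn sum_sequential(xs: &[f32]) -> f32 {
    let mut acc = 0.0f32;
    for &x in xs {
        acc += x;
    }
    acc
}

/// Sum with eight independent accumulators, one per "lane".
pub fn sum_unrolled8(xs: &[f32]) -> f32 {
    let mut acc = [0.0f32; 8];
    let mut chunks = xs.chunks_exact(8);
    for chunk in &mut chunks {
        // The eight additions do not depend on each other, so they
        // can map onto a single SIMD add.
        for lane in 0..8 {
            acc[lane] += chunk[lane];
        }
    }
    // Combine the lanes, then handle the leftover tail elements.
    let mut total: f32 = acc.iter().sum();
    for &x in chunks.remainder() {
        total += x;
    }
    total
}
```

The two functions can return slightly different results for general inputs because the additions happen in a different order; that difference is exactly why the compiler will not perform the transformation on its own under strict floating-point rules.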
-
Announcing Burn: New Deep Learning framework with CPU & GPU support using the newly stabilized GAT feature
Burn is different: it is built around the Backend trait, which encapsulates tensor primitives. Even the reverse-mode automatic differentiation is just a backend that wraps another one using the decorator pattern. The goal is to make it very easy to create optimized backends and support different devices and use cases. For now, there are only 3 backends: NdArray (https://github.com/rust-ndarray/ndarray) for a pure Rust solution, Tch (https://github.com/LaurentMazare/tch-rs) for easy access to CUDA- and cuDNN-optimized operations, and the ADBackendDecorator, which makes any backend differentiable. I am now refactoring the internal backend API to make it as easy as possible to plug in new ones.
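The decorator idea can be sketched in a few lines. Everything below is hypothetical and much simpler than Burn's real API: the `Backend` trait here has a single primitive, `CpuBackend` stands in for a concrete backend, and `Recording` stands in for a wrapper such as an autodiff decorator by logging operations before delegating.

```rust
/// Hypothetical minimal backend: the only tensor primitive is element-wise add.
pub trait Backend {
    fn add(&mut self, a: &[f32], b: &[f32]) -> Vec<f32>;
}

/// Hypothetical concrete backend that computes on the CPU.
pub struct CpuBackend;

impl Backend for CpuBackend {
    fn add(&mut self, a: &[f32], b: &[f32]) -> Vec<f32> {
        assert_eq!(a.len(), b.len());
        a.iter().zip(b).map(|(x, y)| x + y).collect()
    }
}

/// Decorator: wraps any backend, records each operation, then delegates.
/// A real autodiff wrapper would record enough to run a backward pass.
pub struct Recording<B: Backend> {
    inner: B,
    pub tape: Vec<&'static str>,
}

impl<B: Backend> Recording<B> {
    pub fn new(inner: B) -> Self {
        Recording { inner, tape: Vec::new() }
    }
}

impl<B: Backend> Backend for Recording<B> {
    fn add(&mut self, a: &[f32], b: &[f32]) -> Vec<f32> {
        self.tape.push("add");
        self.inner.add(a, b)
    }
}
```

Because `Recording<B>` itself implements `Backend`, code written against the trait works unchanged with either the plain or the wrapped backend.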
-
Pure rust implementation for deep learning models
Looks like it's an open request
-
The Illustrated Stable Diffusion
https://github.com/rust-ndarray/ndarray/issues/281
Answer: you can’t with this crate. I implemented a dynamic n-dim solution myself, but it uses views of integer indices that get copied to a new array. Those indices point into another flattened array, which avoids duplicating possibly massive amounts of n-dimensional data; using the crate alone, copying all the array data would be unavoidable.
Ultimately I’ve had to make my own axis shifting and windowing mechanisms. But the crate is still a useful lib and a continuing effort.
While I don’t mind getting into the weeds, these kinds of side efforts can really impact context focus so it’s just something to be aware of.
-
Any efficient way of splitting vector?
In principle you're trying to convert between columnar and row-based data layouts, something that happens fairly often in data science. I bet there's some hyper-efficient SIMD magic that could be invoked for these slicing operations (and maybe the iterator solution does exactly that). Might be worth taking a look at how the relevant Rust libraries like ndarray do it.
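As a baseline for that row-to-column conversion, a minimal iterator-based sketch might look like the following. The function name `rows_to_columns` and the fixed row `width` are assumptions for illustration; this is neither the original poster's code nor how ndarray does it.

```rust
/// Split row-major records of fixed `width` into one Vec per column.
pub fn rows_to_columns<T: Copy>(data: &[T], width: usize) -> Vec<Vec<T>> {
    assert!(width > 0 && data.len() % width == 0);
    let rows = data.len() / width;
    // Pre-size each column so no reallocation happens while pushing.
    let mut cols: Vec<Vec<T>> = (0..width).map(|_| Vec::with_capacity(rows)).collect();
    for row in data.chunks_exact(width) {
        for (col, &value) in cols.iter_mut().zip(row) {
            col.push(value);
        }
    }
    cols
}
```

For example, `rows_to_columns(&[1, 2, 3, 4, 5, 6], 3)` yields three columns of two elements each. Whether the compiler vectorizes this loop depends on the element type and width, so measuring is advisable before reaching for explicit SIMD.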
What are some alternatives?
weave - A state-of-the-art multithreading runtime: message-passing based, fast, scalable, ultra-low overhead
nalgebra - Linear algebra library for Rust.
Programming-Language-Benchmarks - Yet another implementation of computer language benchmarks game
Rust-CUDA - Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.
Programming-Language-Benchmark
image - Encoding and decoding images in Rust
Graal - GraalVM compiles Java applications into native executables that start instantly, scale fast, and use fewer compute resources 🚀
neuronika - Tensors and dynamic neural networks in pure Rust.
faer-rs - Linear algebra foundation for the Rust programming language
utah - Dataframe structure and operations in Rust
matrixmultiply_mt - A Multithreaded, processor specialized, fork of the matrixmultiply crate
linfa - A Rust machine learning framework.