matrixmultiply
weave
Metric | matrixmultiply | weave
---|---|---
Mentions | 4 | 7
Stars | 202 | 519
Growth | - | -
Activity | 6.0 | 3.0
Last commit | about 2 months ago | 5 months ago
Language | Rust | Nim
License | Apache License 2.0 | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
matrixmultiply
-
Help understanding the state of ndarrays and linalg in Rust.
The matrixmultiply crate from the ndarray author (https://github.com/bluss/matrixmultiply) is one such implementation. It uses the same algorithm as the BLIS project (https://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf) to partition the problem and exploit the cache hierarchy. It isn't as well tuned as, e.g., Intel MKL or BLIS, but the results are very respectable.
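To make "partition the problem and exploit the cache hierarchy" concrete, here is a minimal sketch of cache blocking (tiling) for row-major matrices. It is not the code of matrixmultiply or BLIS, which add operand packing and SIMD microkernels on top of this idea. The function names and the `block` parameter are hypothetical and chosen only for illustration.

```rust
/// Reference implementation: plain triple loop computing C = A * B.
/// A is m x k, B is k x n, C is m x n, all stored row-major.
fn matmul_naive(a: &[f64], b: &[f64], c: &mut [f64], m: usize, k: usize, n: usize) {
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0.0;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
}

/// Cache-blocked variant computing the same C = A * B.
/// The iteration space is cut into tiles of at most `block` elements per
/// dimension (a hypothetical tuning knob), so each tile of A, B and C is
/// reused while it is still likely to be in cache.
fn matmul_blocked(
    a: &[f64],
    b: &[f64],
    c: &mut [f64],
    m: usize,
    k: usize,
    n: usize,
    block: usize,
) {
    assert!(block > 0, "block size must be positive");
    c.fill(0.0);
    for ii in (0..m).step_by(block) {
        let i_end = (ii + block).min(m);
        for pp in (0..k).step_by(block) {
            let p_end = (pp + block).min(k);
            for jj in (0..n).step_by(block) {
                let j_end = (jj + block).min(n);
                // Multiply one tile of A by one tile of B and
                // accumulate into the matching tile of C.
                for i in ii..i_end {
                    for p in pp..p_end {
                        let a_ip = a[i * k + p];
                        for j in jj..j_end {
                            c[i * n + j] += a_ip * b[p * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

Both functions take flat slices plus explicit dimensions. In a tuned library the tile sizes are derived from the cache and register sizes of the target CPU rather than passed in by hand.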
-
faer 0.8.0 release
Do you plan to support integers as native types? I know there is an issue for the crate matrixmultiply for that, it seems it can be problematic because of overflow.
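As a rough illustration of why overflow makes integer matrix multiplication awkward, the sketch below computes one dot product (the inner step of a matrix product) over `i8` values under three possible policies. The function names are hypothetical, and this is not how faer or matrixmultiply handle, or plan to handle, integers.

```rust
/// Wrap around on overflow: fast, but the result can be silently wrong.
fn dot_wrapping(a: &[i8], b: &[i8]) -> i8 {
    a.iter()
        .zip(b)
        .fold(0i8, |acc, (&x, &y)| acc.wrapping_add(x.wrapping_mul(y)))
}

/// Detect overflow: returns None as soon as a product or sum does not fit.
fn dot_checked(a: &[i8], b: &[i8]) -> Option<i8> {
    a.iter()
        .zip(b)
        .try_fold(0i8, |acc, (&x, &y)| x.checked_mul(y).and_then(|p| acc.checked_add(p)))
}

/// Widen the accumulator: exact for realistic lengths, but the output
/// element type differs from the input element type.
fn dot_widened(a: &[i8], b: &[i8]) -> i32 {
    a.iter().zip(b).map(|(&x, &y)| x as i32 * y as i32).sum()
}
```

Each policy implies a different API: wrapping keeps the element type but can return garbage, checking needs a fallible return type, and widening changes the output type. A library has to commit to one of them or expose the choice to the caller.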
-
Faster `matrixmultiply` ?
There's a famous crate [matrixmultiply](https://github.com/bluss/matrixmultiply) for matrix-matrix multiplication in Rust. But it's a bit slow for me.
-
Nim vs Rust Benchmarks
In my benchmarks, Nim is faster than Rust:
- Multithreading runtime (i.e. Rayon vs. Weave: https://github.com/mratsim/weave)
- Cryptography: https://hackmd.io/@gnark/eccbench#Pairing
- Scientific computing / matrix multiplication: https://github.com/bluss/matrixmultiply/issues/34#issuecomme...
There is no inherent reason why a Nim program would be slower than Rust.
weave
- The GIL can now be disabled in Python's main branch
-
Maybe Everything Is a Coroutine
GPU drivers provide an event system:
- CUDA: https://github.com/mratsim/weave/issues/133
-
Benchmarking 20 programming languages on N-queens and matrix multiplication
Note: the theoretical peak limit is hardcoded for my previous machine, an i9-9980XE.
It may be that your BLAS library is not named libopenblas.so; you can change that here: https://github.com/mratsim/laser/blob/master/benchmarks/thir...
Implementation is in this folder: https://github.com/mratsim/laser/tree/master/laser/primitive...
in particular, tiling, cache and register optimization: https://github.com/mratsim/laser/blob/master/laser/primitive...
AVX512 code generator: https://github.com/mratsim/laser/blob/master/laser/primitive...
And a generic Scalar/SSE/AVX/AVX2/AVX512 microkernel generator (these are Nim macros that generate code at compile time): https://github.com/mratsim/laser/blob/master/laser/primitive...
I'll come back later with details on how to use my custom HPC threadpool Weave instead of OpenMP (https://github.com/mratsim/weave/tree/master/benchmarks/matm...)
-
Nim vs Rust Benchmarks
In my benchmarks, Nim is faster than Rust:
- Multithreading runtime (i.e. Rayon vs. Weave: https://github.com/mratsim/weave)
- Cryptography: https://hackmd.io/@gnark/eccbench#Pairing
- Scientific computing / matrix multiplication: https://github.com/bluss/matrixmultiply/issues/34#issuecomme...
There is no inherent reason why a Nim program would be slower than Rust.
-
Aren't green threads just better than async/await?
If you're interested in diving into this, I have reviewed solutions to cactus stacks / split stacks here: https://github.com/mratsim/weave/blob/master/weave/memory/multithreaded_memory_management.md
-
Nim 2.0 – Thoughts
[4] https://github.com/mratsim/weave
What are some alternatives?
rust-ndarray - ndarray: an N-dimensional array with array views, multidimensional slicing, and efficient operations
eioio - Effects-based direct-style IO for multicore OCaml
Programming-Language-Benchmarks - Yet another implementation of computer language benchmarks game
httpbeast - A highly performant, multi-threaded HTTP 1.1 server written in Nim.
Programming-Language-Benchmark
Edith - Electronic Design in Swithft
Graal - GraalVM compiles Java applications into native executables that start instantly, scale fast, and use fewer compute resources 🚀
ocaml-multicore - Multicore OCaml
faer-rs - Linear algebra foundation for the Rust programming language
cosmopolitan - build-once run-anywhere c library
matrixmultiply_mt - A Multithreaded, processor specialized, fork of the matrixmultiply crate
roast - 🦋 Raku test suite