slas vs blis
               slas                 blis
Mentions       2                    9
Stars          33                   1,380
Growth         -                    2.2%
Activity       9.1                  8.3
Last commit    5 months ago         about 18 hours ago
Language       Rust                 C
License        Apache License 2.0   GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
slas

What's everyone working on this week (4/2022)?
Been working on a linear algebra system designed to be fast when working with statically shaped data. It's called slas and is on version 0.2.1.
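To make "statically shaped" concrete, here is a minimal sketch in plain Rust using const generics. It does not use the slas API; the `StaticVec` type and its `dot` method are hypothetical names made up for illustration. The point is only that when the length is part of the type, mismatched shapes are rejected at compile time and the loop bound is known to the compiler.

```rust
/// Hypothetical fixed-length vector: the length N is part of the type.
/// This is an illustration only, not the slas API.
#[derive(Debug, Clone, Copy, PartialEq)]
struct StaticVec<const N: usize>([f32; N]);

impl<const N: usize> StaticVec<N> {
    /// Dot product of two vectors of the same static length.
    /// Passing a vector of a different length is a type error.
    fn dot(&self, other: &StaticVec<N>) -> f32 {
        self.0.iter().zip(other.0.iter()).map(|(a, b)| a * b).sum()
    }
}

fn main() {
    let a = StaticVec([1.0, 2.0, 3.0]);
    let b = StaticVec([4.0, 5.0, 6.0]);
    // 1*4 + 2*5 + 3*6
    println!("{}", a.dot(&b));
    // A call such as a.dot(&StaticVec([1.0, 2.0])) would fail to compile,
    // because StaticVec<2> is a different type from StaticVec<3>.
}
```

Because N is a compile-time constant, the compiler is free to unroll or vectorize the loop for each concrete length, which is presumably the kind of benefit a statically shaped design is after.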
blis

Small Neural networks in Julia 5x faster than PyTorch
The article asks "Which Micro-optimizations matter for BLAS3?", implying small dimensions, but doesn't actually tell me. The problem is well-studied, depending on what you consider "small". The most important thing is to avoid the packing step below an appropriate threshold. Implementations include libxsmm, blasfeo, and the "sup" version in blis (with papers on libxsmm and blasfeo). Eigen might also be relevant.
 Eigen: A C++ template library for linear algebra
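The idea of skipping the packing step for small problems can be sketched roughly as below. This is not how libxsmm, blasfeo or blis implement it; the function names and the `SMALL_DIM_THRESHOLD` value are assumptions chosen for illustration, and "packing" is simplified here to copying B into a transposed contiguous buffer.

```rust
/// Hypothetical cutoff; real libraries tune this per architecture and kernel.
const SMALL_DIM_THRESHOLD: usize = 16;

/// C += A * B on row-major data (A is m x k, B is k x n, C is m x n).
/// Works directly on the caller's buffers, with no packing.
fn gemm_unpacked(m: usize, n: usize, k: usize, a: &[f64], b: &[f64], c: &mut [f64]) {
    for i in 0..m {
        for p in 0..k {
            let aip = a[i * k + p];
            for j in 0..n {
                c[i * n + j] += aip * b[p * n + j];
            }
        }
    }
}

/// Same product, but B is first copied ("packed") into a transposed buffer
/// so that each dot product reads contiguous memory. The copy costs O(k*n),
/// which only pays off once the matrices are large enough.
fn gemm_packed(m: usize, n: usize, k: usize, a: &[f64], b: &[f64], c: &mut [f64]) {
    let mut packed_b = vec![0.0; k * n];
    for p in 0..k {
        for j in 0..n {
            packed_b[j * k + p] = b[p * n + j];
        }
    }
    for i in 0..m {
        let row = &a[i * k..(i + 1) * k];
        for j in 0..n {
            let col = &packed_b[j * k..(j + 1) * k];
            c[i * n + j] += row.iter().zip(col).map(|(x, y)| x * y).sum::<f64>();
        }
    }
}

/// Dispatch: below the threshold, avoid the packing overhead entirely.
fn gemm(m: usize, n: usize, k: usize, a: &[f64], b: &[f64], c: &mut [f64]) {
    if m.max(n).max(k) <= SMALL_DIM_THRESHOLD {
        gemm_unpacked(m, n, k, a, b, c);
    } else {
        gemm_packed(m, n, k, a, b, c);
    }
}

fn main() {
    let a = [1.0, 2.0, 3.0, 4.0];
    let b = [5.0, 6.0, 7.0, 8.0];
    let mut c = [0.0; 4];
    gemm(2, 2, 2, &a, &b, &mut c); // takes the unpacked path
    println!("{:?}", c);
}
```

Where the crossover actually sits depends on cache sizes and the kernel in use, so a threshold like this would need to be measured rather than guessed.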

Matrix Multiplication Inches Closer To Mythic Goal
However, on recent CPUs 4x4 is small for the innermost block size of the nontrivial hierarchy you need. You can see examples under https://github.com/flame/blis/tree/master/config with an a priori procedure for determining them in https://www.cs.utexas.edu/users/flame/pubs/TOMSBLISAnalyti... (but compare with what's actually used for SKX, in particular). OpenBLAS will normally be similar, though it may come out somewhat faster, but it's easier to see in BLIS.
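As a rough illustration of what an "innermost block size" means, the sketch below computes C in MR x NR tiles, keeping each tile in a small local accumulator array before writing it back. The values MR = 6 and NR = 8 are assumptions picked only to show a tile larger than 4x4; they are not taken from any particular BLIS config, and a real kernel would use SIMD registers and packed operands rather than plain loops.

```rust
/// Illustrative register-block sizes (assumed, not from a real config).
const MR: usize = 6;
const NR: usize = 8;

/// C += A * B on row-major data (A is m x k, B is k x n, C is m x n),
/// computed one MR x NR tile of C at a time.
fn gemm_tiled(m: usize, n: usize, k: usize, a: &[f64], b: &[f64], c: &mut [f64]) {
    for i0 in (0..m).step_by(MR) {
        let mr = MR.min(m - i0); // edge tiles may be shorter
        for j0 in (0..n).step_by(NR) {
            let nr = NR.min(n - j0); // edge tiles may be narrower
            // Accumulator for one tile; small enough to stay in registers
            // or L1 in an optimized kernel.
            let mut acc = [[0.0f64; NR]; MR];
            for p in 0..k {
                for i in 0..mr {
                    let aip = a[(i0 + i) * k + p];
                    for j in 0..nr {
                        acc[i][j] += aip * b[p * n + j0 + j];
                    }
                }
            }
            // Write the finished tile back to C once.
            for i in 0..mr {
                for j in 0..nr {
                    c[(i0 + i) * n + j0 + j] += acc[i][j];
                }
            }
        }
    }
}

fn main() {
    let (m, n, k) = (7, 9, 5);
    let a: Vec<f64> = (0..m * k).map(|x| x as f64).collect();
    let b: Vec<f64> = (0..k * n).map(|x| (x % 7) as f64).collect();
    let mut c = vec![0.0; m * n];
    gemm_tiled(m, n, k, &a, &b, &mut c);
    println!("c[0] = {}", c[0]);
}
```

A full implementation would wrap this tile loop in further cache-level blocking, which is the hierarchy the comment refers to.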
What are some alternatives?
tiny-cuda-nn - Lightning fast C++/CUDA neural network framework
vectorflow
how-to-optimize-gemm
blasfeo - Basic linear algebra subroutines for embedded optimization
abseil-cpp - Abseil Common Libraries (C++)
rofi-pass - rofi frontend for pass
sundials - Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.
diffrax - Numerical differential equation solvers in JAX. Autodifferentiable and GPU-capable. https://docs.kidger.site/diffrax/
juliaup - Julia installer and version multiplexer
neanderthal - Fast Clojure Matrix Library
DirectXMath - DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
laser - The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers