blis vs sundials

blis

BLAS-like Library Instantiation Software Framework (by flame)

sundials

Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes. (by LLNL)

ode-solver dae-solver nonlinear-equation-solver sensitivity-analysis time-integration Scientific Computing parallel-computing HPC math-physics radiuss Solver high-performance-computing

computing.llnl.gov

Project	blis	sundials
Mentions	16	1
Stars	2,073	449
Growth	4.1%	2.0%
Activity	7.1	8.6
Latest Commit	8 days ago	6 days ago
Language	C	C
License	GNU General Public License v3.0 or later	BSD 3-clause "New" or "Revised" License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

blis

Posts with mentions or reviews of blis. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-02-28.

Optimize sgemm on RISC-V platform
6 projects | news.ycombinator.com | 28 Feb 2024

There is a recent update to the blis alternative to BLAS that includes a number of RISC-V performance optimizations.
https://github.com/flame/blis/pull/737

BLIS: Portable basis for high-performance BLAS-like linear algebra libs
2 projects | news.ycombinator.com | 24 Jan 2024

https://github.com/flame/blis/blob/master/docs/Performance.m...
It seems that the selling point is that BLIS does multi-core quite well. I am especially impressed that it does as well as Intel's highly optimized MKL on Intel's own CPUs.
I do not see the selling point of BLIS-specific APIs, though. The whole point of having an open BLAS API standard is that numerical libraries should be drop-in replaceable, so when a new library (such as BLIS here) comes along, one could just re-link the library and reap the performance gain immediately.
What is interesting is that numerical algebra work, by nature, is mostly embarrassingly parallel, so it should not be too difficult to write multi-core implementations. And yet, BLIS here performs so much better than some other industry-leading implementations on multi-core configurations. So the question is not why BLIS does so well; the question is why some other implementations do so poorly.
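
To make the "embarrassingly parallel" remark concrete, here is a minimal sketch in pure Python. It only illustrates that each block of rows of the product can be computed independently; it is not how BLIS or any other library threads its kernels, and the function names and the `workers` parameter are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_rows(a, b, row_start, row_stop):
    """Compute rows [row_start, row_stop) of a @ b; rows do not depend on each other."""
    n_inner, n_cols = len(b), len(b[0])
    return [
        [sum(a[i][p] * b[p][j] for p in range(n_inner)) for j in range(n_cols)]
        for i in range(row_start, row_stop)
    ]


def parallel_matmul(a, b, workers=4):
    """Split the result rows into contiguous chunks and submit one task per chunk."""
    m = len(a)
    chunk = max(1, -(-m // workers))  # ceiling division, at least one row per task
    bounds = [(start, min(start + chunk, m)) for start in range(0, m, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves the order of bounds, so the chunks come back in row order
        parts = list(pool.map(lambda rng: matmul_rows(a, b, *rng), bounds))
    return [row for part in parts for row in part]
```

In CPython the threads above will not run the arithmetic concurrently because of the global interpreter lock, so this shows only the decomposition. The hard part in practice, and plausibly where implementations differ, is keeping every core fed from cache while the work is split this way.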

Benchmarking 20 programming languages on N-queens and matrix multiplication
15 projects | news.ycombinator.com | 2 Jan 2024

First we can use Laser, which was my initial BLAS experiment in 2019. At the time in particular, OpenBLAS didn't properly use the AVX512 VPUs (see the thread in BLIS: https://github.com/flame/blis/issues/352). It has made progress since then; still, on my current laptop, perf is in the same range.
Reproduction:

The Art of High Performance Computing
4 projects | news.ycombinator.com | 30 Dec 2023

https://github.com/flame/blis/
Field et al, recent winners of the James H. Wilkinson Prize for Numerical Software.
Field and Goto both worked with Robert van de Geijn. Lots of TACC interaction in that broader team.

[D] Which BLAS library to choose for apple silicon?
2 projects | /r/MachineLearning | 24 May 2023

BLIS is fine too~ https://github.com/flame/blis

Small Neural networks in Julia 5x faster than PyTorch
8 projects | news.ycombinator.com | 14 Apr 2022

The article asks "Which Micro-optimizations matter for BLAS3?", implying small dimensions, but doesn't actually tell me. The problem is well-studied, depending on what you consider "small". The most important thing is to avoid the packing step below an appropriate threshold. Implementations include libxsmm, blasfeo, and the "sup" version in blis (with papers on libxsmm and blasfeo). Eigen might also be relevant.
https://libxsmm.readthedocs.io/
https://blasfeo.syscop.de/
https://github.com/flame/blis
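
The point about skipping the packing step for small problems can be sketched as a simple dispatch. This is an illustration under stated assumptions, not the logic of libxsmm, blasfeo, or blis: the cutoff `SMALL_THRESHOLD` is a made-up value, and `pack_b_columns` is a hypothetical stand-in for real packing, which copies operands into kernel-specific panel layouts.

```python
SMALL_THRESHOLD = 32  # hypothetical cutoff; real libraries tune this per architecture


def pack_b_columns(b):
    """Copy B so each column is contiguous (a toy stand-in for real packing)."""
    return [list(col) for col in zip(*b)]


def gemm_unpacked(a, b):
    """Multiply directly from the original storage; no extra copy of B is made."""
    n_inner, n_cols = len(b), len(b[0])
    return [
        [sum(row[p] * b[p][j] for p in range(n_inner)) for j in range(n_cols)]
        for row in a
    ]


def gemm_packed(a, b):
    """Pay for one copy of B up front, then reuse the packed columns for every row of A."""
    packed = pack_b_columns(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in packed] for row in a]


def gemm(a, b, threshold=SMALL_THRESHOLD):
    """Skip packing when every dimension is below the threshold."""
    m, k, n = len(a), len(b), len(b[0])
    if max(m, k, n) < threshold:
        return gemm_unpacked(a, b)
    return gemm_packed(a, b)
```

For example, `gemm([[1, 2], [3, 4]], [[5, 6], [7, 8]])` takes the unpacked path, while passing `threshold=0` forces the packed path; both return the same product. The copy only pays off when the packed data is reused often enough, which is why small problems tend to skip it.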

Eigen: A C++ template library for linear algebra
6 projects | news.ycombinator.com | 4 Apr 2022

Matrix Multiplication Inches Closer To Mythic Goal
2 projects | news.ycombinator.com | 18 Dec 2021

However, on recent CPUs 4x4 is small for the innermost block size of the non-trivial hierarchy you need. You can see examples under https://github.com/flame/blis/tree/master/config with an a priori procedure for determining them in https://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analyti... (but compare with what's actually used for SKX, in particular). OpenBLAS will normally be similar, though it may come out somewhat faster, but it's easier to see in BLIS.
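
The "innermost block size of the non-trivial hierarchy" can be pictured with a toy blocked GEMM: three outer loops tile the problem for the cache levels, two inner loops walk over small register-sized tiles, and a micro-kernel updates one tile. The sketch below is pure Python with arbitrary toy block sizes (`MR`, `NR`, `mc`, `kc`, `nc` are illustrative values, not taken from any BLIS configuration), and it omits the packing and vectorization a real implementation relies on.

```python
MR, NR = 4, 4  # toy micro-tile size; real values depend on the CPU's registers


def micro_kernel(a, b, c, rows, cols, depth):
    """Update one small tile of C with a sequence of rank-1 updates."""
    for p in depth:
        for i in rows:
            a_ip = a[i][p]
            for j in cols:
                c[i][j] += a_ip * b[p][j]


def blocked_gemm(a, b, mc=8, kc=8, nc=8):
    """Compute a @ b using cache-style blocking around a small micro-kernel."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for jc in range(0, n, nc):              # column panels of B and C
        j_end = min(jc + nc, n)
        for pc in range(0, k, kc):          # slices of the inner dimension
            depth = range(pc, min(pc + kc, k))
            for ic in range(0, m, mc):      # row panels of A and C
                i_end = min(ic + mc, m)
                for jr in range(jc, j_end, NR):      # micro-tile columns
                    cols = range(jr, min(jr + NR, j_end))
                    for ir in range(ic, i_end, MR):  # micro-tile rows
                        rows = range(ir, min(ir + MR, i_end))
                        micro_kernel(a, b, c, rows, cols, depth)
    return c
```

The `min(...)` bounds handle edge tiles when a dimension is not a multiple of the block size. Changing `MR` and `NR` changes only the innermost tile, which is the parameter the comment above argues is too small at 4x4 on recent CPUs.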

sundials

Posts with mentions or reviews of sundials. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-02-18.

Scientific computing in Cpp
2 projects | /r/cpp | 18 Feb 2021

What are some alternatives?

When comparing blis and sundials you can also consider the following projects:

tiny-cuda-nn - Lightning fast C++/CUDA neural network framework

Fastor - A lightweight high performance tensor algebra framework for modern C++

DAGSfM - Distributed and Graph-based Structure from Motion. This project includes the official implementation of our Pattern Recognition 2020 paper: Graph-Based Parallel Large Scale Structure from Motion.

vectorflow

how-to-optimize-gemm

DirectXMath - DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps

xtensor - C++ tensors with broadcasting and lazy computing

diffrax - Numerical differential equation solvers in JAX. Autodifferentiable and GPU-capable. https://docs.kidger.site/diffrax/

blasfeo - Basic linear algebra subroutines for embedded optimization

LeNetTorch - PyTorch implementation of LeNet for fitting MNIST for benchmarking.

slas - Static Linear Algebra System

juliaup - Julia installer and version multiplexer
