Abseil Common Libraries (C++)
This thought has been churning around in my mind for some years now: we focus too much on processing speed and reductions in time complexity, and not enough on increasing the size and efficiency of our cache and stack.
MM (especially MM on large number types, e.g. in hashing algorithms) is very reliant on the cache because you can’t always fit that big of a number into a register. Side note: I was reading some Abseil code last night that did some funky bit twiddling on ARM: https://github.com/abseil/abseil-cpp/blob/master/absl/hash/i...
Off the top of my head, isn’t it on the order of 100 ns (hundreds of CPU cycles) to request, transfer over the bus, and read something from main memory? Just a thought: perhaps the cache and memory are where we should focus our efforts.
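To make the "number too big for a register" point concrete, here is a minimal sketch of a full 64x64 -> 128-bit multiply built from 32-bit halves, followed by a fold of the two result words. This is not the Abseil code linked above; `U128`, `Mul64x64`, and `FoldMultiply` are hypothetical names chosen for illustration, and the fold is only in the spirit of multiply-based hash mixers.

```cpp
#include <cstdint>

// Hypothetical helper type: the two 64-bit halves of a 128-bit product.
struct U128 {
  std::uint64_t hi;
  std::uint64_t lo;
};

// Multiplies two 64-bit values into a full 128-bit result using only
// 64-bit arithmetic, by splitting each operand into 32-bit halves.
inline U128 Mul64x64(std::uint64_t a, std::uint64_t b) {
  const std::uint64_t mask = 0xFFFFFFFFull;
  const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
  const std::uint64_t b_lo = b & mask, b_hi = b >> 32;

  const std::uint64_t p0 = a_lo * b_lo;  // weight 2^0
  const std::uint64_t p1 = a_lo * b_hi;  // weight 2^32
  const std::uint64_t p2 = a_hi * b_lo;  // weight 2^32
  const std::uint64_t p3 = a_hi * b_hi;  // weight 2^64

  // Middle column: three values below 2^32, so the sum fits in 64 bits.
  const std::uint64_t mid = (p0 >> 32) + (p1 & mask) + (p2 & mask);

  U128 r;
  r.lo = (mid << 32) | (p0 & mask);
  r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
  return r;
}

// Assumed mixing step for illustration: XOR the high and low words so
// that every input bit can influence the 64-bit output.
inline std::uint64_t FoldMultiply(std::uint64_t a, std::uint64_t b) {
  const U128 p = Mul64x64(a, b);
  return p.hi ^ p.lo;
}
```

On compilers that provide `unsigned __int128` (GCC and Clang on 64-bit targets), the same product can be written as a single widening multiply; the split version above is the portable fallback.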
BLAS-like Library Instantiation Software Framework
However, on recent CPUs 4x4 is small for the innermost block size of the non-trivial hierarchy you need. You can see examples under https://github.com/flame/blis/tree/master/config with an a priori procedure for determining them in https://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analyti... (but compare with what's actually used for SKX, in particular). OpenBLAS will normally be similar, though it may come out somewhat faster, but it's easier to see in BLIS.
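As a rough illustration of the blocking hierarchy being discussed, the sketch below wraps a small register-block micro-kernel in five cache-blocking loops, following the general loop structure BLIS describes. The block sizes (`kMC`, `kKC`, `kNC`, `kMR`, `kNR`) are placeholder assumptions, not values from any BLIS config, and the sketch omits the operand packing and SIMD kernels that real implementations rely on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative block sizes only; real values are tuned per architecture.
constexpr std::size_t kMR = 4, kNR = 4;                // register block
constexpr std::size_t kMC = 64, kKC = 64, kNC = 256;   // cache blocks

// Updates an mr x nr tile of C with the product of an mr x kc panel of A
// and a kc x nr panel of B. All matrices are row-major.
inline void MicroKernel(const double* a, const double* b, double* c,
                        std::size_t lda, std::size_t ldb, std::size_t ldc,
                        std::size_t mr, std::size_t nr, std::size_t kc) {
  double acc[kMR][kNR] = {};  // accumulators a compiler can keep in registers
  for (std::size_t p = 0; p < kc; ++p)
    for (std::size_t i = 0; i < mr; ++i)
      for (std::size_t j = 0; j < nr; ++j)
        acc[i][j] += a[i * lda + p] * b[p * ldb + j];
  for (std::size_t i = 0; i < mr; ++i)
    for (std::size_t j = 0; j < nr; ++j)
      c[i * ldc + j] += acc[i][j];
}

// C (m x n) += A (m x k) * B (k x n), row-major, with cache blocking.
void BlockedGemm(std::size_t m, std::size_t n, std::size_t k,
                 const std::vector<double>& A, const std::vector<double>& B,
                 std::vector<double>& C) {
  assert(A.size() == m * k && B.size() == k * n && C.size() == m * n);
  for (std::size_t jc = 0; jc < n; jc += kNC) {
    const std::size_t nc = std::min(kNC, n - jc);
    for (std::size_t pc = 0; pc < k; pc += kKC) {
      const std::size_t kc = std::min(kKC, k - pc);
      for (std::size_t ic = 0; ic < m; ic += kMC) {
        const std::size_t mc = std::min(kMC, m - ic);
        for (std::size_t jr = 0; jr < nc; jr += kNR)
          for (std::size_t ir = 0; ir < mc; ir += kMR)
            MicroKernel(&A[(ic + ir) * k + pc], &B[pc * n + jc + jr],
                        &C[(ic + ir) * n + jc + jr], k, n, n,
                        std::min(kMR, mc - ir), std::min(kNR, nc - jr), kc);
      }
    }
  }
}

// Straightforward triple loop, included only as a reference for checking.
void NaiveGemm(std::size_t m, std::size_t n, std::size_t k,
               const std::vector<double>& A, const std::vector<double>& B,
               std::vector<double>& C) {
  for (std::size_t i = 0; i < m; ++i)
    for (std::size_t j = 0; j < n; ++j)
      for (std::size_t p = 0; p < k; ++p)
        C[i * n + j] += A[i * k + p] * B[p * n + j];
}
```

Making `kMR` and `kNR` larger increases the reuse of each loaded element of A and B inside the micro-kernel, which is why a 4x4 tile can leave performance on the table on CPUs with wide vector registers.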
Column Vectors vs. Row Vectors
1 project | news.ycombinator.com | 27 Oct 2022
BLIS: Portable software framework for high-performance linear algebra
1 project | news.ycombinator.com | 17 Aug 2022
BLAS-Like Library Instantiation Software Framework
1 project | news.ycombinator.com | 4 Jan 2022
Terrible Scaling on AMD Epyc 7662
1 project | reddit.com/r/HPC | 14 Apr 2021
Best usage for optional compilation of accelerating instructions ?
1 project | reddit.com/r/C_Programming | 3 Apr 2021