Matrix Multiplication Inches Closer To Mythic Goal

This page summarizes the projects mentioned and recommended in the original post on

  • abseil-cpp

    Abseil Common Libraries (C++)

    This thought has been churning around in my mind for some years now: we focus too much on processing speed and reductions in time complexity, and not enough on increasing the size and efficiency of our cache and stack.

    MM (especially MM on large numeric types, e.g. in hashing algorithms) is very reliant on the cache because you can’t always fit that big of a number into a register. Side note, I was reading some Abseil code last night that did some funky bit twiddling on ARM:

    Off the top of my head, isn’t it roughly 100 ns, i.e. a few hundred cycles, to query, bus, and read something from memory? Just a thought, perhaps the cache and memory are where we should focus our efforts.

  • blis

    BLAS-like Library Instantiation Software Framework

    However, on recent CPUs 4x4 is small for the innermost block size of the non-trivial hierarchy you need. You can see examples under with an a priori procedure for determining them in (but compare with what's actually used for SKX, in particular). OpenBLAS will normally be similar, though it may come out somewhat faster, but it's easier to see in BLIS.

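To make the blocking hierarchy from the blis comment a bit more concrete, here is a minimal sketch of a cache-blocked multiply with a small register tile as the innermost block. It is not BLIS or OpenBLAS code: the function names and the block sizes (kMC, kKC, kNC, kMR, kNR) are hypothetical values picked for illustration, and the sketch omits the panel packing and hand-written SIMD micro-kernels that tuned libraries rely on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block sizes, for illustration only. Real libraries tune
// these per microarchitecture.
constexpr std::size_t kMC = 64;  // rows of A per cache block
constexpr std::size_t kKC = 64;  // shared-dimension depth per cache block
constexpr std::size_t kNC = 64;  // columns of B per cache block
constexpr std::size_t kMR = 4;   // micro-tile rows (accumulated in locals)
constexpr std::size_t kNR = 4;   // micro-tile columns

// All matrices are row-major: A is m x k, B is k x n, C is m x n.
// Both functions compute C += A * B.
void matmul_naive(const std::vector<double>& A, const std::vector<double>& B,
                  std::vector<double>& C,
                  std::size_t m, std::size_t k, std::size_t n) {
  assert(A.size() == m * k && B.size() == k * n && C.size() == m * n);
  for (std::size_t i = 0; i < m; ++i)
    for (std::size_t p = 0; p < k; ++p)
      for (std::size_t j = 0; j < n; ++j)
        C[i * n + j] += A[i * k + p] * B[p * n + j];
}

void matmul_blocked(const std::vector<double>& A, const std::vector<double>& B,
                    std::vector<double>& C,
                    std::size_t m, std::size_t k, std::size_t n) {
  assert(A.size() == m * k && B.size() == k * n && C.size() == m * n);
  // Outer three loops: walk over cache-sized blocks.
  for (std::size_t jc = 0; jc < n; jc += kNC) {
    const std::size_t jend = std::min(jc + kNC, n);
    for (std::size_t pc = 0; pc < k; pc += kKC) {
      const std::size_t pend = std::min(pc + kKC, k);
      for (std::size_t ic = 0; ic < m; ic += kMC) {
        const std::size_t iend = std::min(ic + kMC, m);
        // Inner loops: walk over kMR x kNR micro-tiles inside the block.
        for (std::size_t jr = jc; jr < jend; jr += kNR) {
          const std::size_t nr = std::min(kNR, jend - jr);
          for (std::size_t ir = ic; ir < iend; ir += kMR) {
            const std::size_t mr = std::min(kMR, iend - ir);
            // Micro-kernel: accumulate the tile in a small local array,
            // then write it back to C once.
            double acc[kMR][kNR] = {};
            for (std::size_t p = pc; p < pend; ++p)
              for (std::size_t i = 0; i < mr; ++i)
                for (std::size_t j = 0; j < nr; ++j)
                  acc[i][j] += A[(ir + i) * k + p] * B[p * n + (jr + j)];
            for (std::size_t i = 0; i < mr; ++i)
              for (std::size_t j = 0; j < nr; ++j)
                C[(ir + i) * n + (jr + j)] += acc[i][j];
          }
        }
      }
    }
  }
}
```

The 4x4 micro-tile here matches the size the comment calls small for recent CPUs; in this sketch, widening kMR and kNR is just a matter of changing the constants, though whether that helps depends on how many registers the compiler can actually keep the accumulators in.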
