Of course! I appreciate all the time you put in. I added a few more optimizations to qsort after that (see https://github.com/intel/x86-simd-sort/pull/33) and wanted to know whether your analysis took them into account.
Cross-posting from https://news.ycombinator.com/item?id=36273544:
We looked into this [1] and concluded:
- Throttling is a non-issue on Xeon Gold/Platinum.
- AVX-512 startup overhead can hurt on Skylake, but AVX-512 is still a net win for data sizes >= 100 KiB.
- Startup overhead is a non-issue on Ice Lake and AMD Zen 4.
1: https://github.com/google/highway/blob/master/hwy/contrib/so...