Optimizing compilers reload vector constants needlessly


compiler-explorer

190 15,138 9.9 TypeScript

Run compilers interactively from your web browser and interact with the assembly

Proving what you claim should be part of these discussions. Always. With code samples fed directly to the compiler, and the corresponding assembly.
https://godbolt.org/
Statistics alone are worthless; in the end, all that counts is the arena of performance: what the code becomes and how it runs against the handcrafted version.

OpenBLAS

22 5,952 9.8 C

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
std-simd

9 544 1.1 C++

std::experimental::simd for GCC [ISO/IEC TS 19570:2018]

Bad news: for SIMD there are no cross-platform intrinsics. Intel intrinsics map directly to SSE/AVX instructions, and ARM intrinsics map directly to NEON instructions.
For cross-platform code, your best bet is probably https://github.com/VcDevel/std-simd
There's https://eigen.tuxfamily.org/index.php?title=Main_Page but it's tremendously complicated for anything other than large-scale linear algebra.
And there's https://github.com/microsoft/DirectXMath but it has obvious biases :P
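
To make the point about platform-specific intrinsics concrete, here is a minimal sketch of what hand-written portability tends to look like without a wrapper library: one code path per instruction set, selected by the preprocessor. The helper name add4 is hypothetical and not taken from any of the libraries above; the intrinsics themselves (_mm_add_ps, vaddq_f32 and friends) are the standard SSE and NEON ones.

```cpp
#include <cstddef>

#if defined(__SSE2__) || defined(_M_X64)
  #include <emmintrin.h>   // x86 SSE/SSE2 intrinsics
#elif defined(__ARM_NEON)
  #include <arm_neon.h>    // ARM NEON intrinsics
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for i in [0, 4).
// Each branch does the same work with a different, non-portable intrinsic set.
inline void add4(const float* a, const float* b, float* out) {
#if defined(__SSE2__) || defined(_M_X64)
    // x86: unaligned load, packed single-precision add, unaligned store.
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#elif defined(__ARM_NEON)
    // ARM: the equivalent NEON load, add, and store.
    vst1q_f32(out, vaddq_f32(vld1q_f32(a), vld1q_f32(b)));
#else
    // Scalar fallback for any other target.
    for (std::size_t i = 0; i < 4; ++i) out[i] = a[i] + b[i];
#endif
}
```

Calling add4(a, b, out) on three four-element float arrays gives the same result on every target, but each new instruction set means another branch to write and test, which is the maintenance cost the libraries mentioned here try to absorb.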

DirectXMath

13 1,481 6.8 C++

DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps


FFmpeg

485 42,374 10.0 C

Mirror of https://git.ffmpeg.org/ffmpeg.git
highway

66 3,645 9.8 C++

Performance-portable, length-agnostic SIMD with runtime dispatch

__builtin_shufflevector requires a known vector length, and its output can be pessimized: the compiler may fuse two shuffles into one general all-to-all permute, which is more expensive than two simple shuffles.
Also, vqsort (https://github.com/google/highway/tree/master/hwy/contrib/so...) almost entirely consists of
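
As a rough illustration of the __builtin_shufflevector remark, the sketch below uses GCC/Clang vector extensions. It assumes a GNU-compatible compiler; the builtin itself is available in Clang and in GCC 12 or newer, so a scalar fallback is included for older GCC. The names v4i, interleave_lo, reverse_lanes, and HAVE_SHUFFLEVECTOR are hypothetical and chosen only for this example.

```cpp
// Detect __builtin_shufflevector (Clang, GCC 12+). HAVE_SHUFFLEVECTOR is a
// hypothetical macro name used only in this sketch.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_shufflevector)
#    define HAVE_SHUFFLEVECTOR 1
#  endif
#endif

// The vector length is fixed in the type: 4 lanes of 32-bit int (16 bytes).
typedef int v4i __attribute__((vector_size(16)));

// Interleave the low halves of a and b. Indices 0..3 pick lanes of a,
// 4..7 pick lanes of b, and they must be compile-time integer constants.
inline v4i interleave_lo(v4i a, v4i b) {
#ifdef HAVE_SHUFFLEVECTOR
    return __builtin_shufflevector(a, b, 0, 4, 1, 5);
#else
    v4i r = {a[0], b[0], a[1], b[1]};  // lane-by-lane fallback
    return r;
#endif
}

// Reverse the lane order of v.
inline v4i reverse_lanes(v4i v) {
#ifdef HAVE_SHUFFLEVECTOR
    return __builtin_shufflevector(v, v, 3, 2, 1, 0);
#else
    v4i r = {v[3], v[2], v[1], v[0]};  // lane-by-lane fallback
    return r;
#endif
}
```

Because both the lane count and the indices are baked in at compile time, this style does not carry over to length-agnostic targets. A composition such as reverse_lanes(interleave_lo(a, b)) is also the kind of back-to-back shuffle that an optimizer is free to merge into a single permute, which may or may not be cheaper on a given target.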

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Vc 1.4.2 released: portable SIMD programming for C++
3 projects | /r/cpp | 23 Jun 2021
The Case of the Missing SIMD Code
7 projects | news.ycombinator.com | 8 Jun 2023
Similarity Measures on Arm SVE and NEON, x86 AVX2 and AVX-512
2 projects | /r/simd | 25 Mar 2023
Portable SIMD library
3 projects | /r/C_Programming | 15 Nov 2022
Where do people learn to write truly quick software?
2 projects | /r/rust | 2 Nov 2021

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Tags: SIMD, NEON, AVX, SSE, AVX2
Post date: 6 Dec 2022
