-
zig
General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
This has been mentioned before:
https://github.com/ziglang/zig/issues/7702
And I have actually taken a crack at implementing the AVX512 intrinsics in the Zig compiler as builtin functions on my personal fork of the repo. But it is a non-trivial task - there are over 450 distinct instructions across the entire AVX512 feature set, and over 100 for AVX2. And I'm only focusing on support for the LLVM backend, which does the heavy lifting in the codegen phase - for the self-hosted backend, getting the register allocation and instruction scheduling correct for all the intrinsics is going to involve lots of trial and error.
-
I wish more people understood that you absolutely need such intrinsics for fast software; there is no way around that.
https://github.com/AuburnSounds/intel-intrinsics
-
Indeed. This is how ripgrep works. It's compiled for just plain `x86_64`, but it looks for whether things like AVX2 are enabled. And if so, uses vector algorithms for substring and multi-substring search. The nice thing about dealing with strings is that the "coarse" requirement is already somewhat natural to the domain.
But, this functionality is absolutely critical. It doesn't even have to be automatic. Just the ability to compile functions with certain ISA extensions enabled, and then only call them when the requisite CPU features are enabled is enough.
In a nutshell: https://github.com/BurntSushi/memchr/blob/8037d11b4357b0f07b...
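A minimal Rust sketch of that pattern, for illustration only: it is not the memchr or ripgrep code. The function names (`count_byte`, `count_byte_avx2`, `count_byte_fallback`) are made up here, and counting a byte stands in for real substring search. One function is compiled with AVX2 enabled, and it is only called after a runtime check says the CPU supports it.

```rust
/// Portable fallback, compiled for the baseline target; runs on any CPU.
fn count_byte_fallback(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}

/// Same job, but this one function is compiled with AVX2 enabled.
/// Calling it on a CPU without AVX2 is undefined behavior, hence `unsafe`.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn count_byte_avx2(haystack: &[u8], needle: u8) -> usize {
    use std::arch::x86_64::*;
    unsafe {
        // Broadcast the needle into all 32 byte lanes.
        let splat = _mm256_set1_epi8(needle as i8);
        let mut count = 0usize;
        let mut chunks = haystack.chunks_exact(32);
        for chunk in &mut chunks {
            // Unaligned 32-byte load, lane-wise compare, one mask bit per lane.
            let v = _mm256_loadu_si256(chunk.as_ptr() as *const __m256i);
            let eq = _mm256_cmpeq_epi8(v, splat);
            count += (_mm256_movemask_epi8(eq) as u32).count_ones() as usize;
        }
        // Fewer than 32 bytes left: finish with the scalar loop.
        count + count_byte_fallback(chunks.remainder(), needle)
    }
}

/// Public entry point: picks the implementation at runtime.
pub fn count_byte(haystack: &[u8], needle: u8) -> usize {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just verified on this CPU.
            return unsafe { count_byte_avx2(haystack, needle) };
        }
    }
    count_byte_fallback(haystack, needle)
}
```

Callers just use `count_byte(b"hello world", b'o')` and never see the dispatch. The check is done per call here for simplicity; a real library would typically amortize it (for example by caching a function pointer), which is where the "coarse" requirement comes in.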
-
Implementing Arm semantics on x86, or the other way around, requires ~5 instructions, but if we generalize the definition to allow reordering (e.g. Highway's ReorderWidenMulAccumulate [1]), it's only 2 instructions.
1: https://github.com/google/highway/blob/master/g3doc/quick_re...