simd_utils
camellia-simd-aesni
Metric | simd_utils | camellia-simd-aesni |
---|---|---|
Mentions | 1 | 1 |
Stars | 80 | 13 |
Growth | - | - |
Activity | 6.6 | 1.2 |
Latest commit | about 1 month ago | about 1 year ago |
Language | C | C |
License | BSD 2-clause "Simplified" License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
simd_utils
Trying to convert permute AVX512 instruction to AVX2/AVX calls
https://github.com/JishinMaster/simd_utils/tree/master
camellia-simd-aesni
Linux 6.5 Last Minute Fixes a Performance Regression, 34% Drop in a Benchmark
> camellia_aesni_avx_x86_64
An interesting point here is that AES-NI can be used to accelerate a host of things other than AES. In this case, Camellia's S-box can be computed with the help of the AES S-box (SubBytes) step that the AES-NI instructions expose: https://github.com/jkivilin/camellia-simd-aesni; https://kernel.googlesource.com/pub/scm/linux/kernel/git/sha....
Similar acceleration has been done with SM4, the Chinese analogue of AES: https://github.com/mjosaarinen/sm4ni
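To make the SubBytes point concrete, here is a minimal sketch of how the AES S-box can be isolated from an AES last-round operation. It is a portable scalar model written for illustration, not code from camellia-simd-aesni or sm4ni. All function names (`aesenclast_model`, `inv_shift_rows`, `sbox16`, `sbox16_self_test`) are hypothetical. The model assumes the usual AES last round: ShiftRows, SubBytes, then XOR with the round key. With an all-zero round key and the ShiftRows permutation undone, what remains is the S-box applied to each of the 16 bytes. A cipher such as Camellia or SM4 would additionally need its own affine pre- and post-transforms around this step, which the sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Multiply two elements of GF(2^8) modulo the AES polynomial 0x11b. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    while (b) {
        if (b & 1)
            r ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0x00));
        b >>= 1;
    }
    return r;
}

static uint8_t rotl8(uint8_t v, int n)
{
    return (uint8_t)((v << n) | (v >> (8 - n)));
}

/* AES S-box: field inverse (x^254, with 0 mapped to 0), then an affine map. */
static uint8_t aes_sbox(uint8_t x)
{
    uint8_t inv = 1, base = x;
    for (int e = 254; e != 0; e >>= 1) {
        if (e & 1)
            inv = gf_mul(inv, base);
        base = gf_mul(base, base);
    }
    return (uint8_t)(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^
                     rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63);
}

/* Scalar model of an AES last round: ShiftRows, SubBytes, XOR round key.
 * The state is column-major: byte index = row + 4 * column. */
static void aesenclast_model(uint8_t out[16], const uint8_t in[16],
                             const uint8_t key[16])
{
    for (int c = 0; c < 4; c++)
        for (int r = 0; r < 4; r++)
            out[r + 4 * c] =
                (uint8_t)(aes_sbox(in[r + 4 * ((c + r) % 4)]) ^ key[r + 4 * c]);
}

/* Undo ShiftRows (a byte shuffle in a SIMD implementation). */
static void inv_shift_rows(uint8_t out[16], const uint8_t in[16])
{
    for (int c = 0; c < 4; c++)
        for (int r = 0; r < 4; r++)
            out[r + 4 * ((c + r) % 4)] = in[r + 4 * c];
}

/* Apply the AES S-box to 16 bytes using only the last-round primitive. */
void sbox16(uint8_t out[16], const uint8_t in[16])
{
    uint8_t zero_key[16], tmp[16];
    memset(zero_key, 0, sizeof zero_key);   /* zero key: the XOR is a no-op */
    aesenclast_model(tmp, in, zero_key);
    inv_shift_rows(out, tmp);
}

/* Check sbox16 against the byte-at-a-time S-box on a non-trivial input. */
void sbox16_self_test(void)
{
    uint8_t in[16], out[16];
    for (int i = 0; i < 16; i++)
        in[i] = (uint8_t)(i * 17 + 3);
    sbox16(out, in);
    for (int i = 0; i < 16; i++)
        assert(out[i] == aes_sbox(in[i]));
}
```

Calling `sbox16` on the bytes 0x00, 0x01, 0x02, ... yields the first row of the AES S-box table (0x63, 0x7c, 0x77, ...). On hardware with AES-NI, the modelled last round would correspond to a single instruction operating on all 16 bytes at once, which is where the speed-up comes from.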
What are some alternatives?
highway - Performance-portable, length-agnostic SIMD with runtime dispatch
sm4ni - Demonstration that AES-NI instructions can be used to implement the Chinese Encryption Standard SM4
cglm - 📽 Highly Optimized 2D / 3D Graphics Math (glm) for C
sleef - SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
ara - The PULP Ara is a 64-bit Vector Unit, compatible with the RISC-V Vector Extension Version 1.0, working as a coprocessor to CORE-V's CVA6 core
simde - Implementations of SIMD instruction sets for systems which don't natively support them.
intel-intrinsics - The Dlang SIMD library
Unicorn Engine - Unicorn CPU emulator framework (ARM, AArch64, M68K, Mips, Sparc, PowerPC, RiscV, S390x, TriCore, X86)