camellia-simd-aesni
sleef
 | camellia-simd-aesni | sleef |
---|---|---|
Mentions | 1 | 17 |
Stars | 13 | 594 |
Growth | - | - |
Activity | 1.2 | 8.1 |
Last commit | about 1 year ago | 19 days ago |
Language | C | C |
License | MIT License | Boost Software License 1.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
camellia-simd-aesni
-
Linux 6.5 Last Minute Fixes a Performance Regression, 34% Drop in a Benchmark
> camellia_aesni_avx_x86_64
An interesting point here is that AES-NI can be used to accelerate a host of things other than AES. In this case, Camellia's S-box can be computed with the help of the AES S-box (SubBytes) step that the AES-NI instructions perform: https://github.com/jkivilin/camellia-simd-aesni; https://kernel.googlesource.com/pub/scm/linux/kernel/git/sha....
Similar acceleration has been done with SM4, the Chinese analogue of AES. https://github.com/mjosaarinen/sm4ni
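The S-box trick described above can be modeled in portable C. The AESENCLAST instruction applies ShiftRows and SubBytes and then XORs in a round key, so with an all-zero key and an inverse ShiftRows applied beforehand, only the per-byte AES S-box remains. The sketch below is a software model of that idea under those assumptions, not code from either repository: all function names are hypothetical, and a real implementation would presumably use the AES-NI instruction and a byte shuffle, plus extra affine transforms to map the AES S-box onto the Camellia or SM4 one.

```c
#include <stdint.h>

/* Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1. */
static uint8_t gf_mul(uint8_t a, uint8_t b) {
    uint8_t r = 0;
    while (b) {
        if (b & 1) r ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0));
        b >>= 1;
    }
    return r;
}

/* AES S-box: multiplicative inverse (x^254, with 0 -> 0), then the affine map. */
static uint8_t aes_sbox(uint8_t x) {
    uint8_t inv = 1;
    for (int i = 0; i < 254; i++) inv = gf_mul(inv, x);
    uint8_t y = inv;
    for (int n = 1; n <= 4; n++)
        y ^= (uint8_t)((inv << n) | (inv >> (8 - n)));  /* rotate left by n */
    return (uint8_t)(y ^ 0x63);
}

/* ShiftRows on a column-major 4x4 state: row r rotates left by r columns. */
static void shift_rows(const uint8_t in[16], uint8_t out[16]) {
    for (int c = 0; c < 4; c++)
        for (int r = 0; r < 4; r++)
            out[r + 4 * c] = in[r + 4 * ((c + r) % 4)];
}

/* Inverse of shift_rows. */
static void inv_shift_rows(const uint8_t in[16], uint8_t out[16]) {
    for (int c = 0; c < 4; c++)
        for (int r = 0; r < 4; r++)
            out[r + 4 * ((c + r) % 4)] = in[r + 4 * c];
}

/* Software model of AESENCLAST: ShiftRows, SubBytes, then XOR the round key. */
static void model_aesenclast(const uint8_t state[16], const uint8_t key[16],
                             uint8_t out[16]) {
    uint8_t t[16];
    shift_rows(state, t);
    for (int i = 0; i < 16; i++)
        out[i] = (uint8_t)(aes_sbox(t[i]) ^ key[i]);
}

/* Hypothetical helper: isolate SubBytes by undoing ShiftRows up front
 * and using an all-zero round key. */
static void sbox16(const uint8_t in[16], uint8_t out[16]) {
    static const uint8_t zero_key[16] = {0};
    uint8_t pre[16];
    inv_shift_rows(in, pre);
    model_aesenclast(pre, zero_key, out);
}
```

Calling `sbox16(in, out)` on any 16-byte block should leave `out[i]` equal to `aes_sbox(in[i])` for every byte, which is the property the hardware-accelerated versions rely on to process 16 S-box lookups per instruction.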
sleef
-
The Case of the Missing SIMD Code
I'm the main author of Highway, so I have some opinions :D Number of operations/platforms supported are important criteria.
A hopefully unbiased commentary:
SIMDe allows you to take existing nonportable intrinsics and get them to run on another platform. This is useful when you have a bunch of existing code and tight deadlines. The downside is less than optimal performance - a portable abstraction can be more efficient than forcing one platform to exactly match the semantics of another. Although a ton of effort has gone into SIMDe, sometimes it also resorts to autovectorization, which may or may not work.
Eigen and SLEEF are mostly math-focused projects that also have a portability layer. SLEEF is designed for C and thus has type suffixes which are rather verbose, see https://github.com/shibatch/sleef/blob/master/src/libm/sleef... But it offers a complete (more so than Highway's) libm.
-
Does anyone have any interest in my deep-learning framework?
But the other part about SIMD: I'm unsure if mgl-mat uses SIMD for transcendental functions or even for something like element-wise multiplication and division. SIMD, which numpy uses, easily provides a speed boost of 4-8 times. Libraries like sleef have been put to use by many.
- `constexpr` what?
- Advice on porting glibc trig functions to SIMD
-
SIMD intrinsics and the possibility of a standard library solution
Highway and Agner's VectorClass also have math functions. And SLEEF should definitely be mentioned.
-
Portable SIMD library
"SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT" - https://github.com/shibatch/sleef
- SIMD Library for Evaluating Elementary Functions, Vectorized Libm and DFT
-
C library for multiple-precision floating-point arithmetic with correct rounding
Not mentioned in the list of users is SLEEF (https://github.com/shibatch/sleef), which provides fast approximations for various elementary functions. (It generates the coefficients for its approximations with MPFR.)
SLEEF itself is used by PyTorch.
-
How to speed up array writes?
If you are looking at floats, there's https://sleef.org
-
Benchmarking sine approximations and interpolators.
It would be interesting to see SLEEF added in the benchmarks.
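Several of the posts above revolve around vectorized elementary functions. As a rough illustration of the idea only, the sketch below evaluates a degree-9 Taylor polynomial for sine on four floats at a time using the GCC/Clang vector extension. It is not SLEEF's algorithm or API: the names `v4f`, `splat`, `sin_poly4`, and `sin_array` are hypothetical, there is no range reduction, and the result is only reasonable for inputs roughly within [-pi/2, pi/2].

```c
#include <stddef.h>
#include <string.h>

/* Four packed floats (GCC/Clang vector extension, an assumption of this sketch). */
typedef float v4f __attribute__((vector_size(16)));

/* Broadcast one scalar to all four lanes. */
static v4f splat(float k) {
    v4f v = {k, k, k, k};
    return v;
}

/* sin(x) ~= x * (1 - x^2/3! + x^4/5! - x^6/7! + x^8/9!), evaluated with
 * Horner's scheme in x^2 on all four lanes at once. */
static v4f sin_poly4(v4f x) {
    v4f x2 = x * x;
    v4f p = splat(1.0f / 362880.0f);
    p = p * x2 - splat(1.0f / 5040.0f);
    p = p * x2 + splat(1.0f / 120.0f);
    p = p * x2 - splat(1.0f / 6.0f);
    p = p * x2 + splat(1.0f);
    return x * p;
}

/* Apply the approximation to an array: full vectors first, then a scalar tail. */
static void sin_array(const float *in, float *out, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        v4f x;
        memcpy(&x, in + i, sizeof x);   /* unaligned-safe load */
        v4f y = sin_poly4(x);
        memcpy(out + i, &y, sizeof y);  /* unaligned-safe store */
    }
    for (; i < n; i++) {
        v4f y = sin_poly4(splat(in[i]));  /* tail reuses the same polynomial */
        out[i] = y[0];
    }
}
```

A production library would add range reduction, handle special values, and use tuned minimax coefficients rather than Taylor terms, which is where projects such as SLEEF put most of their effort.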
What are some alternatives?
simd_utils - A header only library implementing common mathematical functions using SIMD intrinsics
nsimd - Agenium Scale vectorization library for CPUs and GPUs
sm4ni - Demonstration that AES-NI instructions can be used to implement the Chinese Encryption Standard SM4
yenten-arm-miner-yespowerr16 - ARM 64 CPU miner for Yespower variant algorithms
simde - Implementations of SIMD instruction sets for systems which don't natively support them.
sb-simd - A convenient SIMD interface for SBCL.
Unicorn Engine - Unicorn CPU emulator framework (ARM, AArch64, M68K, Mips, Sparc, PowerPC, RiscV, S390x, TriCore, X86)
vector-libm
crlibm - A mirror of the CRLibm project from INRIA Forge
xbyak_aarch64
rlibm-32 - RLibm for 32-bit representations (float and posit32)
FftSharp - A .NET Standard library for computing the Fast Fourier Transform (FFT) of real or complex data