Top 8 intrinsic Open-Source Projects
-
camellia-simd-aesni
Camellia cipher SIMD vector implementations for x86 (with AES-NI, VAES and/or GFNI instructions), ARM (with ARMv8 Crypto Extension instructions) and POWER (with VMX+VSX+crypto instructions)
Project mention: Llamafile 0.7 Brings AVX-512 Support: 10x Faster Prompt Eval Times for AMD Zen 4 | news.ycombinator.com | 2024-03-31
The bf16 dot instruction replaces 6 instructions: https://github.com/google/highway/blob/master/hwy/ops/x86_12...
Project mention: Distil-Whisper: distilled version of Whisper that is 6 times faster, 49% smaller | news.ycombinator.com | 2023-10-31
Just a point of clarification - faster-whisper references it but ctranslate2[0] is what's really doing the magic here.
Ctranslate2 is a sleeper powerhouse project that enables a lot. They should be up front and center and get the credit they deserve.
[0] - https://github.com/OpenNMT/CTranslate2
I wish more people understood that you absolutely need such intrinsics for fast software; there is no way around that.
https://github.com/AuburnSounds/intel-intrinsics
Project mention: Linux 6.5 Last Minute Fixes a Performance Regression, 34% Drop in a Benchmark | news.ycombinator.com | 2023-08-28
> camellia_aesni_avx_x86_64
An interesting point here is that AES-NI can be used to accelerate a host of things other than AES. In this case, it's because the S-box can take advantage of the AES S-Box (SubBytes) instruction: https://github.com/jkivilin/camellia-simd-aesni; https://kernel.googlesource.com/pub/scm/linux/kernel/git/sha....
Similar acceleration has been done with SM4, the Chinese analogue of AES. https://github.com/mjosaarinen/sm4ni
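To illustrate the comment above, here is a minimal software model (not the actual code from camellia-simd-aesni or the kernel) of how a last-round AES instruction can be reduced to a bare S-box lookup. It assumes the standard definition of the last AES round as ShiftRows, then SubBytes, then an XOR with the round key, so undoing ShiftRows beforehand and passing an all-zero key leaves only SubBytes. The function names `aesenclast_model` and `sub_bytes_only` are hypothetical, chosen for this sketch.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def gf_inv(a):
    """Multiplicative inverse in GF(2^8), computed as a^254; 0 maps to 0."""
    if a == 0:
        return 0
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r


def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def aes_sbox(x):
    """AES S-box: field inversion followed by the AES affine transform."""
    b = gf_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63


SBOX = [aes_sbox(x) for x in range(256)]


def shift_rows(s):
    """AES ShiftRows on a 16-byte state stored column-major (index = row + 4*col)."""
    return [s[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]


def inv_shift_rows(s):
    """Inverse of shift_rows."""
    return [s[r + 4 * ((c - r) % 4)] for c in range(4) for r in range(4)]


def aesenclast_model(state, round_key):
    """Software model of a last AES round: ShiftRows, SubBytes, XOR round key."""
    shifted = shift_rows(state)
    return [SBOX[b] ^ k for b, k in zip(shifted, round_key)]


def sub_bytes_only(state):
    """Isolate SubBytes: cancel ShiftRows up front and use an all-zero key."""
    return aesenclast_model(inv_shift_rows(state), [0] * 16)


if __name__ == "__main__":
    state = list(range(16))
    print(sub_bytes_only(state) == [SBOX[b] for b in state])
```

In vectorized code the inverse ShiftRows step would typically be a single byte shuffle. A cipher whose S-box differs from the AES one, as Camellia's does, would additionally need conversions before and after this step, which the sketch does not attempt.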
intrinsics related posts
- Llamafile 0.7 Brings AVX-512 Support: 10x Faster Prompt Eval Times for AMD Zen 4
- Permuting Bits with GF2P8AFFINEQB
- AMD EPYC 97x4 “Bergamo” CPUs: 128 Zen 4c CPU Cores for Servers, Shipping Now
- 10~17x faster than what? A performance analysis of Intel's x86-SIMD-sort (AVX-512)
- The Most Useful Numbers You've Never Heard Of (Veritasium video on p-adic numbers)
- SIMD with Zig
- Intel Publishes Blazing Fast AVX-512 Sorting Library, Numpy Switching To It For 10~17x Faster Sorts
Index
What are some of the best open-source intrinsic projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | highway | 3,645 |
2 | CTranslate2 | 2,776 |
3 | faster | 1,548 |
4 | simd_utils | 80 |
5 | intel-intrinsics | 66 |
6 | peakperf | 56 |
7 | safe_arch | 41 |
8 | camellia-simd-aesni | 13 |