Top 15 C++ Avx2 Projects
-
simdjson
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
-
StringZilla
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging SWAR and SIMD on Arm Neon and x86 AVX2 & AVX-512-capable chips to accelerate search, sort, edit distances, alignment scores, etc 🦖
-
DirectXMath
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
-
simdutf
Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js and Bun.
Project mention: Tips on adding JSON output to your command line utility. (2021) | news.ycombinator.com | 2024-04-20
It's also supported by simdjson [0] (which has a lot of language bindings [1]):
> Multithreaded processing of gigantic Newline-Delimited JSON (ndjson) and related formats at 3.5 GB/s
[0] https://simdjson.org/
[1] https://github.com/simdjson/simdjson?tab=readme-ov-file#bind...
Project mention: Llamafile 0.7 Brings AVX-512 Support: 10x Faster Prompt Eval Times for AMD Zen 4 | news.ycombinator.com | 2024-03-31
The bf16 dot instruction replaces 6 instructions: https://github.com/google/highway/blob/master/hwy/ops/x86_12...
Project mention: Distil-Whisper: distilled version of Whisper that is 6 times faster, 49% smaller | news.ycombinator.com | 2023-10-31
Just a point of clarification - faster-whisper references it but CTranslate2 [0] is what's really doing the magic here.
CTranslate2 is a sleeper powerhouse project that enables a lot. They should be up front and center and get the credit they deserve.
[0] - https://github.com/OpenNMT/CTranslate2
Project mention: Measuring energy usage: regular code vs. SIMD code | news.ycombinator.com | 2024-02-19
The 3.5x energy-efficiency gap between serial and SIMD code becomes even larger when
A. you do byte-level processing instead of float words;
B. you use embedded, IoT, and other low-energy devices.
A few years ago I compared an Nvidia Jetson Xavier (long before the Orin release), an Intel-based MacBook Pro with a Core i9, and AVX-512 capable CPUs on substring search benchmarks.
On Xavier one can quite easily disable/enable cores and reconfigure power usage. At peak I got to 4.2 GB/J, which was an 8.3x improvement in efficiency over LibC in substring search operations. The comparison table is still available in the older README: https://github.com/ashvardanian/StringZilla/tree/v2.0.2?tab=...
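As a rough illustration of the byte-level processing the comment refers to, here is a minimal SWAR ("SIMD within a register") sketch that counts occurrences of a byte eight lanes at a time using plain 64-bit arithmetic. It is not taken from StringZilla or any other project listed here; the names `swar_match_mask` and `count_byte` are hypothetical, and a real AVX2 or NEON kernel would use wider registers and intrinsics instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns a mask whose 0x80 bit is set in every byte
// lane of `word` that equals `needle`; all other bits are zero.
inline std::uint64_t swar_match_mask(std::uint64_t word, std::uint8_t needle) {
    const std::uint64_t ones = 0x0101010101010101ULL;
    const std::uint64_t lows = 0x7F7F7F7F7F7F7F7FULL;
    // XOR with the broadcast needle turns matching lanes into 0x00.
    const std::uint64_t x = word ^ (ones * needle);
    // Adding 0x7F to the low 7 bits sets the lane's high bit when any low
    // bit is non-zero; the sum never carries into the neighbouring lane.
    const std::uint64_t t = (x & lows) + lows;
    // The high bit survives the negation only where the whole lane was zero.
    return ~(t | x | lows);
}

// Hypothetical helper: counts bytes equal to `needle` in data[0, size).
inline std::size_t count_byte(const char* data, std::size_t size, char needle) {
    const std::uint64_t ones = 0x0101010101010101ULL;
    const std::uint8_t n = static_cast<std::uint8_t>(needle);
    std::size_t count = 0;
    std::size_t i = 0;
    for (; i + 8 <= size; i += 8) {
        std::uint64_t word;
        std::memcpy(&word, data + i, sizeof(word));  // safe unaligned load
        const std::uint64_t mask = swar_match_mask(word, n);
        // Move each match flag to bit 0 of its lane, then sum the eight
        // lanes into the top byte with a single multiplication.
        count += static_cast<std::size_t>(((mask >> 7) * ones) >> 56);
    }
    // Scalar tail for the last (size % 8) bytes.
    for (; i < size; ++i) {
        count += (static_cast<std::uint8_t>(data[i]) == n) ? 1 : 0;
    }
    return count;
}
```

The sketch handles eight bytes per loop iteration without any vector instructions, which is why this style of code is often used as a portable fallback next to hardware SIMD paths.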
Project mention: SIMD based custom object and key-value pair sorting in C++ | news.ycombinator.com | 2024-02-14
I think all of these techniques check whether the input string is correct. For example see here https://github.com/WojciechMula/toys/blob/master/lookup-in-s...
C++ Avx2 related posts
- Tips on adding JSON output to your command line utility. (2021)
- Llamafile 0.7 Brings AVX-512 Support: 10x Faster Prompt Eval Times for AMD Zen 4
- Training great LLMs from ground zero in the wilderness as a startup
- Measuring energy usage: regular code vs. SIMD code
- From slow to SIMD: A Go optimization story
- simdjson: Parsing Gigabytes of JSON per Second
- Cray-1 performance vs. modern CPUs
Index
What are some of the best open-source Avx2 projects in C++? This list will help you:
# | Project | Stars |
---|---|---|
1 | simdjson | 18,362 |
2 | highway | 3,623 |
3 | CTranslate2 | 2,776 |
4 | StringZilla | 1,776 |
5 | DirectXMath | 1,481 |
6 | Vc | 1,417 |
7 | libsimdpp | 1,188 |
8 | simdutf | 948 |
9 | eve | 843 |
10 | x86-simd-sort | 794 |
11 | toys | 311 |
12 | sse-popcount | 309 |
13 | CPURasterizer | 155 |
14 | EveryCulling | 120 |
15 | std_find_simd | 18 |