Top 14 C++ Avx2 Projects
-
simdjson
Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
Project mention: Make Ubuntu packages 90% faster by rebuilding them | news.ycombinator.com | 2025-03-18
I think parsing once into a faster format (sqlite3 or parquet) would be more beneficial.
https://simdjson.org/
-
I quite like highway.
As mentioned, last time I tried vqsort for RVV it was surprisingly slow.
I tried to replicate it yesterday, but noticed that vqsort is now disabled for RVV: https://github.com/google/highway/blob/400fbf20f2e40b984be12...
Does highway support sorting networks for non-128-bit vector registers?
When I tried to compile it for AVX512, the BaseCase seems to only use xmm registers: https://godbolt.org/z/qr9xoTGKn
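For readers unfamiliar with the term used in the comment above: a sorting network is a fixed, data-independent sequence of compare-exchange steps, which is why it maps well onto vector registers. The scalar sketch below only illustrates the idea for four elements. It is not Highway's vqsort code, and `compare_exchange` and `sort4_network` are hypothetical names chosen for this example.

```cpp
#include <array>
#include <utility>

// Compare-exchange step: after the call, lo <= hi.
inline void compare_exchange(int& lo, int& hi) {
    if (hi < lo) std::swap(lo, hi);
}

// Hypothetical helper (not a Highway API): sorts exactly four ints in
// ascending order using the classic 5-comparator network. The sequence
// of comparisons never depends on the data, so a SIMD version could
// replace each step with a vector min/max pair.
inline void sort4_network(std::array<int, 4>& v) {
    compare_exchange(v[0], v[1]);
    compare_exchange(v[2], v[3]);
    compare_exchange(v[0], v[2]);
    compare_exchange(v[1], v[3]);
    compare_exchange(v[1], v[2]);
}
```

A vectorized base case would typically run several such networks in parallel, one per lane, which is where the register width asked about above starts to matter.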
-
Thanks for the added context on the builds! As a "foreign" BW player and fellow speech processing researcher, I agree shallow contextual biasing should help. While not difficult to implement, most generally available ASR solutions don't make it easy to use. There's a PR in ctranslate2 implementing the same feature so that it could be exposed in faster-whisper: https://github.com/OpenNMT/CTranslate2/pull/1789
-
DirectXMath
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
-
Project mention: Understanding SIMD: Infinite Complexity of Trivial Problems | news.ycombinator.com | 2024-11-30
I'm surprised no one has mentioned Vc. I found ispc clunky and not as performant, and std::simd didn't support some useful math ops like rsqrt. Vc has been around for years, I have no trouble including it in my codes, it has masking and many of the most useful math ops, and I can get over 1 TF/s on a consumer-grade Ryzen and at least 3 TF/s on the big Epyc CPUs.
https://github.com/VcDevel/Vc
-
simdutf
Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension, LoongArch64, POWER. Part of Node.js, WebKit/Safari, Ladybird, Chromium, Cloudflare Workers and Bun.
Lemire and collaborators often write in C++ intrinsics, or thin platform-specific wrappers on top of them.
I count ~8 different implementations [1], which demonstrates considerable commitment :) Personally, I prefer to write once with portable intrinsics.
https://github.com/simdutf/simdutf/tree/1d5b5cd2b60850954df5...
-
Here is a bunch of simple examples: https://github.com/jfalcou/eve/blob/fb093a0553d25bb8114f1396...
I personally think we have the following strengths:
* Algorithms. Writing SIMD loops is very hard. We give you a lot of ready-to-go loops (find, search, remove, and set_intersection, to name a few).
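To give a sense of what a hand-written SIMD `find` loop involves, here is a minimal sketch using raw AVX2 intrinsics. It does not use eve and is not how eve implements its algorithms. The function names are hypothetical, and the code assumes GCC or Clang (for the `target` attribute, `__builtin_cpu_supports`, and `__builtin_ctz`), with a scalar fallback on other targets.

```cpp
#include <cstddef>
#include <cstdint>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define SKETCH_HAVE_X86 1
#endif

// Scalar reference: index of the first byte equal to needle, or n if absent.
std::size_t find_byte_scalar(const std::uint8_t* data, std::size_t n,
                             std::uint8_t needle) {
    for (std::size_t i = 0; i < n; ++i)
        if (data[i] == needle) return i;
    return n;
}

#ifdef SKETCH_HAVE_X86
// AVX2 version: compares 32 bytes per iteration, then finishes the
// remaining tail with a scalar loop.
__attribute__((target("avx2")))
std::size_t find_byte_avx2(const std::uint8_t* data, std::size_t n,
                           std::uint8_t needle) {
    const __m256i pattern = _mm256_set1_epi8(static_cast<char>(needle));
    std::size_t i = 0;
    for (; i + 32 <= n; i += 32) {
        // Unaligned load, so data needs no particular alignment.
        const __m256i chunk =
            _mm256_loadu_si256(reinterpret_cast<const __m256i*>(data + i));
        const __m256i eq = _mm256_cmpeq_epi8(chunk, pattern);
        // One bit per byte lane; bit k is set when byte i + k matched.
        const unsigned mask = static_cast<unsigned>(_mm256_movemask_epi8(eq));
        if (mask != 0)
            return i + static_cast<std::size_t>(__builtin_ctz(mask));
    }
    for (; i < n; ++i)
        if (data[i] == needle) return i;
    return n;
}
#endif

// Dispatch at run time so the binary still works on CPUs without AVX2.
std::size_t find_byte(const std::uint8_t* data, std::size_t n,
                      std::uint8_t needle) {
#ifdef SKETCH_HAVE_X86
    if (__builtin_cpu_supports("avx2")) return find_byte_avx2(data, n, needle);
#endif
    return find_byte_scalar(data, n, needle);
}
```

Even this simple case needs a tail loop, a mask-to-index conversion, and a run-time CPU check, which is the kind of detail a library of ready-made loops is meant to hide.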
-
Project mention: Copilot implemented a ThreadPool to serve as a replacement for OpenMP | news.ycombinator.com | 2025-05-09
-
I'm upgrading AVX2 to AVX-512 (where possible) in my reimplementation of the RandomX algorithm: https://github.com/patrulek/modernRX
C++ Avx2 discussion
C++ Avx2 related posts
-
Copilot implemented a ThreadPool to serve as a replacement for OpenMP
-
Three Fundamental Flaws of SIMD
-
C Is Not Suited to SIMD
-
Intel Releases x86-SIMD-sort 6.0 for 10x faster AVX2/AVX-512 Sorting
-
User-Space Interrupts (2021)
-
Highway – Portable SIMD Library
-
SIMD-accelerated computer vision on a $2 microcontroller
-
Index
What are some of the best open-source Avx2 projects in C++? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | simdjson | 20,284 |
| 2 | highway | 4,615 |
| 3 | CTranslate2 | 3,791 |
| 4 | DirectXMath | 1,652 |
| 5 | Vc | 1,483 |
| 6 | simdutf | 1,366 |
| 7 | libsimdpp | 1,273 |
| 8 | eve | 1,188 |
| 9 | x86-simd-sort | 931 |
| 10 | toys | 355 |
| 11 | sse-popcount | 337 |
| 12 | CPURasterizer | 179 |
| 13 | std_find_simd | 21 |
| 14 | modernRX | 14 |