Top 23 C++ Simd Projects
-
ncnn
ncnn is a high-performance neural network inference framework optimized for the mobile platform
-
simdjson
Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
-
GLM
As for math, that was the easiest choice as of yet. No doubt, GLM is a "gold standard" at this point. For OpenGL it is, at least. But, like with a lot of the other APIs, I decided to build a wrapper around it rather than directly reference the library in the engine's code. And for physics, well, I had not come upon that answer just yet. I did try to make my own physics logic at some point. And while it was, surprisingly, successful, I wanted more than just a simple physics layer. I wanted something more complex and, more importantly, faster than my implementation. I have not decided upon a physics library yet. But I'll cross that bridge when I come to it.
-
highway
Project mention: SIMD Perlin Noise: Beating the Compiler with SSE | news.ycombinator.com | 2025-07-23
Yes indeed, it's about 500 LOC in https://github.com/google/highway/blob/master/hwy/ops/generi....
-
usearch
Fast Open-Source Search & Clustering engine × for Vectors & Arbitrary Objects × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram
In case you are searching for fun use-cases, here's how one can experiment with weird similarity metrics & kNN data-structures via Cppyy (for C++ kernel), Numba (for Python), or PeachPy (for x86 Asm), interacting with a precompiled engine: https://github.com/unum-cloud/usearch/blob/main/python/READM...
-
ispc
This flexibility of CUDA is a software facility, which is independent of the hardware implementation.
For any SIMD processor one can write a compiler that translates a program written for the SIMT model into SIMD instructions. For example, for the Intel/AMD CPUs with SSE4/AVX/AVX-512 ISAs, there exists a compiler of this kind (ispc: https://github.com/ispc/ispc).
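To make the SIMT-to-SIMD translation described above concrete, here is a minimal sketch of how a per-thread branch can be turned into masked, branch-free work over a fixed number of lanes. It is plain C++ that only emulates the idea; it is not ispc output and uses no ispc API. The lane count `kLanes`, the types `Vec` and `Mask`, and the function `abs_or_double` are all hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed vector width; real hardware widths vary.
constexpr std::size_t kLanes = 8;
using Vec  = std::array<float, kLanes>;
using Mask = std::array<bool, kLanes>;

// SIMT-style source, written once per "thread":
//     if (x < 0) y = -x; else y = x * 2;
// A SIMT-to-SIMD translation can evaluate both branches for every lane and
// then use a per-lane mask to select the result, so no lane really branches.
inline Vec abs_or_double(const Vec& x) {
    Mask is_negative{};
    Vec then_value{};
    Vec else_value{};
    Vec y{};

    // Step 1: one vector compare produces the mask.
    for (std::size_t lane = 0; lane < kLanes; ++lane)
        is_negative[lane] = x[lane] < 0.0f;

    // Step 2: both sides of the branch run on all lanes.
    for (std::size_t lane = 0; lane < kLanes; ++lane)
        then_value[lane] = -x[lane];
    for (std::size_t lane = 0; lane < kLanes; ++lane)
        else_value[lane] = x[lane] * 2.0f;

    // Step 3: a blend (select) picks the per-lane result using the mask.
    for (std::size_t lane = 0; lane < kLanes; ++lane)
        y[lane] = is_negative[lane] ? then_value[lane] : else_value[lane];

    return y;
}
```

Each inner loop stands in for what would be a single vector instruction (compare, negate, multiply, blend) on real SIMD hardware. The cost of this scheme is that both branches are always computed, which is why divergent control flow is expensive in either model.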
-
-
xsimd
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)
Thanks, that's an important caveat!
> Meanwhile xsimd (https://github.com/xtensor-stack/xsimd) has the feature level as a template parameter on its vector objects
That's pretty cool because you can write function templates and instantiate different versions that you can select at runtime!
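The pattern the comment describes (a feature level as a template parameter, with one instantiation picked at runtime) can be sketched without any SIMD library at all. The following is an illustration under assumptions, not xsimd's actual API: the tag types `scalar_tag` and `wide4_tag`, the enum `FeatureLevel`, and the functions `sum` and `sum_dispatch` are hypothetical, and the "wide" kernel only imitates four lanes with a small array instead of using real intrinsics.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical feature-level tags (stand-ins, not xsimd's types).
struct scalar_tag {};
struct wide4_tag {};

// One kernel per feature level, selected by template argument.
template <class Arch>
float sum(const float* data, std::size_t n);

// Baseline kernel: plain scalar loop.
template <>
inline float sum<scalar_tag>(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// "Wide" kernel: four independent accumulators imitate four SIMD lanes,
// followed by a scalar loop for the leftover elements.
template <>
inline float sum<wide4_tag>(const float* data, std::size_t n) {
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (std::size_t lane = 0; lane < 4; ++lane)
            acc[lane] += data[i + lane];
    float total = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (; i < n; ++i) total += data[i];
    return total;
}

// The runtime value would normally come from a CPU feature query.
enum class FeatureLevel { scalar, wide4 };

// Runtime selection between the compile-time instantiations.
inline float sum_dispatch(FeatureLevel level, const float* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    switch (level) {
        case FeatureLevel::wide4:  return sum<wide4_tag>(data, n);
        case FeatureLevel::scalar: break;
    }
    return sum<scalar_tag>(data, n);
}
```

A caller would determine the `FeatureLevel` once at startup and pass it to `sum_dispatch`. Because every kernel is an ordinary template instantiation, all versions live in the same binary and the choice costs a single branch per call.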
-
Simd
C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for x86/x64, NEON for ARM. (by ermig1979)
-
proton
Fastest SQL pipeline engine in a single C++ binary, for stream processing, analytics, observability and AI. (by timeplus-io)
Project mention: Show HN: Open-Source C++ Apache Iceberg Client with Write Support | news.ycombinator.com | 2025-03-20
-
fast_float
Fast and exact implementation of the C++ from_chars functions for number types: 4x to 10x faster than strtod, part of GCC 12, Chromium, Redis and WebKit/Safari
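For readers unfamiliar with the interface that fast_float implements, here is a small sketch using the standard library's `std::from_chars` for `double`. It assumes a C++17 standard library that provides the floating-point overloads (not every toolchain does). The helper name `parse_double` is hypothetical and is not part of fast_float or the standard.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parse all of `text` as a double.
// Returns std::nullopt on a parse error or if trailing characters remain.
inline std::optional<double> parse_double(std::string_view text) {
    double value = 0.0;
    const char* first = text.data();
    const char* last = text.data() + text.size();

    // from_chars is locale-independent, does not allocate, and does not throw.
    const std::from_chars_result result = std::from_chars(first, last, value);

    if (result.ec != std::errc{}) return std::nullopt;  // invalid or out of range
    if (result.ptr != last) return std::nullopt;        // leftover input such as "12abc"
    return value;
}
```

For example, `parse_double("3.25")` yields a value while `parse_double("12abc")` yields `std::nullopt`. Unlike `strtod`, the function reports where parsing stopped through `result.ptr` and reports errors through `result.ec` rather than through `errno`.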
-
kfr
Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)
-
-
DirectXMath
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
-
ada
WHATWG-compliant and fast URL parser written in modern C++, part of Internet Archive, Node.js, ClickHouse, Redpanda, Kong, Telegram, Adguard, Datadog and Cloudflare Workers.
-
simdutf
Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension, LoongArch64, POWER. Part of Node.js, WebKit/Safari, Ladybird, Chromium, Cloudflare Workers and Bun.
Project mention: Simdutf: Fast Unicode Validation and Transcoding | news.ycombinator.com | 2025-06-05
-
Vc
Project mention: Understanding SIMD: Infinite Complexity of Trivial Problems | news.ycombinator.com | 2024-11-30
I'm surprised no one has mentioned Vc. I found ispc clunky and not as performant, and std::simd didn't support some useful math ops like rsqrt. Vc has been around for years, I have no trouble including it in my code, it has masking and many of the most useful math ops, and I can get over 1 TF/s on a consumer-grade Ryzen and at least 3 TF/s on the big Epyc CPUs.
https://github.com/VcDevel/Vc
-
-
-
eve
Here is a bunch of simple examples: https://github.com/jfalcou/eve/blob/fb093a0553d25bb8114f1396...
I personally think we have the following strengths:
* Algorithms. Writing SIMD loops is very hard. We give you a lot of ready-to-go loops (find, search, remove, set_intersection, to name a few).
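To illustrate why even a `find` loop takes care to write in SIMD style, here is a rough sketch in plain C++ that mimics the usual structure: a blocked main loop that builds a match bitmask, then a scalar tail for the leftover elements. It does not use eve or any intrinsics; the block width `kBlock` and the function `find_blocked` are hypothetical names used only for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical lane count standing in for a hardware vector width.
constexpr std::size_t kBlock = 8;

// Returns the index of the first element equal to `needle`, or `n` if absent.
inline std::size_t find_blocked(const std::int32_t* data, std::size_t n,
                                std::int32_t needle) {
    std::size_t i = 0;

    // Main loop: handle full blocks only, so no read goes past the end.
    for (; i + kBlock <= n; i += kBlock) {
        // Emulates a vector compare followed by a "movemask": bit k is set
        // when lane k matches.
        unsigned match_bits = 0;
        for (std::size_t lane = 0; lane < kBlock; ++lane)
            match_bits |= static_cast<unsigned>(data[i + lane] == needle) << lane;

        if (match_bits != 0) {
            // Emulates "count trailing zeros" to locate the first match.
            std::size_t lane = 0;
            while ((match_bits & 1u) == 0) {
                match_bits >>= 1;
                ++lane;
            }
            return i + lane;
        }
    }

    // Tail: fewer than kBlock elements remain, so fall back to scalar code.
    for (; i < n; ++i)
        if (data[i] == needle) return i;

    return n;
}
```

The details that are easy to get wrong are visible even in this emulation: the loop bound must exclude partial blocks, the first match inside a block has to be extracted from the mask, and the tail needs its own handling. A library of ready-made algorithms packages those concerns once.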
-
-
hlslpp
Project mention: Hlslpp: Math library using HLSL syntax with multiplatform SIMD support | news.ycombinator.com | 2025-06-11
-
C++ Simd related posts
-
How to Think About GPUs
-
ISPC: Implicit SPMD Program Compiler
-
Simdutf: Fast Unicode Validation and Transcoding
-
USearch: Similarity Search and Clustering Engine for Vectors and Texts
-
Three Fundamental Flaws of SIMD
-
Show HN: Less Slow C++
-
AWS Graviton 3 > Graviton 4 for Vector Similarity Search
-
Index
What are some of the best open-source Simd projects in C++? This list will help you:
# | Project | Stars |
---|---|---|
1 | ncnn | 21,982 |
2 | simdjson | 21,127 |
3 | GLM | 10,180 |
4 | highway | 4,983 |
5 | usearch | 3,063 |
6 | ispc | 2,738 |
7 | ozz-animation | 2,651 |
8 | xsimd | 2,472 |
9 | Simd | 2,196 |
10 | proton | 1,877 |
11 | fast_float | 1,861 |
12 | kfr | 1,760 |
13 | SatDump | 1,732 |
14 | DirectXMath | 1,700 |
15 | ada | 1,581 |
16 | simdutf | 1,500 |
17 | Vc | 1,491 |
18 | sse2neon | 1,421 |
19 | libsimdpp | 1,285 |
20 | eve | 1,241 |
21 | FastNoise2 | 1,202 |
22 | hlslpp | 944 |
23 | Fastor | 803 |