C++ Simd

Open-source C++ projects categorized as Simd
Neon CPP Sse Avx Avx512

Top 23 C++ Simd Projects

  • ncnn

    ncnn is a high-performance neural network inference framework optimized for the mobile platform

    Project mention: AMD Funded a Drop-In CUDA Implementation Built on ROCm: It's Open-Source | news.ycombinator.com | 2024-02-12

    ncnn uses Vulkan for GPU acceleration, I've seen it used in a few projects to get AMD hardware support.

    https://github.com/Tencent/ncnn

  • simdjson

    Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

    Project mention: Wc2: Investigates optimizing 'wc', the Unix word count program | news.ycombinator.com | 2024-06-20

    State machines are great for complex situations, but when it comes to performance, it's not at all clear to me that they're the most scalable approach with modern systems.

    The data dependency between a loop iteration for each character might be pipelined really well when executed, and we can assume large enough L1/L2 cache for our lookup tables. But we're still using at least one lookup per character.

    Projects like https://github.com/simdjson/simdjson?tab=readme-ov-file#abou... are truly fascinating, because they're based on SIMD instructions that can process 64 or more bytes with a single instruction. Very much worth checking out the papers at the link above.
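A minimal sketch of the table-driven, one-lookup-per-character approach the comment describes, applied to word counting. The names `space_table` and `count_words` are hypothetical and come from neither wc2 nor simdjson; the sketch assumes ASCII whitespace only. It is the scalar baseline, not a SIMD implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: a 256-entry table that marks ASCII whitespace bytes.
inline const std::array<bool, 256>& space_table() {
    static const std::array<bool, 256> table = [] {
        std::array<bool, 256> t{};
        for (const char* p = " \t\n\v\f\r"; *p != '\0'; ++p) {
            t[static_cast<unsigned char>(*p)] = true;
        }
        return t;
    }();
    return table;
}

// Counts whitespace-separated words with a two-state machine.
// Each iteration depends on the state left by the previous one.
inline std::size_t count_words(std::string_view text) {
    const auto& is_space = space_table();
    std::size_t words = 0;
    bool in_word = false;  // state carried from one character to the next
    for (unsigned char c : text) {
        const bool space = is_space[c];  // one table lookup per input byte
        words += (!space && !in_word);   // count each space-to-word transition
        in_word = !space;
    }
    return words;
}
```

Calling `count_words("hello world")` walks the input one byte at a time. A SIMD design such as the one the comment points to would instead classify many bytes per instruction.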

  • GLM

    OpenGL Mathematics (GLM)

    Project mention: Release of GLM 1.0.0 | news.ycombinator.com | 2024-01-24
  • highway

    Performance-portable, length-agnostic SIMD with runtime dispatch

    Project mention: Highway – Portable SIMD Library | news.ycombinator.com | 2024-08-22
  • ispc

    Intel® Implicit SPMD Program Compiler

    Project mention: Implementing a GPU's Programming Model on a CPU | news.ycombinator.com | 2023-10-14

    This so-called GPU programming model existed for many decades before the first GPUs appeared, but at that time the compilers were not as good as the CUDA compilers, so the burden on the programmer was greater.

    As another poster has already mentioned, there exists a compiler for CPUs which has been inspired by CUDA and which has been available for many years: ISPC (Implicit SPMD Program Compiler), at https://github.com/ispc/ispc .

    NVIDIA has the very annoying habit of using a lot of terms that are different from those that have been previously used in computer science for decades. The worst is that NVIDIA has not invented new words, but they have frequently reused words that have been widely used with other meanings.

    SIMT (Single-Instruction Multiple Thread) is not the worst term coined by NVIDIA, but there was no need for yet another acronym. For instance they could have used SPMD (Single Program, Multiple Data Stream), which dates from 1988, two decades before CUDA.

    Moreover, SIMT is the same thing that was called "array of processes" by C.A.R. Hoare in August 1978 (in "Communicating Sequential Processes"), or "replicated parallel" by Occam in 1985 or "PARALLEL DO" by "OpenMP Fortran" in 1997-10 or "parallel for" by "OpenMP C and C++" in 1998-10.

    The only (but extremely important) innovation brought by CUDA is that the compiler is smart enough that the programmer does not need to know the structure of the processor, i.e. how many cores it has and how many SIMD lanes each core has. The CUDA compiler automatically distributes the work over the available SIMD lanes and cores, and in most cases the programmer does not care whether two executions of the per-item function run on two different cores or on two different SIMD lanes of the same core.
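A rough sketch of the programming model the comment describes: the programmer writes only the function for one data item, and a helper decides how the items are spread over the hardware. The helper `spmd_for` and the example `saxpy` are hypothetical names. The sketch does not show how ISPC or CUDA work internally; it splits the index range over `std::thread` workers and leaves any use of SIMD lanes to the compiler's auto-vectorizer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: calls body(i) once for every i in [0, count).
// The caller never sees how many workers exist or which one runs an index.
template <typename Body>
void spmd_for(std::size_t count, Body body) {
    const std::size_t workers =
        std::max<std::size_t>(1, std::thread::hardware_concurrency());
    const std::size_t chunk = (count + workers - 1) / workers;
    std::vector<std::thread> threads;
    for (std::size_t begin = 0; begin < count; begin += chunk) {
        const std::size_t end = std::min(count, begin + chunk);
        threads.emplace_back([=] {
            // A contiguous index range; a compiler may vectorize this loop.
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& t : threads) t.join();
}

// Example per-item function: out[i] = a * x[i] + y[i].
// Assumes x and y have the same length.
inline std::vector<float> saxpy(float a, const std::vector<float>& x,
                                const std::vector<float>& y) {
    std::vector<float> out(x.size());
    const float* xp = x.data();
    const float* yp = y.data();
    float* op = out.data();
    spmd_for(x.size(), [=](std::size_t i) { op[i] = a * xp[i] + yp[i]; });
    return out;
}
```

The lambda passed to `spmd_for` is the only code that mentions the data. Because each index writes a distinct output element, the result does not depend on how the range is split.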

  • ozz-animation

    Open source c++ skeletal animation library and toolset

  • xsimd

    C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)

    Project mention: GDlog: A GPU-Accelerated Deductive Engine | news.ycombinator.com | 2023-12-03

    https://github.com/xtensor-stack/xsimd

    GH topics > HashMap:

  • usearch

    Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍

    Project mention: Usearch: Single-File Similarity Search | news.ycombinator.com | 2024-08-09
  • StringZilla

    Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖

    Project mention: I'm Not a Fan of Strlcpy(3) | news.ycombinator.com | 2024-07-15

    Aside from the NULL-termination requirements, there is arguably another big design issue with libc strings. I believe that interfaces that may allocate memory must give you an opportunity to override the allocator. Aside from the SIMD implementation quality and throughput on Arm, that was one of the key reasons to start a new library: https://github.com/ashvardanian/StringZilla/blob/91d0a1a02fa...

    Also not a huge fan of locale controls and wchar APIs :)

  • Simd

    C++ image processing and machine learning library with using of SIMD: SSE, AVX, AVX-512, AMX for x86/x64, VMX(Altivec) and VSX(Power7) for PowerPC, NEON for ARM. (by ermig1979)

  • kfr

    Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)

  • DirectXMath

    DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps

  • Vc

    SIMD Vector Classes for C++

  • fast_float

    Fast and exact implementation of the C++ from_chars functions for number types: 4x to 10x faster than strtod, part of GCC 12, Chromium and WebKit/Safari

  • ada

    WHATWG-compliant and fast URL parser written in modern C++

    Project mention: Parsing URLs in Python | news.ycombinator.com | 2024-03-16

    ...

    can_ada is just the python bindings, largely generated via pybind11.

    The actual project is at https://github.com/ada-url/ada

  • SatDump

    A generic satellite data processing software.

  • sse2neon

    A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation

  • libsimdpp

    Portable header-only C++ low level SIMD library

  • simdutf

    Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js, WebKit/Safari and Bun.

    Project mention: Decoding UTF8 with Parallel Extract | news.ycombinator.com | 2024-05-05

    IIRC all of the simdutf implementations use a lot of lookup tables except for the AVX512 and RVV backends.

    Here is e.g. the NEON code: https://github.com/simdutf/simdutf/blob/1b8ca3d1072a8e2e1026...

  • FastNoise2

    Modular node graph based noise generation library using SIMD, C++17 and templates

  • eve

    Expressive Vector Engine - SIMD in C++ Goes Brrrr (by jfalcou)

  • Fastor

    A lightweight high performance tensor algebra framework for modern C++

  • rtm

    Realtime Math

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Simd projects in C++? This list will help you:

Rank Project Stars
1 ncnn 20,054
2 simdjson 19,050
3 GLM 9,072
4 highway 4,085
5 ispc 2,469
6 ozz-animation 2,402
7 xsimd 2,146
8 usearch 2,089
9 StringZilla 2,035
10 Simd 2,034
11 kfr 1,644
12 DirectXMath 1,532
13 Vc 1,444
14 fast_float 1,346
15 ada 1,315
16 SatDump 1,293
17 sse2neon 1,285
18 libsimdpp 1,216
19 simdutf 1,086
20 FastNoise2 981
21 eve 927
22 Fastor 734
23 rtm 707

Did you know that C++ is the 6th most popular programming language based on number of mentions?