C++ Avx512

Open-source C++ projects categorized as Avx512
Topics: Simd Avx2 Avx Neon Sse

Top 17 C++ Avx512 Projects

  • simdjson

    Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

  • Project mention: Tips on adding JSON output to your command line utility. (2021) | news.ycombinator.com | 2024-04-20

    It's also supported by simdjson [0] (which has a lot of language bindings [1]):

    > Multithreaded processing of gigantic Newline-Delimited JSON (ndjson) and related formats at 3.5 GB/s

    [0] https://simdjson.org/

    [1] https://github.com/simdjson/simdjson?tab=readme-ov-file#bind...

  • highway

    Performance-portable, length-agnostic SIMD with runtime dispatch

  • Project mention: Llamafile 0.7 Brings AVX-512 Support: 10x Faster Prompt Eval Times for AMD Zen 4 | news.ycombinator.com | 2024-03-31

    The bf16 dot instruction replaces 6 instructions: https://github.com/google/highway/blob/master/hwy/ops/x86_12...

  • oneDNN

    oneAPI Deep Neural Network Library (oneDNN)

  • Project mention: Blaze: A High Performance C++ Math library | news.ycombinator.com | 2024-04-17

    If you are talking about non-small matrix multiplication in MKL, is now in opensource as a part of oneDNN. It literally has exactly the same code, as in MKL (you can see this by inspecting constants or doing high-precision benchmarks).

    For small matmul there is libxsmm. It may take tremendous efforts make something faster than oneDNN and libxsmm, as jit-based approach of https://github.com/oneapi-src/oneDNN/blob/main/src/gpu/jit/g... is too flexible: if someone finds a better sequence, oneDNN can reuse it without major change of design.

    But MKL is not limited to matmul, I understand it...

  • xsimd

    C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)

  • Project mention: GDlog: A GPU-Accelerated Deductive Engine | news.ycombinator.com | 2023-12-03

    https://github.com/xtensor-stack/xsimd

    GH topics > HashMap:

  • Simd

    C++ image processing and machine learning library with using of SIMD: SSE, AVX, AVX-512, AMX for x86/x64, VMX(Altivec) and VSX(Power7) for PowerPC, NEON for ARM. (by ermig1979)

  • Project mention: The Case of the Missing SIMD Code | news.ycombinator.com | 2023-06-08

    I was curious about these libraries a few weeks ago and did some searching. Is there one that's got a clearly dominating set of users or contributors?

    I don't know what a good way to compare these might be, other than perhaps activity/contributor count.

    [1] https://github.com/simd-everywhere/simde

    [2] https://github.com/ermig1979/Simd

    [3] https://github.com/google/highway

    [4] https://gitlab.com/libeigen/eigen

    [5] https://github.com/shibatch/sleef

  • StringZilla

    Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging SWAR and SIMD on Arm Neon and x86 AVX2 & AVX-512-capable chips to accelerate search, sort, edit distances, alignment scores, etc 🦖

  • Project mention: Measuring energy usage: regular code vs. SIMD code | news.ycombinator.com | 2024-02-19

    The 3.5x energy-efficiency gap between serial and SIMD code becomes even larger when

    A. you do byte-level processing instead of float words;

    B. you use embedded, IoT, and other low-energy devices.

    A few years ago I've compared Nvidia Jetson Xavier (long before the Orin release), Intel-based MacBook Pro with Core i9, and AVX-512 capable CPUs on substring search benchmarks.

    On Xavier one can quite easily disable/enable cores and reconfigure power usage. At peak I got to 4.2 GB/J which was an 8.3x improvement in efficiency over LibC in substring search operations. The comparison table is still available in the older README: https://github.com/ashvardanian/StringZilla/tree/v2.0.2?tab=...

  • kfr

    Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)

  • Vc

    SIMD Vector Classes for C++

  • libsimdpp

    Portable header-only C++ low level SIMD library

  • x86-simd-sort

    C++ template library for high performance SIMD based sorting algorithms

  • Project mention: SIMD based custom object and key-value pair sorting in C++ | news.ycombinator.com | 2024-02-14
  • std-simd

    std::experimental::simd for GCC [ISO/IEC TS 19570:2018]

  • Project mention: A proposal for the next version of C [pdf] | news.ycombinator.com | 2024-01-20

    neither proposing nor taking a position on this possible addition)

    > ... For completeness we would also like to add that a serious issue is that C still lacks vector operations.

    Those are good points. The authors don't take a stance on it, but I do think that syntax for packed structs should be standardized. IMO, so should syntax for inline assembly (both as optional features). These are already common extensions; this is exactly what they should standardize. The additions of "typeof" and #embed are also good examples of this (they had been talking about adding #embed since 1995 [1]).

    As for vector instructions, I'm unsure how it could be implemented in a standard way, but I'm not against it. Maybe something like this [2], but with the syntax changed for C instead of C++.

    [1]: https://groups.google.com/g/comp.std.c/c/zWFEXDvyTwM

    [2]: https://github.com/VcDevel/std-simd

  • toys

    Storage for my snippets, toy programs, etc.

  • Project mention: Modern Perfect Hashing for Strings | news.ycombinator.com | 2023-04-30

    I think all of these techniques check whether the input string is correct. For example see here https://github.com/WojciechMula/toys/blob/master/lookup-in-s...

  • sse-popcount

    SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html

  • md5-optimisation

    The fastest MD5 implementation using x86 assembly

  • Project mention: The least interesting part about AVX-512 is the 512 bits vector width | news.ycombinator.com | 2023-06-19

    Very useful. In fact, it speeds up a single instance (i.e. not taking advantage of SIMD) of MD5 by 20%: https://github.com/animetosho/md5-optimisation#x86-avx512-vl...

  • std_find_simd

    std::find simd version

  • VectorizedKernel

    Running GPGPU-like kernels on CPU with auto-vectorization for SSE/AVX/AVX512 SIMD Architectures

  • ThinkingInSimd

    An essay comparing performance implications of ignoring AVX acceleration

  • Project mention: Fastest Branchless Binary Search | news.ycombinator.com | 2023-08-11

    > In this case std::lower_bound is very slightly but consistently faster than sb_lower_bound. To always get the best performance it is possible for libraries to use sb_lower_bound whenever directly working on primitive types and std::lower_bound otherwise.

    I will say that if this is the case, there are probably much better versions of binary search out there for primitive types. I made one just screwing around with SIMD that's 3x faster than std::lower_bound until becoming memory bound:

    https://github.com/matthewkolbe/ThinkingInSimd/tree/main/alg...
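For readers unfamiliar with the idea discussed in the comment above, here is a minimal scalar sketch of a branchless lower-bound search. It is not the code from the linked repository or from the article being discussed; the function name `branchless_lower_bound` and its signature are hypothetical, chosen only for illustration. The sketch assumes the input range is sorted with respect to the comparator, and it relies on the compiler turning the ternary into a conditional move, which is likely for primitive types but not guaranteed.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical helper (illustrative name, not taken from any library listed here).
// Returns the index of the first element in sorted[0..n) that is not less than
// `value`, i.e. the same position std::lower_bound would report.
template <typename T, typename Compare = std::less<T>>
std::size_t branchless_lower_bound(const T* sorted, std::size_t n,
                                   const T& value, Compare less = {}) {
    const T* base = sorted;
    std::size_t len = n;
    // The loop trip count depends only on n, not on the data, so the only
    // data-dependent step is the pointer update below.
    while (len > 1) {
        const std::size_t half = len / 2;
        // Written as a select rather than an if/else so the compiler can
        // emit a conditional move instead of a hard-to-predict branch.
        base += less(base[half - 1], value) ? half : 0;
        len -= half;
    }
    // At most one candidate element remains; step past it if it is too small.
    if (len == 1 && less(*base, value)) {
        ++base;
    }
    return static_cast<std::size_t>(base - sorted);
}
```

For example, on the sorted array {1, 3, 3, 5, 8} a search for 3 returns index 1 and a search for 9 returns 5 (one past the end). Whether this beats `std::lower_bound` depends on the compiler, element type, and array size, so it should be benchmarked rather than assumed.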

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Avx512 projects in C++? This list will help you:

Rank Project Stars
1 simdjson 18,362
2 highway 3,623
3 oneDNN 3,456
4 xsimd 2,036
5 Simd 1,971
6 StringZilla 1,776
7 kfr 1,582
8 Vc 1,417
9 libsimdpp 1,188
10 x86-simd-sort 794
11 std-simd 544
12 toys 311
13 sse-popcount 309
14 md5-optimisation 96
15 std_find_simd 18
16 VectorizedKernel 7
17 ThinkingInSimd 3
