Avx2

Open-source projects categorized as Avx2

Top 23 Avx2 Open-Source Projects

  • simdjson

    Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

    Project mention: 1BRC Merykitty's Magic SWAR: 8 Lines of Code Explained in 3k Words | news.ycombinator.com | 2024-03-09

  • highway

    Performance-portable, length-agnostic SIMD with runtime dispatch

    Project mention: JPEG XL and the Pareto Front | news.ycombinator.com | 2024-03-01

    [0] for those interested in Highway.

    It's also mentioned in [1], which starts off

    > Today we're sharing open source code that can sort arrays of numbers about ten times as fast as the C++ std::sort, and outperforms state of the art architecture-specific algorithms, while being portable across all modern CPU architectures. Below we discuss how we achieved this.

    [0] https://github.com/google/highway

    [1] https://opensource.googleblog.com/2022/06/Vectorized%20and%2..., which has an associated paper at https://arxiv.org/pdf/2205.05982.pdf.

  • CTranslate2

    Fast inference engine for Transformer models

    Project mention: Distil-Whisper: distilled version of Whisper that is 6 times faster, 49% smaller | news.ycombinator.com | 2023-10-31

    Just a point of clarification - faster-whisper references it but ctranslate2[0] is what's really doing the magic here.

    Ctranslate2 is a sleeper powerhouse project that enables a lot. They should be up front and center and get the credit they deserve.

    [0] - https://github.com/OpenNMT/CTranslate2

  • simde

    Implementations of SIMD instruction sets for systems which don't natively support them.

    Project mention: The Case of the Missing SIMD Code | news.ycombinator.com | 2023-06-08

    I was curious about these libraries a few weeks ago and did some searching. Is there one that's got a clearly dominating set of users or contributors?

    I don't know what a good way to compare these might be, other than perhaps activity/contributor count.

    [1] https://github.com/simd-everywhere/simde

    [2] https://github.com/ermig1979/Simd

    [3] https://github.com/google/highway

    [4] https://gitlab.com/libeigen/eigen

    [5] https://github.com/shibatch/sleef

  • StringZilla

    Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging SWAR and SIMD on Arm Neon and x86 AVX2 & AVX-512-capable chips to accelerate search, sort, edit distances, alignment scores, etc 🦖

    Project mention: Measuring energy usage: regular code vs. SIMD code | news.ycombinator.com | 2024-02-19

    The 3.5x energy-efficiency gap between serial and SIMD code becomes even larger when

    A. you do byte-level processing instead of float words;

    B. you use embedded, IoT, and other low-energy devices.

    A few years ago I compared Nvidia Jetson Xavier (long before the Orin release), an Intel-based MacBook Pro with Core i9, and AVX-512-capable CPUs on substring search benchmarks.

    On Xavier one can quite easily disable/enable cores and reconfigure power usage. At peak I got to 4.2 GB/J, which was an 8.3x improvement in efficiency over LibC in substring search operations. The comparison table is still available in the older README: https://github.com/ashvardanian/StringZilla/tree/v2.0.2?tab=...
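
    For readers unfamiliar with SWAR (SIMD within a register), which the StringZilla description mentions alongside SIMD, here is a minimal sketch of the classic "find a byte in a 64-bit word" bit trick. It is an illustration only and not StringZilla's actual code; the function names has_zero_byte and contains_byte are hypothetical, and Python integers stand in for a 64-bit register.

```python
# Illustrative SWAR sketch (assumed names, not taken from any listed project).
LOW = 0x0101010101010101    # 0x01 repeated in every byte lane
HIGH = 0x8080808080808080   # 0x80 repeated in every byte lane
MASK64 = 0xFFFFFFFFFFFFFFFF  # emulate 64-bit wraparound in Python


def has_zero_byte(word: int) -> bool:
    """Return True if any of the 8 bytes in a 64-bit word is 0x00."""
    # Subtracting 1 from each lane sets the lane's high bit only when the
    # lane was 0x00 or already >= 0x81; "& ~word" rules out the second case.
    borrowed = (word - LOW) & MASK64
    return (borrowed & (~word & MASK64) & HIGH) != 0


def contains_byte(chunk: bytes, needle: int) -> bool:
    """Check 8 bytes at once for the value `needle` (0..255)."""
    if len(chunk) != 8:
        raise ValueError("chunk must be exactly 8 bytes")
    word = int.from_bytes(chunk, "little")
    # XOR with the broadcast needle turns every matching lane into 0x00.
    return has_zero_byte(word ^ (needle * LOW))


if __name__ == "__main__":
    print(contains_byte(b"simdjson", ord("j")))
    print(contains_byte(b"simdjson", ord("z")))
```

    A real implementation would do this in C or assembly on machine words; the point of the sketch is only that one arithmetic expression tests eight bytes at a time, which is the kind of byte-level processing the comment above refers to.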

  • DirectXMath

    DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps

    Project mention: Vector math library benchmarks (C++) | /r/GraphicsProgramming | 2023-04-15

    For those unfamiliar, like I was, DXM is DirectXMath.

  • CRoaring

    Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks

  • Vc

    SIMD Vector Classes for C++

  • libsimdpp

    Portable header-only C++ low level SIMD library

  • simdutf

    Unicode routines (UTF8, UTF16, UTF32): billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js and Bun.

    Project mention: Vectorizing Unicode conversions on real RISC-V hardware | news.ycombinator.com | 2024-01-27

    The project was mostly inspired by simdutf [0] which has been around for a couple of years already, and I don't think iconv has any of its vectorized implementations for other architectures.

    [0] https://github.com/simdutf/simdutf

  • highwayhash

    Native Go version of HighwayHash with optimized assembly implementations on Intel and ARM. Able to process over 10 GB/sec on a single core on Intel CPUs - https://en.wikipedia.org/wiki/HighwayHash (by minio)

    Project mention: Can I concatenate multiple non-crypto hash functions to reduce collision? | /r/golang | 2023-05-16

    highwayhash (alt) provides 256 bits. Fast mainly for larger inputs and supports seeds.

  • eve

    Expressive Vector Engine - SIMD in C++ Goes Brrrr (by jfalcou)

    Project mention: Lack of modern scientific libraries written in C | /r/C_Programming | 2023-04-06

    C++ offers tools for writing better APIs, and since the addition of concepts in C++20 it offers much better API enforcement. Writing an equivalent to libraries such as {fmt} or EVE is not possible in anything we’d call C.

  • x86-simd-sort

    C++ template library for high performance SIMD based sorting algorithms

    Project mention: SIMD based custom object and key-value pair sorting in C++ | news.ycombinator.com | 2024-02-14

  • TurboPFor

    Fastest Integer Compression

    Project mention: Show HN: Time Series Benchmark TurboPFor,TurboFloat,TurboFloat LzX,TurboGorilla | news.ycombinator.com | 2023-06-25

  • SimSIMD

    Up to 200x Faster Inner Products and Vector Similarity — for Python, JavaScript, Rust, and C, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & AVX-512 and Arm NEON & SVE 📐

    Project mention: Deep Learning in JavaScript | news.ycombinator.com | 2024-03-28

  • simdutf8

    SIMD-accelerated UTF-8 validation for Rust.

  • fastbase64

    SIMD-accelerated base64 codecs

    Project mention: Designing a SIMD Algorithm from Scratch | news.ycombinator.com | 2023-11-28

    How does this compare to fastbase64[0]? Great article, I'm happy to see this sort of thing online. I wish I could share the author's optimism about portable SIMD libraries.

    [0]: https://github.com/lemire/fastbase64

  • Thorium-Win-AVX2

    Repo to serve AVX2 Windows builds of Thorium. https://github.com/Alex313031/Thorium/

    Project mention: Thorium – Radioactive Chromium Fork | news.ycombinator.com | 2024-01-03

    FYI a number of streaming sites won't work - while this has Widevine, it does not have Verified Media Path (VMP) which verifies that you're running a signed binary. https://github.com/Alex313031/Thorium-Win-AVX2/issues/84#iss...

    https://github.com/castlabs/electron-releases is an interesting Electron fork with full Widevine+VMP support - but it's very much closed-source.

  • nsimd

    Agenium Scale vectorization library for CPUs and GPUs

  • toys

    Storage for my snippets, toy programs, etc.

    Project mention: Modern Perfect Hashing for Strings | news.ycombinator.com | 2023-04-30

    I think all of these techniques check whether the input string is correct. For example see here https://github.com/WojciechMula/toys/blob/master/lookup-in-s...

  • sse-popcount

    SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html

  • TurboRLE

    TurboRLE-Fastest Run Length Encoding

  • Turbo-Base64

    Turbo Base64 - Fastest Base64 SIMD:SSE/AVX2/AVX512/Neon/Altivec - Faster than memcpy!

    Project mention: Show HN: The fastest Turbo-Base64 now for Python | news.ycombinator.com | 2023-08-24

    ** Cython bindings for Turbo Base64 [1] **

    - 20-30x faster than the standard library

    - Benchmarks faster than any other C base64 library

    - Fastest implementation of AVX, AVX2, and AVX512 base64 encoding

    - No other dependencies

    [1] - https://github.com/powturbo/Turbo-Base64

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-03-28.

Index

What are some of the best open-source Avx2 projects? This list will help you:

Rank Project Stars
1 simdjson 18,275
2 highway 3,559
3 CTranslate2 2,667
4 simde 2,127
5 StringZilla 1,660
6 DirectXMath 1,477
7 CRoaring 1,425
8 Vc 1,405
9 libsimdpp 1,180
10 simdutf 910
11 highwayhash 852
12 eve 833
13 x86-simd-sort 790
14 TurboPFor 736
15 SimSIMD 671
16 simdutf8 505
17 fastbase64 416
18 Thorium-Win-AVX2 347
19 nsimd 310
20 toys 308
21 sse-popcount 303
22 TurboRLE 275
23 Turbo-Base64 248