libsimdpp VS sse-popcount

Compare libsimdpp and sse-popcount to see how they differ.

libsimdpp

Portable header-only C++ low level SIMD library (by p12tic)

sse-popcount

SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html (by WojciechMula)
              libsimdpp                    sse-popcount
Mentions      1                            2
Stars         1,189                        311
Growth        -                            -
Activity      0.0                          5.6
Last commit   4 months ago                 29 days ago
Language      C++                          C++
License       Boost Software License 1.0   BSD 2-clause "Simplified" License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

libsimdpp

Posts with mentions or reviews of libsimdpp. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-04-04.

sse-popcount

Posts with mentions or reviews of sse-popcount. We have used some of these posts to build our list of alternatives and similar projects.
  • Fast bitset decoding using Intel AVX-512
    1 project | news.ycombinator.com | 11 May 2022
    https://developer.arm.com/documentation/ddi0596/2020-12/SIMD...

    I believe it does 128 bits per instruction, but I'm still struggling with rust w/ asm.

    Along my journeys, however, I found this repo https://github.com/WojciechMula/sse-popcount/ which has tons of competing simd implementations for both intel and arm.

  • Counting set bits in an interesting way
    1 project | news.ycombinator.com | 30 Apr 2022
    The builtin POPCNT that came with Intel's SSE4 (SSE4a for AMD) is much faster. However, at a certain point, using AVX2 (and AVX-512 if present) is actually faster yet [1] - at least for 512 byte inputs or larger.

    [1]: https://github.com/WojciechMula/sse-popcount
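    For readers unfamiliar with the operation being compared in these comments, here is a minimal portable sketch of a population count over a byte buffer. It is not code from the sse-popcount repository. The function names (popcount_naive, popcount_swar, popcount_buffer) are made up for illustration, and the bit-parallel (SWAR) routine stands in as a scalar baseline for the hardware POPCNT instruction and the SSE/AVX2 variants mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reference version: clear the lowest set bit until the word is zero.
// Runs one loop iteration per set bit.
inline unsigned popcount_naive(std::uint64_t x) {
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;  // drops the lowest set bit
        ++n;
    }
    return n;
}

// Bit-parallel (SWAR) version: sum bits in 2-bit, 4-bit, then 8-bit fields,
// and add the eight byte sums together with a single multiplication.
inline unsigned popcount_swar(std::uint64_t x) {
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return static_cast<unsigned>((x * 0x0101010101010101ULL) >> 56);
}

// Count the set bits in an arbitrary byte buffer (hypothetical helper).
// Full 8-byte words are loaded with memcpy to avoid alignment problems;
// any remaining tail bytes are counted one at a time.
inline std::uint64_t popcount_buffer(const unsigned char* data, std::size_t len) {
    std::uint64_t total = 0;
    std::size_t i = 0;
    for (; i + 8 <= len; i += 8) {
        std::uint64_t word;
        std::memcpy(&word, data + i, sizeof word);
        total += popcount_swar(word);
    }
    for (; i < len; ++i) {
        total += popcount_swar(data[i]);
    }
    return total;
}
```

    On compilers that target a CPU with POPCNT, the inner call could be swapped for C++20 std::popcount or a compiler builtin such as __builtin_popcountll. The vectorized approaches discussed in the comment above process many words per instruction instead of one.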

What are some alternatives?

When comparing libsimdpp and sse-popcount you can also consider the following projects:

xsimd - C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)

toys - Storage for my snippets, toy programs, etc.

simde - Implementations of SIMD instruction sets for systems which don't natively support them.

highway - Performance-portable, length-agnostic SIMD with runtime dispatch

VectorizedKernel - Running GPGPU-like kernels on CPU with auto-vectorization for SSE/AVX/AVX512 SIMD Architectures

Vc - SIMD Vector Classes for C++

simdjson - Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

std-simd - std::experimental::simd for GCC [ISO/IEC TS 19570:2018]

oneDNN - oneAPI Deep Neural Network Library (oneDNN)

pure_simd - A simple, extensible, portable, efficient and header-only SIMD library!

Simd - C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for x86/x64, VMX (Altivec) and VSX (Power7) for PowerPC, NEON for ARM.