despacer
simde
Metric | despacer | simde
---|---|---
Mentions | 2 | 7
Stars | 147 | 2,171
Growth | - | 1.5%
Activity | 5.6 | 9.1
Last commit | 5 months ago | 9 days ago
Language | C | C
License | BSD 3-clause "New" or "Revised" License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
despacer
- Removing characters from strings faster with AVX-512
Cool performance enhancement, with an accompanying implementation in a real-world library (https://github.com/lemire/despacer).
Still, what does it signal that vector extensions are required to get better string performance on x86? Wouldn't it be better if Intel invested their AVX transistor budget into simply making existing REPB prefixes a lot faster?
- Intel Nukes Alder Lake's AVX-512 Support, Now Fuses It Off in Silicon
If you're looking for an example, perhaps the despacer problem might be one which doesn't get too complex. Do you know of a way to implement it on a GPU such that it'd run better than (or at least as well as) a CPU SIMD implementation would?
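For context, the "despacer problem" discussed in these mentions is removing whitespace bytes from a string. A scalar baseline for it might look like the sketch below. The name `despace_scalar` is hypothetical and is not claimed to be the library's API, and the set of removed characters (space, line feed, carriage return) is an assumption made for illustration. SIMD versions aim to do the same job on many bytes per instruction.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the despacer library.
 * Removes ' ', '\n' and '\r' from bytes[0..len) in place and
 * returns the number of bytes kept. The caller is responsible
 * for re-terminating the string if it needs a C string.
 */
static size_t despace_scalar(char *bytes, size_t len) {
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        char c = bytes[i];
        /* Always store, then advance the write position only when the
           byte is kept. out never exceeds i, so unread input is never
           overwritten. */
        bytes[out] = c;
        out += (size_t)(c != ' ' && c != '\n' && c != '\r');
    }
    return out;
}
```

Calling it on the buffer `"a b\r\nc  d"` returns 4, and after writing a terminator at index 4 the buffer holds `"abcd"`. The unconditional store followed by a conditional increment avoids a data-dependent branch in the loop body, which is one common way to write such a baseline.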
simde
- The Case of the Missing SIMD Code
I was curious about these libraries a few weeks ago and did some searching. Is there one that's got a clearly dominating set of users or contributors?
I don't know what a good way to compare these might be, other than perhaps activity/contributor count.
[1] https://github.com/simd-everywhere/simde
[2] https://github.com/ermig1979/Simd
[3] https://github.com/google/highway
[4] https://gitlab.com/libeigen/eigen
[5] https://github.com/shibatch/sleef
- Rise: Accelerate the Development of Open Source Software for RISC-V
I note that SIMDe doesn't have RISC-V support yet (but it does support Loongson LoongArch):
https://github.com/simd-everywhere/simde/
There are still a ton of things to do to get the Debian riscv64 port going too:
https://wiki.debian.org/PortsDocs/New
- SIMD intrinsics and the possibility of a standard library solution
- Portable SIMD library
SIMDe is everything you're after: https://github.com/simd-everywhere/simde
- SIMD Everywhere – SIMD intrinsics on hardware which doesn't support them
- Making Your Own Tools
> low level code that can run on multiple hardware architectures
I thought SIMD Everywhere was a pretty interesting project for that. It lets you write x86 SSE/AVX code and run it on non-x86 architectures:
https://github.com/simd-everywhere/simde
- Adobe Photoshop Ships on Macs Apple Silicon/M1 – 50% Faster
> architecture-specific features such as SSE/AVX which is not portable.
I don’t have hands-on experience, but somewhere on HN I saw this: https://github.com/simd-everywhere/simde If starting a new cross-platform project today, I would try that library first, before doing the usual intrinsics.
What are some alternatives?
mixbench - A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP)
nsimd - Agenium Scale vectorization library for CPUs and GPUs
rust - Empowering everyone to build reliable and efficient software.
sse2neon - A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation
cglm - 📽 Highly Optimized 2D / 3D Graphics Math (glm) for C
android-inline-hook - :fire: ShadowHook is an Android inline hook library which supports thumb, arm32 and arm64.
libsimdpp - Portable header-only C++ low level SIMD library
Sparkle - A software update framework for macOS
picoRTOS - Very small, lightning fast, yet portable RTOS with SMP support
simdutf - Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js and Bun.
darktable - darktable is an open source photography workflow application and raw developer