| | simde | streamvbyte |
|---|---|---|
| Mentions | 7 | 2 |
| Stars | 2,175 | 357 |
| Growth | 1.7% | - |
| Activity | 9.1 | 5.5 |
| Last commit | 4 days ago | about 1 month ago |
| Language | C | C |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
simde
- The Case of the Missing SIMD Code
I was curious about these libraries a few weeks ago and did some searching. Is there one that's got a clearly dominating set of users or contributors?
I don't know what a good way to compare these might be, other than perhaps activity/contributor count.
[1] https://github.com/simd-everywhere/simde
[2] https://github.com/ermig1979/Simd
[3] https://github.com/google/highway
[4] https://gitlab.com/libeigen/eigen
[5] https://github.com/shibatch/sleef
- Rise: Accelerate the Development of Open Source Software for RISC-V
I note that SIMDe doesn't have RISC-V support yet (but it does support Loongson LoongArch):
https://github.com/simd-everywhere/simde/
There are still a ton of things to do to get the Debian riscv64 port going too:
https://wiki.debian.org/PortsDocs/New
- SIMD intrinsics and the possibility of a standard library solution
- Portable SIMD library
SIMDe is everything you're after: https://github.com/simd-everywhere/simde
- SIMD Everywhere – SIMD intrinsics on hardware which doesn't support them
- Making Your Own Tools
> low level code that can run on multiple hardware architectures
I thought SIMD Everywhere was a pretty interesting project for that. It lets you write x86 SSE/AVX code and run it on non-x86 architectures:
https://github.com/simd-everywhere/simde
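To make the idea concrete, here is a minimal conceptual sketch in plain C of how an x86 intrinsic can be given a portable fallback. It is not SIMDe's actual source: the type `vec4u` and the function `vec4u_add` are hypothetical names chosen for illustration, standing in for a 128-bit register and for the lane-wise 32-bit add that `_mm_add_epi32` performs. SIMDe itself exposes such operations under a `simde_` prefix (for example `simde_mm_add_epi32`) and picks a native or portable implementation per target.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit SIMD register holding four 32-bit lanes. */
typedef struct {
    uint32_t lane[4];
} vec4u;

/* Portable emulation of a lane-wise 32-bit add (the semantics of
 * _mm_add_epi32): each lane is added independently and wraps modulo 2^32.
 * Unsigned arithmetic is used so that the wraparound is well defined in C. */
vec4u vec4u_add(vec4u a, vec4u b) {
    vec4u r;
    for (int i = 0; i < 4; i++) {
        r.lane[i] = a.lane[i] + b.lane[i];
    }
    return r;
}
```

On a target with native vector instructions, a library in this style would swap the loop for the matching hardware intrinsic; the loop is the fallback that keeps the code compiling everywhere.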
- Adobe Photoshop Ships on Macs Apple Silicon/M1 – 50% Faster
> architecture-specific features such as SSE/AVX which is not portable.
I don’t have hands-on experience, but somewhere on HN I saw this: https://github.com/simd-everywhere/simde If starting a new cross-platform project today, I would try that library first, before doing the usual intrinsics.
streamvbyte
- XZ: A Microcosm of the interactions in Open Source projects
Be direct and put the onus on the reporter/contributor to do more work before you will engage.
e.g., here is Daniel Lemire responding to a very open-ended bug report: https://github.com/lemire/streamvbyte/issues/72
There is something similar in customer service for my SaaS. Customers give horribly vague bug reports. I used to try to divine what they wanted. That way leads to burnout. Instead, make them do more of the work.
- Compress-a-Palooza: Unpacking 5 Billion Varints in only 4 Billion CPU Cycles
You're right, I used a lot of unsafe. I started with the implementation from the C source and then my main goal was to add a bounds-check without sacrificing performance. I got there by manually unrolling the inner loop a few times and then bounds checking only once per iteration of the outer loop. So instead of 1 bounds check for every 4 inputs, I have one every 16 or 32 inputs (with a correspondingly more conservative bounds check).
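The amortized bounds check described in that comment can be sketched in scalar C. This is an illustration under stated assumptions, not the commenter's Rust code or the streamvbyte library's implementation: it assumes a Stream VByte style layout where each control byte holds four 2-bit codes (lowest bits first) and each code plus one gives the byte length of a little-endian value. The names `read_le` and `decode_batched` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian value of `len` bytes (1..4). */
static uint32_t read_le(const uint8_t *p, unsigned len) {
    uint32_t v = 0;
    for (unsigned i = 0; i < len; i++) {
        v |= (uint32_t)p[i] << (8 * i);
    }
    return v;
}

/* Decode `count` integers. `ctrl` must hold ceil(count / 4) control bytes.
 * Returns 0 on success, -1 if `data` is too short. */
int decode_batched(const uint8_t *ctrl, const uint8_t *data, size_t data_len,
                   uint32_t *out, size_t count) {
    size_t pos = 0; /* offset into data; pos <= data_len always holds */
    size_t i = 0;   /* integers decoded so far */

    /* Fast path: 16 integers per outer iteration. They occupy at most
     * 16 * 4 = 64 data bytes, so one conservative check covers all of them. */
    while (count - i >= 16 && data_len - pos >= 64) {
        for (unsigned k = 0; k < 16; k++) {
            size_t n = i + k;
            unsigned len = ((ctrl[n / 4] >> (2 * (n % 4))) & 3u) + 1;
            out[n] = read_le(data + pos, len);
            pos += len;
        }
        i += 16;
    }

    /* Slow path for the tail: check every value individually. */
    for (; i < count; i++) {
        unsigned len = ((ctrl[i / 4] >> (2 * (i % 4))) & 3u) + 1;
        if (data_len - pos < len) {
            return -1;
        }
        out[i] = read_le(data + pos, len);
        pos += len;
    }
    return 0;
}
```

The fast-path check is deliberately pessimistic: it can reject a batch that would in fact fit, in which case the tail loop decodes it with per-value checks. That trade keeps the hot loop free of branches on input length while never reading past `data_len`.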
What are some alternatives?
nsimd - Agenium Scale vectorization library for CPUs and GPUs
sleef - SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
sse2neon - A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation
Turbo-Base64 - Turbo Base64 - Fastest Base64 SIMD:SSE/AVX2/AVX512/Neon/Altivec - Faster than memcpy!
android-inline-hook - ShadowHook is an Android inline hook library which supports thumb, arm32 and arm64.
LittleIntPacker - C library to pack and unpack short arrays of integers as fast as possible
libsimdpp - Portable header-only C++ low level SIMD library
TurboPFor - Fastest Integer Compression
Sparkle - A software update framework for macOS
picoRTOS - Very small, lightning fast, yet portable RTOS with SMP support
simdutf - Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js and Bun.
darktable - darktable is an open source photography workflow application and raw developer