sb_lower_bound
ThinkingInSimd
sb_lower_bound | ThinkingInSimd | |
---|---|---|
8 | 1 | |
14 | 3 | |
- | - | |
3.9 | 2.3 | |
10 months ago | about 1 year ago | |
C++ | C++ | |
- | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
sb_lower_bound
-
Fastest Branchless Binary Search
Then you'll want to look at https://mhdm.dev/posts/sb_lower_bound/#prefetching
100mb is large enough that the branchy version turns out to have a slight advantage, more due to quirks of x86 (speculative execution) rather than being better.
"very similar topic" is an understatement. Funnily enough the "implementation to perform the best on Apple M1 after all micro-optimizations are applied" in the Conclusion is equivalent in terms of the how many actual comparisons are made as with sb_lower_bound. Out of curiosity I've benchmarked the two and orlp lower_bound seems to perform slightly worse: ~39ns average (using gcc) vs ~33ns average of sb_lower_bound (using clang -cmov). I'm comparing best runs for both, usual disclaimer of tested on my machine.
ThinkingInSimd
-
Fastest Branchless Binary Search
> In this case std::lower_bound is very slightly but consistently faster than sb_lower_bound. To always get the best performance it is possible for libraries to use sb_lower_bound whenever directly working on primitive types and std::lower_bound otherwise.
I will say that if this is the case, there are probably much better versions of binary search out there for primitive types. I made one just screwing around with SIMD that's 3x faster than std::lower_bound until becoming memory bound:
https://github.com/matthewkolbe/ThinkingInSimd/tree/main/alg...
What are some alternatives?
tigerbeetle - The distributed financial transactions database designed for mission critical safety and performance.
rust - Empowering everyone to build reliable and efficient software.
zig - General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
amh-code - Complete implementations from "Algorithms for Modern Hardware"
optimization-manual - Contains the source code examples described in the "IntelĀ® 64 and IA-32 Architectures Optimization Reference Manual"
Nim - Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
branchless-binary-search - Binary search implementation that avoids branch instructions