-
Then you'll want to look at https://mhdm.dev/posts/sb_lower_bound/#prefetching
100mb is large enough that the branchy version turns out to have a slight advantage, more due to quirks of x86 (speculative execution) rather than being better.
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
-
zig
General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
The Zig stdlib does not call out to C++ for binary search. The binary search is currently here: https://github.com/ziglang/zig/blob/master/lib/std/sort.zig#...
-
Integer overflows cause panics in both debug and release mode.
See line: https://github.com/rust-lang/rust/blob/4d7a80d48697171ed151c...
Panics require bounds checking, but in this case I can imagine the compiler can rewrite to something like ```(left/2) + (right/2) + (right & left & 1)```, ensuring that it will never overflow. This statement would be an obvious target for improvement.
-
Nim
Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
> If only there was a clean fast bare-metal language to write all this in..
The author includes a footnotes for "BUT RUST.." and "BUT ZIG..", but how about Nim? Looks like there is a native library implementation of `lowerBound` https://github.com/nim-lang/Nim/blob/version-2-0/lib/pure/al...
-
tigerbeetle
The financial transactions database designed for mission critical safety and performance.
-
I took a stab at the same problem a while ago. Since the upper bound of iterations is based on the input length, if you write your search in a way that extra iterations don't change the result, you can use a switch fallthrough to "unroll" the loop and not have to branch.
https://github.com/ehrmann/branchless-binary-search/blob/mas...
-
Other fast binary searches https://github.com/sslotin/amh-code/tree/main/binsearch
-
> In this case std::lower_bound is very slightly but consistently faster than sb_lower_bound. To always get the best performance it is possible for libraries to use sb_lower_bound whenever directly working on primitive types and std::lower_bound otherwise.
I will say that if this is the case, there are probably much better versions of binary search out there for primitive types. I made one just screwing around with SIMD that's 3x faster than std::lower_bound until becoming memory bound:
https://github.com/matthewkolbe/ThinkingInSimd/tree/main/alg...