-
Ok so both the things you linked are not my ideas. The first one is AFAIK originally from the BlockQuicksort paper https://arxiv.org/pdf/1604.06697.pdf section 3.2. Which Orson Peters used in his pdqsort, which I know base ipnsort on. The second one is from Igor van den Hoven in his work on quadsort as I mention in the function description. Really a lot of this stuff is building on top of other peoples work and refining it. Seemingly I'm the first to write neat little ASCII graphics for them in the code making them easier to approach. There are some novel ideas by me in there too though, for example https://github.com/Voultapher/sort-research-rs/blob/42bf339d... the way I marry a limited number of sorting-networks, insertion sort and the bi-directional merge into one, very fast but binary size and compile time efficient package is novel. Generally I'd say my work has been more about figuring out novel ways to combine existing ideas into one good cohesive package fit for a standard library sort implementation.
-
CodeRabbit
CodeRabbit: AI Code Reviews for Developers. Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.
-
Steps to build a fast, highly adaptive AVX-512 sorting algorithm:
- Clone fluxsort (https://github.com/scandum/fluxsort)
- Replace the partitioning code in flux_default_partition and flux_reverse_partition with the obvious AVX-512 version using a compare and two compress instructions
- If you're feeling ambitious, swap out the small array sorting, or incorporate crumsort's fulcrum partition for larger arrays.
I know why I haven't done this: my computer doesn't have AVX-512, and hardly anyone else's seems to. Maybe a couple Zen 4 owners. I'm less clear on why the tech giants are reinventing the wheel to make these sorting alrogithms that don't even handle pre-sorted data rather than working with some of the very high-quality open source stuff out there. Is adaptivity really considered that worthless?
Fluxsort makes this particularly simple because it gets great performance out of a stable out-of-place partition. It's a bit newer; maybe the authors weren't aware of this work. But these algorithms both use (fairly difficult) in-place partitioning code; why not slot that into the well-known pdqsort?
-
If you look at the history of the author's own sorting implementation (ipnsort), it started as an attempt to port fluxsort to Rust:
https://github.com/Voultapher/Presentations/blob/master/rust...
It starts there, eventually the author abandons the linked PR because of a better approach found with ipnsort:
https://github.com/rust-lang/rust/pull/100856#issuecomment-1...
-
If you look at the history of the author's own sorting implementation (ipnsort), it started as an attempt to port fluxsort to Rust:
https://github.com/Voultapher/Presentations/blob/master/rust...
It starts there, eventually the author abandons the linked PR because of a better approach found with ipnsort:
https://github.com/rust-lang/rust/pull/100856#issuecomment-1...
-
It depends on the size of the structs. For struct pointers you're likely better off sorting keys and pointers simultaneously. It doesn't matter much until you get to large sizes (millions), but sorting indices and then selecting with them is random access. If the original ordering is messy, selecting can be slower than the sorting step. For structs a few words long, the unit you're moving is a larger portion of a cache line, and I'd expect the data movement to drown out any SIMD advantage. A radix sort might be all right because it moves less, but I'd probably go with sorting indices as the first thing to try unless I knew the arrays were huge. For very large structs there's an interesting effort called mountain sort[0], "probably the best sorting algorithm if you need to sort actual mountains by height". Given that it minimizes number of moves it's ignoring cache entirely. I haven't benchmarked so I can't say much about how practical it is.
[0] https://github.com/Morwenn/mountain-sort
-
The vqsort README says it is a non-issue on that generation CPU based on their benchmarks: https://github.com/google/highway/tree/master/hwy/contrib/so...
-
https://github.com/scandum/quadsort/blob/f171a0b26cf6bd6f6dc...
As you can see, quadsort 1.1.4.1 used 2 instead of 4 writes in the bi-directional parity merges. This was in June 2021, and would have compiled as branchless with clang, but as branched with gcc.
When I added a compile time check to use ternary operations for clang I was not adapting your work. I was well aware that clang compiled ternary operations as branchless, but I wasn't aware that rust did as well. I added the compile time check to use ternary operations for a fair performance comparison against glidesort.
https://raw.githubusercontent.com/scandum/fluxsort/main/imag...
As for ipnsort's small sort, it is very similar to quadsort's small sort, which uses stable sorting networks, instead of unstable sorting networks. From my perspective it's not exactly novel. I didn't go for unstable sorting networks in crumsort to increase code reuse, and to not reduce adaptivity.
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
-
quadsort/fluxsort/crumsort author here.
For me there's a strong visual component, perhaps most obvious for my work on array rotation algorithms.
https://github.com/scandum/rotate
There's also the ability to notice strange/curious/discordant things, and either connect the dots through trying semi-random things, as well as sudden insights which seem to be partially subconscious.
One of my (many) theories is that I have the ability to use long-term memory in a quasi-similar manner to short-term memory for problem solving. My IQ is in the 120-130 range, I suffer from hypervigilance, so it's generally on the lower end due to lack of sleep.
I'd say there's a strong creative aspect. If I could redo life I might try my hand at music.
Related posts
-
Fluxsort: A stable quicksort, now faster than Timsort for both random and ordered data
-
GitHub - scandum/fluxsort: A branchless stable quicksort / mergesort hybrid.
-
Reinforcement learned branchless sorting functions for sort3, sort4 and sort5 were landed in LLVM
-
Need a Quick Favor: Help a Student with a Simple Click!
-
Doubt regarding counting sort