quadsort
rotate
Metric | quadsort | rotate
---|---|---
Mentions | 9 | 4
Stars | 2,106 | 143
Growth | - | -
Activity | 4.6 | 10.0
Last commit | 6 months ago | over 1 year ago
Language | C | C
License | The Unlicense | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
quadsort
- 10~17x faster than what? A performance analysis of Intel x86-SIMD-sort (AVX-512)
https://github.com/scandum/quadsort/blob/f171a0b26cf6bd6f6dc...
As you can see, quadsort 1.1.4.1 used 2 instead of 4 writes in the bi-directional parity merges. This was in June 2021, and would have compiled as branchless with clang, but as branched with gcc.
When I added a compile-time check to use ternary operations for clang I was not adapting your work. I was well aware that clang compiled ternary operations as branchless, but I wasn't aware that Rust did as well. I added the compile-time check to use ternary operations for a fair performance comparison against glidesort.
https://raw.githubusercontent.com/scandum/fluxsort/main/imag...
As for ipnsort's small sort, it is very similar to quadsort's small sort, which uses stable sorting networks, instead of unstable sorting networks. From my perspective it's not exactly novel. I didn't go for unstable sorting networks in crumsort to increase code reuse, and to not reduce adaptivity.
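The point about ternary operations compiling as branchless can be illustrated with a minimal sketch. This is not quadsort's code: `merge_ternary` is a hypothetical helper, and whether the select actually becomes a conditional move depends on the compiler and flags, as the comment above notes for clang versus gcc.

```c
#include <stddef.h>

/* Merge two sorted int runs, a[0..na) and b[0..nb), into out.
 * Hypothetical helper for illustration only; not taken from quadsort. */
void merge_ternary(const int *a, size_t na,
                   const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;

    while (i < na && j < nb) {
        /* take_b is 0 or 1; ties go to the left run, so the merge is stable. */
        size_t take_b = b[j] < a[i];

        /* A ternary select that a compiler may lower to a conditional move. */
        out[k++] = take_b ? b[j] : a[i];

        /* Advance exactly one cursor without an if/else. */
        j += take_b;
        i += !take_b;
    }

    /* Copy whichever run still has elements left. */
    while (i < na) out[k++] = a[i++];
    while (j < nb) out[k++] = b[j++];
}
```

The loop condition is still a branch, but it is highly predictable; the data-dependent choice between the two runs is the part expressed without an `if`.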
- Show HN: QuadSort, Esoteric Fast Sort
In the code it looks like the seed to the benchmark can be provided as the 4th command line argument: https://github.com/scandum/quadsort/blob/master/src/bench.c#...
- When does big-oh notation become not helpful when comparing algorithms?
If you look at sorting for example, it's been proven that you can't do a comparison-based sort faster than O(n log n). You may then think that we've already found the fastest possible sorting algorithms, since Quicksort (on average) and Mergesort are already O(n log n). However, new sorting algorithms keep being invented, for example Quadsort. They're all still O(n log n), but they do offer a considerable speed improvement over more traditional algorithms.
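For reference, the lower bound mentioned above comes from the standard decision-tree argument: a comparison sort must be able to distinguish all n! input orderings, so its decision tree has at least n! leaves and therefore height at least log2(n!). A short sketch of why that is Ω(n log n), keeping only the largest half of the terms in the sum:

```latex
\log_2(n!) \;=\; \sum_{k=1}^{n} \log_2 k
\;\ge\; \sum_{k=\lceil n/2 \rceil}^{n} \log_2 k
\;\ge\; \frac{n}{2}\,\log_2\frac{n}{2}
\;=\; \Omega(n \log n)
```

The bound only fixes the growth rate of the number of comparisons. Constant factors, branch behaviour, and memory access patterns are left open, which is where newer algorithms find their speedups.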
- quadsort 1.1.5.1: Up to 2.5x faster than qsort() on random data
- Quadsort 1.1.5.1: Introducing cost effective branchless merging
- I tried creating a sorting algorithm in C language.
rotate
- 10~17x faster than what? A performance analysis of Intel x86-SIMD-sort (AVX-512)
quadsort/fluxsort/crumsort author here.
For me there's a strong visual component, perhaps most obvious for my work on array rotation algorithms.
https://github.com/scandum/rotate
There's also the ability to notice strange/curious/discordant things, and to connect the dots either by trying semi-random things or through sudden insights which seem to be partially subconscious.
One of my (many) theories is that I have the ability to use long-term memory in a quasi-similar manner to short-term memory for problem solving. My IQ is in the 120-130 range; I suffer from hypervigilance, so it's generally on the lower end due to lack of sleep.
I'd say there's a strong creative aspect. If I could redo life I might try my hand at music.
- Is there a more efficient way to write this C program?
This is essentially just a rotation of a subrange of your original array. A variety of different algorithms for this operation can be found in the rotate repository (https://github.com/scandum/rotate).
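As a minimal sketch of rotating a subrange, here is the classic triple-reversal rotation in C. The function names and the half-open index convention are assumptions made for this example; it is the simplest of the well-known approaches and is not presented as the fastest one.

```c
#include <stddef.h>

/* Reverse arr[lo..hi) in place. */
void reverse_range(int *arr, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        int tmp = arr[lo];
        arr[lo++] = arr[--hi];
        arr[hi] = tmp;
    }
}

/* Rotate arr[first..last) left so that the element at index `middle`
 * ends up at index `first`. Requires first <= middle <= last. */
void rotate_range(int *arr, size_t first, size_t middle, size_t last)
{
    reverse_range(arr, first, middle);  /* reverse the left block   */
    reverse_range(arr, middle, last);   /* reverse the right block  */
    reverse_range(arr, first, last);    /* reverse the whole range  */
}
```

For example, calling `rotate_range(arr, 2, 4, 7)` on `{0, 1, 2, 3, 4, 5, 6, 7}` leaves the elements outside indices 2 to 6 untouched and turns the array into `{0, 1, 4, 5, 6, 2, 3, 7}`.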
- Building the Perfect Memory Bandwidth Beast
Memory bandwidth is 1000x lower than CPU bandwidth, so as a rule of thumb any algorithm whose work scales linearly in the amount of data being processed will be memory-bandwidth bound, as will any algorithm that can't be structured to do a lot of work on one memory region before moving on to the next one.
Examples (for large enough inputs that it's relevant) include shuffling, sorting, kmeans clustering, branch and bound sudoku solving, vector addition, dot products, and so on.
Moreover, writing a particular piece of code is often easier if you ignore memory bandwidth as a constraint. The classic example is matrix multiplication -- it can be structured such that even disk bandwidth isn't relevant compared to CPU bandwidth, but doing so is a little fiddly compared to the naive n^2 dot products approach, so writing it yourself usually results in a memory bandwidth bound solution for large matrices.
The same goes for writing two passes over your data rather than one mega-loop, choosing classic kmeans rather than one of its approximations (when an approximation would be appropriate), or not enforcing sortedness at some reasonable boundary and then having to do additional passes over your data. It's easy to write code that hoovers up way more bandwidth than it needs to, and often the faster algorithms that come out do nothing different except access the right data at the right time to reduce that pressure, like a trinity rotation [0].
Caveat: Benchmark everything, especially as you're building intuition. Trying to fix what you think is a memory bandwidth issue can result in pipeline stalls and all sorts of fun things, especially when your server has more and faster caches than your dev machine, or when data in prod doesn't match your microbenchmark.
[0] https://github.com/scandum/rotate
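The "two passes versus one mega-loop" point can be sketched in C as follows. Both functions are hypothetical examples written for this illustration, and both assume `n >= 1`. They compute the same result, but the first streams the array from memory twice while the second touches each element once.

```c
#include <stddef.h>

/* Two passes: the whole array is read once for the sum and again for the max. */
void sum_and_max_two_pass(const int *x, size_t n, long long *sum, int *max)
{
    long long s = 0;
    for (size_t i = 0; i < n; i++)
        s += x[i];

    int m = x[0];
    for (size_t i = 1; i < n; i++)
        if (x[i] > m)
            m = x[i];

    *sum = s;
    *max = m;
}

/* One fused pass: each element is loaded once and used for both results. */
void sum_and_max_fused(const int *x, size_t n, long long *sum, int *max)
{
    long long s = x[0];
    int m = x[0];

    for (size_t i = 1; i < n; i++) {
        s += x[i];
        if (x[i] > m)
            m = x[i];
    }

    *sum = s;
    *max = m;
}
```

For arrays that fit in cache the difference may be negligible, and a compiler might vectorize the two simple loops better than the fused one, so in line with the caveat above it is worth measuring both on realistic data.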
- A collection of array rotation algorithms
What are some alternatives?
pdqsort - Pattern-defeating quicksort.
stb - stb single-file public domain libraries for C/C++
fluxsort - A fast branchless stable quicksort / mergesort hybrid that is highly adaptive.
sort-research-rs - Test and benchmark suite for sort implementations.
blitsort - Blitsort is an in-place stable adaptive rotate mergesort / quicksort.
mountain-sort - The best algorithm to sort mountains
Klib - A standalone and lightweight C library
buddy_alloc - A single header buddy memory allocator for C & C++
sort-test - A simple sort benchmarking tool
microui - A tiny immediate-mode UI library
fastrange - A fast alternative to the modulo reduction
Presentations - Collection of personal presentations