Sorting with SIMD

Our great sponsors

WorkOS - The modern identity platform for B2B SaaS

InfluxDB - Power Real-Time Data Analytics at Scale

SaaSHub - Software Alternatives and Reviews

Our great sponsors

zort

2 50 4.9 Zig

Sorting algorithms in zig

Please do comment in situations like this.
It's tough to find a good overview of this topic that includes recent work or that is aimed at empirical performance. That's part of why TFA is great!
A version of MSB radix sort for sorting ints, doubles, or floats, or sorting structs by a single key of those types is here[0] and a version for sorting strings is here[1]. These are based on the MSB radix sort from this repository[2] which is associated with a technical report about parallel string sorts. I was only interested in sequential sorts, but this repository turned out to be a great resource anyway.
[0]: https://github.com/alichraghi/zort/blob/main/src/radix.zig
[1]: https://github.com/dendibakh/perf-challenge6/blob/Solution_R...
[2]: https://github.com/bingmann/parallel-string-sorting

perf-challenge6

1 26 2.2 Python

Please do comment in situations like this.
It's tough to find a good overview of this topic that includes recent work or that is aimed at empirical performance. That's part of why TFA is great!
A version of MSB radix sort for sorting ints, doubles, or floats, or sorting structs by a single key of those types is here[0] and a version for sorting strings is here[1]. These are based on the MSB radix sort from this repository[2] which is associated with a technical report about parallel string sorts. I was only interested in sequential sorts, but this repository turned out to be a great resource anyway.
[0]: https://github.com/alichraghi/zort/blob/main/src/radix.zig
[1]: https://github.com/dendibakh/perf-challenge6/blob/Solution_R...
[2]: https://github.com/bingmann/parallel-string-sorting

WorkOS

workos.com sponsored

The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
parallel-string-sorting

1 32 2.0 C++

Collection of Parallel String Sorting Algorithms including Parallel Super Scalar String Sample Sort and Parallel Multiway LCP-Mergesort

Please do comment in situations like this.
It's tough to find a good overview of this topic that includes recent work or that is aimed at empirical performance. That's part of why TFA is great!
A version of MSB radix sort for sorting ints, doubles, or floats, or sorting structs by a single key of those types is here[0] and a version for sorting strings is here[1]. These are based on the MSB radix sort from this repository[2] which is associated with a technical report about parallel string sorts. I was only interested in sequential sorts, but this repository turned out to be a great resource anyway.
[0]: https://github.com/alichraghi/zort/blob/main/src/radix.zig
[1]: https://github.com/dendibakh/perf-challenge6/blob/Solution_R...
[2]: https://github.com/bingmann/parallel-string-sorting

short-simd-sorter

1 2 10.0 C++

I played a bit with finding a sequence of instructions to sort 8 floats in 2 SSE registers with a sorting network:
https://github.com/0xf00ff00f/short-simd-sorter
I pretty much brute-forced it, so I don't know if you could use the same idea to sort 16 values in 2 AVX registers. There's probably a better way.

fTetWild

2 0 10.0 C++

Fast Tetrahedral Meshing in the Wild

I recently tried to do that as well, but failed. Specifically, I have implemented AA sort [1] but for my use case the performance was about the same as std::sort in C++, for the substantial code complexity cost. I reverted to std::sort. The code is on github [2]
Still, in that particular project the vectors being sorted are relatively small, typically under than 100kb, so I have only implemented their “inner” algorithm which works on a single CPU core. The complete AA sort algorithm was apparently designed for large vectors, and uses both SIMD and multithreading. Might be still useful for very long vectors.
[1] https://ieeexplore.ieee.org/document/4336211
[2] https://github.com/Const-me/fTetWild/blob/master/MeshRepair/...

zerovm-samples

1 7 10.0 C

Sample code and libraries built for ZeroVM

I wonder why no one mentions bitonic sort? If you want to do anything in SIMD you better avoid branching as much as possible... and ideally altogether. Here is an implementation I co-authored some 10 years ago: https://github.com/zerovm/zerovm-samples/blob/master/disort/...
Sorting-networks which were already mentioned seems similar but a bit too abstract.
My code above doesn't contains values but those are easy to add I think. Of course it is better to permute fixed size pointers / offsets and not the entire blobs which can be of variable size and then it will complicate everything beyond feasible for SIMD

rune

16 1,910 4.9 C

Rune is a programming language developed to test ideas for improving security and efficiency.

Maybe Google's new "Rune" language will become prevalent https://github.com/google/rune, which supports SoA.

InfluxDB

www.influxdata.com sponsored

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
highway

66 3,623 9.8 C++

Performance-portable, length-agnostic SIMD with runtime dispatch

We also experimented with 4x4 networks but it's very helpful to use 256 or 512-bit SIMD. You might give our vqsort a try, it's quite a bit faster than std::sort: https://github.com/google/highway/blob/master/hwy/contrib/so...

avx_qsort

2 67 10.0 Assembly

Quick sort code using AVX2 instructions

The original (AFAICT) work on SIMD quick sort, also mentioned in the google post also implemented pointer sort by loading a pointed key using gather instructions and the method can be used for an array of structs. https://github.com/vkrasnov/avx_qsort/blob/master/qsort_AVX2...

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project