Reinforcement learned branchless sorting functions for sort3, sort4 and sort5 were landed in LLVM

This page summarizes the projects mentioned and recommended in the original post on /r/rust

Our great sponsors
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • SaaSHub - Software Alternatives and Reviews
  • glidesort

    A Rust implementation of Glidesort, my stable adaptive quicksort/mergesort hybrid sorting algorithm.

    With the recent release of Glidesort I remembered this change that recently landed in LLVM.

  • sortnetopt

    Lower Size Bounds for Sorting Networks

    I'm intimately familiar with these constructs. These constructs are called sorting networks. Optimal solutions in terms of depth up to 16 elements have been known since 'D. E. Knuth. The art of computer programming, vol. 3: Sorting and Searching, 2nd edition. Addison-Wesley, 1998.', no need for fancy machine learning. Although only recently the 11-12 element ones have had their size proven optimally, actually with Rust code https://github.com/jix/sortnetopt.

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

  • fluxsort

    A fast branchless stable quicksort / mergesort hybrid that is highly adaptive.

    With the right code and code-gen https://github.com/scandum/fluxsort/issues/5 these can be excellent at exploiting wide super-scalar architectures. I use them in both ipn_stable and to a larger extent in ipn_unstable which even uses the lesser known median networks for pivot selection. That said, I've done a lot of experiments with smaller sorting networks, sort3/4/5 as used in libcxx. And I found that they only look good in synthetic micro-benchmarks. If all your program does, is sort inputs of one specific size 3/4/5 in a hot loop and does nothing else, yes they are faster than insertion sort. But as soon as your application does some non-trivial amount of other work before calling sort again, the code complexity and additional branching required to get to that sort network is not worth it anymore. Depending on your architecture, my findings suggest they only start pulling ahead beginning at sizes 8-12.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts