Tolower() in Bulk at Speed

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • highway

    Performance-portable, length-agnostic SIMD with runtime dispatch

  • And if not, it may be possible to use unaligned loads/stores to handle the fringe in a single (final) iteration: https://github.com/google/highway#strip-mining-loops

    It is actually feasible to write vector-style code using SIMD instructions. Yes, the SIMD ISA is more complicated because of the various accumulated extensions, but this is what we currently have. And a bit larger code size (for one final loop iteration) doesn't seem to be a big deal.
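The "single (final) iteration" idea from the comment above can be sketched in portable Rust. This is a minimal illustration under stated assumptions, not Highway's API: `LANES`, `lower_byte`, `lower_block`, and `lower_in_place` are hypothetical names, and a fixed-width slice stands in for a SIMD register. The final block is anchored at the end of the buffer and overlaps the previous one, which is only safe here because ASCII lowercasing is idempotent.

```rust
/// Hypothetical vector width in bytes; a fixed-width slice stands in
/// for a SIMD register in this sketch.
const LANES: usize = 16;

/// Lowercase one ASCII byte without branching; other bytes pass through.
#[inline]
fn lower_byte(b: u8) -> u8 {
    // For b'A'..=b'Z' the subtraction yields 0..=25, so the comparison is true.
    let is_upper = b.wrapping_sub(b'A') < 26;
    // Setting bit 5 (0x20) turns an uppercase ASCII letter into lowercase.
    b | ((is_upper as u8) << 5)
}

/// Process one full-width block; simple enough for the compiler to vectorize.
#[inline]
fn lower_block(block: &mut [u8]) {
    for b in block.iter_mut() {
        *b = lower_byte(*b);
    }
}

/// Lowercase `buf` in place using full blocks plus one overlapping final block.
pub fn lower_in_place(buf: &mut [u8]) {
    let len = buf.len();
    if len < LANES {
        // Too short for even one full block: plain scalar fallback.
        lower_block(buf);
        return;
    }
    let mut i = 0;
    while i + LANES <= len {
        lower_block(&mut buf[i..i + LANES]);
        i += LANES;
    }
    if i < len {
        // Fringe: one last full-width block ending exactly at `len`.
        // It re-processes some bytes, which is harmless because the
        // operation is idempotent.
        lower_block(&mut buf[len - LANES..]);
    }
}
```

Calling `lower_in_place` on a 37-byte buffer runs two full blocks and then one overlapping block covering bytes 21 to 36. The overlap trick does not carry over unchanged to in-place operations that are not idempotent.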

  • Singeli

    High-level interface for low-level programming

  • Here's an AVX-2 implementation that assumes it can read up to 31 bytes past the end of the input: https://godbolt.org/z/P7PP1MnK7

    Requires -fno-unroll-loops, as otherwise clang unrolls too aggressively; the code is fast enough without the extra unrolling. Tail is dealt with by blending the originally read value with the new one.

    (yes, that's autogenerated; from some https://github.com/mlochbaum/singeli code)
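The blending approach to the tail can be illustrated without intrinsics. The sketch below is not the code behind the godbolt link; it is a portable Rust approximation in which `LANES`, `lower_byte`, and `lower_prefix_blend` are hypothetical names. The over-read assumption is modeled by a hypothetical contract: the caller passes a buffer padded to a multiple of 32 bytes, of which only the first `n` bytes are logical input. Bytes past `n` are read but written back unchanged.

```rust
/// Bytes per block; 32 matches the width of an AVX2 register.
const LANES: usize = 32;

/// Lowercase one ASCII byte without branching; other bytes pass through.
#[inline]
fn lower_byte(b: u8) -> u8 {
    let is_upper = b.wrapping_sub(b'A') < 26;
    b | ((is_upper as u8) << 5)
}

/// Lowercase the first `n` bytes of `buf`, leaving the padding untouched.
///
/// Assumed contract (hypothetical): `buf.len()` is a multiple of `LANES`
/// and at least `n`, so every block can be read at full width.
pub fn lower_prefix_blend(buf: &mut [u8], n: usize) {
    assert!(
        n <= buf.len() && buf.len() % LANES == 0,
        "buffer must be padded to a multiple of LANES"
    );
    let mut base = 0;
    while base < n {
        let block = &mut buf[base..base + LANES];
        // Number of logical bytes from `base` on; exceeds LANES except
        // possibly in the final block.
        let remaining = n - base;
        for (lane, b) in block.iter_mut().enumerate() {
            let lowered = lower_byte(*b);
            // 0xFF for lanes inside the input, 0x00 for lanes past the end.
            let mask = 0u8.wrapping_sub((lane < remaining) as u8);
            // Blend: take the new value where the mask is set and the
            // originally read value elsewhere.
            *b = (lowered & mask) | (*b & !mask);
        }
        base += LANES;
    }
}
```

In a real SIMD version the per-lane select would be a single blend instruction and the whole block would be stored at once; the per-byte loop here only mirrors that data flow.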

  • rust

    Empowering everyone to build reliable and efficient software.

  • A few weeks ago I optimised[0] the Rust lower/upper case conversion methods to use more SIMD features. In the end, we took a very conservative level of unrolling since we deemed it unlikely that large inputs would need case conversions.

    [0]: https://github.com/rust-lang/rust/pull/97046
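For readers unfamiliar with the standard-library methods in question, here is a short usage sketch. The comment does not say exactly which methods the pull request changed, so this only shows the general case-conversion API; `lowercase_variants` and `lower_ascii_in_place` are hypothetical wrapper names.

```rust
/// Return the Unicode-aware and the ASCII-only lowercase forms of `input`.
pub fn lowercase_variants(input: &str) -> (String, String) {
    // Unicode-aware: also converts non-ASCII letters such as 'Ä'.
    let unicode = input.to_lowercase();
    // ASCII-only: bytes outside b'A'..=b'Z' are left untouched.
    let ascii = input.to_ascii_lowercase();
    (unicode, ascii)
}

/// Lowercase ASCII letters in place, without allocating.
pub fn lower_ascii_in_place(bytes: &mut [u8]) {
    bytes.make_ascii_lowercase();
}
```

For the input `"HeLLo ÄB"`, the first function returns `"hello äb"` and `"hello Äb"`: only the Unicode-aware method lowercases the `Ä`.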

  • wuffs

    Wrangling Untrusted File Formats Safely

  • This sort of stuff is where Iterate Loops are good:

    https://github.com/google/wuffs/blob/main/doc/note/iterate-l...

    Wuffs wants this because it demands all the checking at compile time (Wuffs code with a potential buffer overflow just won't compile), so if you need bounds checks you'll be writing them out by hand, and the iterate loop often allows you to express a correct solution with no actual checks.
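A loose analogue of this pattern exists in Rust, shown below as a sketch rather than as Wuffs syntax. The assumption is that `chunks_exact_mut` plays the role of the fixed-length loop body and `into_remainder` plays the role of the shorter fallback; `lower_byte` and `lower_iterate` are hypothetical names. Unlike Wuffs, Rust does not prove the absence of bounds checks at compile time, but the chunk length is fixed by construction, so no hand-written checks are needed.

```rust
/// Lowercase one ASCII byte without branching; other bytes pass through.
#[inline]
fn lower_byte(b: u8) -> u8 {
    let is_upper = b.wrapping_sub(b'A') < 26;
    b | ((is_upper as u8) << 5)
}

/// Lowercase `buf` in place: 8 bytes per step, then the leftover bytes.
pub fn lower_iterate(buf: &mut [u8]) {
    let mut chunks = buf.chunks_exact_mut(8);
    for chunk in &mut chunks {
        // Every chunk yielded here is exactly 8 bytes long, so the body
        // never needs to ask how much input is left.
        for b in chunk.iter_mut() {
            *b = lower_byte(*b);
        }
    }
    // Fewer than 8 bytes remain; handle them one at a time.
    for b in chunks.into_remainder() {
        *b = lower_byte(*b);
    }
}
```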

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
