The One Billion Row Challenge in CUDA: from 17 minutes to 17 seconds

Our great sponsors

WorkOS - The modern identity platform for B2B SaaS

InfluxDB - Power Real-Time Data Analytics at Scale

SaaSHub - Software Alternatives and Reviews

Our great sponsors

1brc

28 5,136 9.8 Java

1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated with Java

1brc had SMT disabled [0]. (hyperthreading is an intel marketing name and trademark for their brand of smt, but the benchmark was run on an amd cpu)
[0] https://github.com/gunnarmorling/1brc/issues/189#issuecommen...

xxHash

28 8,462 8.4 C

Extremely fast non-cryptographic hash algorithm

> GPU Hash Table?
How bad would performance have suffered if you sha256'd the lines to build the map? I'm going to guess "badly"?
Maybe something like this in CUDA: https://github.com/Cyan4973/xxHash ?

WorkOS

workos.com sponsored

The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
1brc

5 60 7.5 C

C99 implementation of the 1 Billion Rows Challenge. 1️⃣🐝🏎️ Runs in ~1.6 seconds on my not-so-fast laptop CPU w/ 16GB RAM. (by dannyvankooten)

There are some good ideas for this type of problem here: https://github.com/dannyvankooten/1brc
After you deal with parsing and hashes, basically you are IO limited so mmap helps. A reasonable guess is that even for the optimal CUDA implementation, because there is no compute to speak of other than a hashmap, the starting of kernels and transfer of data to the GPU would likely add a noticeable bottleneck and make the optimal CUDA code slower than this pure C code.

parallel-hashmap

31 2,316 7.8 C++

A family of header-only, very fast and memory-friendly hashmap and btree containers.

Standard library maps/unordered_maps are themselves notoriously slow anyway. A sparse_hash_map from abseil or parallel-hashmaps[1] would be better.
[1] https://github.com/greg7mdp/parallel-hashmap

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project