1brc
-
1 Billion Rows Challenge in PHP
We have already created the file measurements.txt with 1 million lines using the semi-official tool create_measurements.py:
-
1BRC Coding Challenge: Nerd Sniping the Java Community
Looking at the fastest solution, the convertIntoNumber() function is where the magic happens.
Specifically, line 318 - https://github.com/gunnarmorling/1brc/blob/main/src/main/jav...
The line above (long digits ... ) converts from ASCII digits ('0'-'9') to actual numeric digits (range 0-9)
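The digit conversion described in that comment can be illustrated with a small sketch. This is not the code from the linked file (the link is truncated and the original line is not reproduced here); it only shows the general idea, assuming up to eight ASCII digit bytes are packed into one 64-bit word. The function name `ascii_digits_to_values` is hypothetical.

```python
from typing import List


def ascii_digits_to_values(text: bytes) -> List[int]:
    """Convert up to 8 ASCII digit bytes to numeric values with one mask.

    Illustrative sketch only; the name and interface are assumptions,
    not taken from the 1brc repository.
    """
    if len(text) > 8 or any(not (0x30 <= b <= 0x39) for b in text):
        raise ValueError("expected at most 8 ASCII digits")
    # Pack the bytes into a single integer, first byte in the lowest position.
    word = int.from_bytes(text, "little")
    # ASCII '0'..'9' are 0x30..0x39, so keeping the low nibble of every
    # byte turns all packed characters into 0..9 in a single AND.
    digits = word & 0x0F0F0F0F0F0F0F0F
    # Unpack the individual bytes again for inspection.
    return [(digits >> (8 * i)) & 0xFF for i in range(len(text))]
```

Calling `ascii_digits_to_values(b"123")` returns `[1, 2, 3]`. A real solution would keep working on the packed word instead of unpacking it, which is where the speedup comes from.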
- Why I'm skeptical of rewriting JavaScript tools in "faster" languages
- 20 million rows in 20s
- Solving the One Billion Row Challenge in Go (from 1m40s to 8.4s)
-
Node vs Bun: One Billion Row Challenge
You can generate the file using a Python script from here.
-
The One Billion Row Challenge in CUDA: from 17 minutes to 17 seconds
This would be the code to beat. Ideally with only 8 cores but any number of cores is also very interesting.
https://github.com/gunnarmorling/1brc/discussions/710
-
One Billion Row Challenge in Golang - From 95s to 1.96s
Given that a 1-billion-line file is approximately 13 GB, the official repository does not provide a fixed dataset; instead it offers a script that generates synthetic data with random readings. Just follow the instructions to create your own dataset.
-
1BRC Merykitty's Magic SWAR: 8 Lines of Code Explained in 3k Words
Local disk I/O is no longer the bottleneck on modern systems: https://benhoyt.com/writings/io-is-no-longer-the-bottleneck/
In addition, the official 1BRC explicitly evaluated results on a RAM disk to take I/O speed out of the picture entirely: https://github.com/gunnarmorling/1brc?tab=readme-ov-file#eva... "Programs are run from a RAM disk (i.e. the IO overhead for loading the file from disk is not relevant)"
-
Processing One Billion Rows in PHP!
You may have heard of "The One Billion Row Challenge" (1brc); in case you haven't, go check out Gunnar Morling's 1brc repo.
-
The One Billion Row Challenge in CUDA: from 17 minutes to 17 seconds
There are some good ideas for this type of problem here: https://github.com/dannyvankooten/1brc
After you deal with parsing and hashes, basically you are IO limited so mmap helps. A reasonable guess is that even for the optimal CUDA implementation, because there is no compute to speak of other than a hashmap, the starting of kernels and transfer of data to the GPU would likely add a noticeable bottleneck and make the optimal CUDA code slower than this pure C code.
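The combination mentioned in that comment, a memory-mapped input file plus a hash map of per-station statistics, can be sketched roughly as follows. This is a minimal illustration, not the linked C implementation: the function name `aggregate` is hypothetical, the input is assumed to use the `name;value` line format of the challenge, and the file is assumed to be non-empty (Python's `mmap` rejects empty files).

```python
import mmap
from typing import Dict, List, Tuple


def aggregate(path: str) -> Dict[str, Tuple[float, float, float]]:
    """Return {station: (min, mean, max)} for a 'name;value' text file.

    Hypothetical helper for illustration; assumes a non-empty file.
    """
    # Per station: [min, max, sum, count], keyed by the raw name bytes.
    stats: Dict[bytes, List[float]] = {}
    with open(path, "rb") as f:
        # Map the file read-only so the OS pages it in on demand.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for line in iter(mm.readline, b""):
                name, _, raw = line.rstrip(b"\r\n").partition(b";")
                value = float(raw.decode("ascii"))
                entry = stats.get(name)
                if entry is None:
                    stats[name] = [value, value, value, 1]
                else:
                    if value < entry[0]:
                        entry[0] = value
                    if value > entry[1]:
                        entry[1] = value
                    entry[2] += value
                    entry[3] += 1
    return {
        name.decode("utf-8"): (lo, total / count, hi)
        for name, (lo, hi, total, count) in stats.items()
    }
```

In Python the per-line parsing dominates the runtime, so this version only shows the structure; the fast solutions replace `float()` and the dictionary with hand-written parsing and hashing.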
-
The One Billion Row Challenge in Go: from 1m45s to 4s in nine solutions
C dominates every other language again... https://github.com/dannyvankooten/1brc#submitting
-
The One Billion Row Challenge
You can run the bin/create-sample program from this C implementation here: https://github.com/dannyvankooten/1brc
It’s just the city names + averages from the official repository using a normal distribution to generate 1B random rows.
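The generation approach described there, sampling each reading from a normal distribution centred on a station's average, can be sketched as below. This is not the `bin/create-sample` program itself: the station names, their means, the standard deviation of 10.0 and the function name `generate_rows` are all placeholder assumptions for illustration.

```python
import random
from typing import Dict, Iterator, Optional

# Placeholder stations and mean temperatures, invented for this sketch.
# The real list of cities and averages lives in the official repository.
STATION_MEANS: Dict[str, float] = {
    "StationA": 5.0,
    "StationB": 15.0,
    "StationC": 25.0,
}


def generate_rows(
    count: int,
    means: Dict[str, float] = STATION_MEANS,
    stddev: float = 10.0,  # assumed spread, not taken from the source
    seed: Optional[int] = None,
) -> Iterator[str]:
    """Yield `count` lines of the form 'name;value' with one decimal place."""
    rng = random.Random(seed)
    names = list(means)
    for _ in range(count):
        name = rng.choice(names)
        # Draw the reading from a normal distribution around the station mean.
        value = rng.gauss(means[name], stddev)
        yield f"{name};{value:.1f}"
```

Writing the rows out is then a matter of joining them with newlines, for example `"\n".join(generate_rows(1000, seed=1))`. At one billion rows a pure Python generator like this would be slow, which is presumably why a C program is suggested.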
What are some alternatives?
nodejs - 1️⃣🐝🏎️ The One Billion Row Challenge with Node.js -- A fun exploration of how quickly 1B rows from a text file can be aggregated with different languages.
csvlens - Command line csv viewer
1brc - 1BRC in .NET among fastest on Linux
java - Java bindings for TensorFlow