The One Billion Row Challenge in Go: from 1m45s to 4s in nine solutions

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • 1brc

    1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated with Java

  • There are a few rust solutions in the "Show and Tell" linked above, for example this fairly readable one at 15.5s: https://github.com/gunnarmorling/1brc/discussions/57

    A comment above referencing Python "polars" actually has Rust polars, std, and SIMD solutions as well (SIMD was fastest, but less readable for a hobbyist like me).

  • nodejs

    1️⃣🐝🏎️ The One Billion Row Challenge with Node.js -- A fun exploration of how quickly 1B rows from a text file can be aggregated with different languages.

  • brc

    1 billion rows challenge

  • I did it with custom parsing[0] and treated the numbers as 16-bit integers. The representation in the file is not a constant number of bytes, which complicates the table approach. If you end up computing a hash, I think it might be slower than just doing the equivalent parsing I do, and a four-byte constant table will be very large and mostly empty. Maybe a trie would be good.

    0: https://github.com/k0nserv/brc/blob/main/src/main.rs#L279
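    To make the "numbers as 16-bit integers" idea concrete, here is a minimal sketch in Go. It is not the code from the linked repository. It assumes the 1BRC input format (an optional minus sign, one or two integer digits, a decimal point, and exactly one fractional digit), and the name parseTenths is hypothetical. The value is stored as tenths of a degree, so "-12.3" becomes -123, which fits comfortably in an int16.

```go
package main

import "fmt"

// parseTenths converts a temperature such as "-12.3" into tenths of a degree
// (-123). Assumption: the input follows the 1BRC format, i.e. an optional '-',
// one or two integer digits, a '.', and exactly one fractional digit.
// No validation is performed on malformed input.
func parseTenths(b []byte) int16 {
	neg := false
	if len(b) > 0 && b[0] == '-' {
		neg = true
		b = b[1:]
	}
	var v int16
	for _, c := range b {
		if c == '.' {
			continue // the scale is fixed at 1/10, so the point carries no information
		}
		v = v*10 + int16(c-'0')
	}
	if neg {
		return -v
	}
	return v
}

func main() {
	for _, s := range []string{"0.0", "7.5", "-12.3", "99.9"} {
		fmt.Println(s, parseTenths([]byte(s)))
	}
}
```

    Because the text form is between three and five bytes long, a lookup table keyed directly on the raw bytes would need padding or a hash, which is the complication the comment describes; the loop above sidesteps it by handling each byte in turn.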

  • 1BillionRowChallenge

    I saw this [Blog Post](https://www.morling.dev/blog/one-billion-row-challenge/) on a Billion Row challenge for Java so naturally I tried implementing a solution in Python & Rust using mainly polars

  • I was curious how long it would take with Polars (for scale), apparently 33s: https://github.com/Butch78/1BillionRowChallenge/tree/main

  • 1brc

    1BRC in .NET among fastest on Linux (by buybackoff)

  • A more accurate statement would be that Go is incapable of the optimizations performed by Java, and Java in turn is incapable of the optimizations performed by the C# and C++ implementations.

    See https://hotforknowledge.com/2024/01/13/1brc-in-dotnet-among-...

  • mmap-go

    A portable mmap package for Go

  • Well, I guess it's more that the standard library doesn't have a cross-platform way to access them, not that memory-mapped files themselves can't be done on (say) Windows. It looks like there's a fairly popular 3rd party package that supports at least Linux, macOS, and Windows: https://github.com/edsrzf/mmap-go

  • plb2

    A programming language benchmark

  • https://github.com/attractivechaos/plb2/blob/master/README.m...

    Synthetic benchmarks aside, I think as far as average code (the Spring Boots of the world) goes, Go beats Java almost every time, often in fewer lines than the usual pom.xml.

  • bitcoin_ancestries

    This codebase will produce some stats on the ancestry of each transaction of the Bitcoin network.

  • I thought this was an illustrative example of how to process big datasets. We could easily have a statistic per e.g. bitcoin address in a different problem, see https://github.com/afiodorov/bitcoin_ancestries .

    I struggle a lot with this toy problem. Without constraints it is too trivial to pay attention to, and then no one seems to agree on what the real-world constraints would be.

  • 1brc

    C99 implementation of the 1 Billion Rows Challenge. 1️⃣🐝🏎️ Runs in ~1.6 seconds on my not-so-fast laptop CPU w/ 16GB RAM. (by dannyvankooten)

  • C dominates every other language again: https://github.com/dannyvankooten/1brc#submitting

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
