plb2
1brc
plb2
-
Byte-Sized Swift: Building Tiny Games for the Playdate
https://github.com/attractivechaos/plb2 - limited but broad comparison across a large number of languages. Swift and Nim both compare favourably to C.
-
The One Billion Row Challenge in Go: from 1m45s to 4s in nine solutions
https://github.com/attractivechaos/plb2/blob/master/README.m...
Synthetic benchmarks aside, I think that as far as average code goes (the Spring Boots of the world), Go beats Java almost every time, often in fewer lines than the usual pom.xml.
-
Python 3.13 Gets a JIT
I wouldn't be so enthusiastic. Look at other languages that have a JIT now: Ruby and PHP. After years of effort, they are still an order of magnitude slower than V8 and even PyPy [1]. It seems to me that you need to design a JIT implementation from the ground up to get good performance: V8, Dart, LuaJIT and PyPy are like this. If you start with a pure interpreter, it may be difficult to speed it up later.
[1] https://github.com/attractivechaos/plb2
-
Benchmarking 20 programming languages on N-queens and matrix multiplication
A curious thing about Swift: after https://github.com/attractivechaos/plb2/pull/23, the matrix multiplication example is comparable to C and Rust. However, I don't see a way to idiomatically optimise the sudoku example, whose main overhead is allocating several arrays each time solve() is called. Apparently, in Swift there is no such thing as static array allocation. That's very unfortunate.
1brc
-
The One Billion Row Challenge in CUDA: from 17 minutes to 17 seconds
There are some good ideas for this type of problem here: https://github.com/dannyvankooten/1brc
Once parsing and hashing are dealt with, you are basically I/O-limited, so mmap helps. A reasonable guess is that even for the optimal CUDA implementation, because there is no compute to speak of other than a hashmap, launching kernels and transferring the data to the GPU would likely add a noticeable bottleneck and make the optimal CUDA code slower than this pure C code.
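To make the parse, hash and mmap steps mentioned above concrete, here is a minimal Python sketch. It is not taken from the linked C repository. It assumes the usual 1BRC input layout of one `station;temperature` pair per line with exactly one decimal digit, and the function name `aggregate` is hypothetical.

```python
import mmap


def aggregate(path):
    """Return {station: (min, mean, max)} for a 1BRC-style file.

    Assumes each line looks like b"Hamburg;12.3" with exactly one
    decimal digit, so temperatures can be kept as integer tenths.
    """
    stats = {}  # station name (bytes) -> [min, max, sum, count], in tenths
    with open(path, "rb") as f:
        # Map the file read-only instead of copying it through read() calls.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for line in iter(mm.readline, b""):
                name, _, value = line.rstrip(b"\r\n").partition(b";")
                if not value:
                    continue  # skip blank or malformed lines
                # "-3.5" -> -35: dropping the dot avoids float parsing.
                tenths = int(value.replace(b".", b""))
                entry = stats.get(name)
                if entry is None:
                    stats[name] = [tenths, tenths, tenths, 1]
                else:
                    if tenths < entry[0]:
                        entry[0] = tenths
                    if tenths > entry[1]:
                        entry[1] = tenths
                    entry[2] += tenths
                    entry[3] += 1
    return {
        name.decode("utf-8"): (lo / 10, total / count / 10, hi / 10)
        for name, (lo, hi, total, count) in stats.items()
    }
```

The structure is the same one the comment describes: the only real work per row is splitting the line, parsing a small number and updating one hash-table entry, which is why reading the file tends to dominate once those steps are cheap.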
-
The One Billion Row Challenge in Go: from 1m45s to 4s in nine solutions
C dominates every other language again: https://github.com/dannyvankooten/1brc#submitting
-
The One Billion Row Challenge
You can run the bin/create-sample program from this C implementation here: https://github.com/dannyvankooten/1brc
It's just the city names + averages from the official repository using a normal distribution to generate 1B random rows.
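The generation scheme described above can be sketched in a few lines of Python. This is not the `bin/create-sample` program itself. The station means below are illustrative placeholders rather than values copied from the official repository, the standard deviation of 10.0 is an assumption, and `generate_rows` is a hypothetical name.

```python
import random

# Placeholder per-city averages; the real list lives in the official repository.
STATION_MEANS = {"Hamburg": 9.7, "Oslo": 5.7, "Cairo": 21.4}


def generate_rows(n, seed=None, stddev=10.0):
    """Yield n rows of the form "station;temperature".

    Each temperature is drawn from a normal distribution centred on the
    station's average. The stddev default is an assumption, not a value
    from the original tool.
    """
    rng = random.Random(seed)
    names = list(STATION_MEANS)
    for _ in range(n):
        name = rng.choice(names)
        temperature = rng.gauss(STATION_MEANS[name], stddev)
        yield f"{name};{temperature:.1f}"
```

Writing `"\n".join(generate_rows(1000))` to a file gives a small test input in the same format. Passing a fixed `seed` makes the output reproducible.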
What are some alternatives?
c-examples - Example C code
1brc - 1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated with Java
laser - The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
nodejs - 1️⃣🐝🏎️ The One Billion Row Challenge with Node.js -- A fun exploration of how quickly 1B rows from a text file can be aggregated with different languages.
weave - A state-of-the-art multithreading runtime: message-passing based, fast, scalable, ultra-low overhead
JDK - JDK main-line development https://openjdk.org/projects/jdk
tarantool - Get your data in RAM. Get compute close to data. Enjoy the performance.
1brc - 1BRC in .NET among fastest on Linux
blis - BLAS-like Library Instantiation Software Framework
related_post_gen - Data Processing benchmark featuring Rust, Go, Swift, Zig, Julia etc.
BenchmarkDotNet - Powerful .NET library for benchmarking