Speed of Rust vs. C

rust

2,683 93,041 10.0 Rust

Empowering everyone to build reliable and efficient software.

There is undefined behavior in Rust affecting real-world code, including Tokio's scheduler, and code produced by async fn definitions. UnsafeCell doesn't solve the problem. There's more discussion at https://gist.github.com/Darksonn/1567538f56af1a8038ecc3c664a....
Bug report at https://github.com/rust-lang/rust/issues/63818.
Reddit threads at (older) https://www.reddit.com/r/rust/comments/l4roqk/a_fix_for_the_... and (newer) https://www.reddit.com/r/rust/comments/lxw6cl/update_to_llvm....
Somewhat related HN thread at https://news.ycombinator.com/item?id=26406989.

rustc_codegen_gcc

33 9 9.6 Rust

libgccjit AOT codegen for rustc (by antoyo)

FTR, there are some efforts to integrate GCC & Rust:
https://github.com/antoyo/rustc_codegen_gcc

gccrs

102 2,264 10.0

GCC Front-End for Rust
zig

816 30,773 10.0 Zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.

> computed goto
I did a deep dive into this topic lately when exploring whether to add a language feature to zig for this purpose. I found that, although finicky, LLVM is able to generate the desired machine code if you give it a simple enough while loop continue expression. So I think it's reasonable to not have a computed goto language feature.
More details here, with lots of fun godbolt links: https://github.com/ziglang/zig/issues/8220
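For readers unfamiliar with the pattern being discussed, here is a minimal sketch in Rust (not Zig) of the loop-plus-switch interpreter dispatch that computed goto is usually meant to speed up. The `Op` enum and `run` function are hypothetical names invented for this illustration, and whether LLVM turns this particular loop into threaded dispatch or a plain jump table is not guaranteed.

```rust
/// Hypothetical bytecode for a tiny stack machine (illustration only).
#[derive(Clone, Copy, Debug)]
pub enum Op {
    Push(i64),
    Add,
    Mul,
    /// Pops a value and jumps to the given index if it is non-zero.
    JumpIfNonZero(usize),
    Halt,
}

/// Runs `program` and returns the value on top of the stack when it stops.
/// Returns `None` if an instruction pops from an empty stack.
pub fn run(program: &[Op]) -> Option<i64> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0usize;
    // One loop, one `match`: the optimizer decides how to dispatch.
    while let Some(&op) = program.get(pc) {
        pc += 1;
        match op {
            Op::Push(v) => stack.push(v),
            Op::Add => {
                let b = stack.pop()?;
                let a = stack.pop()?;
                stack.push(a.wrapping_add(b));
            }
            Op::Mul => {
                let b = stack.pop()?;
                let a = stack.pop()?;
                stack.push(a.wrapping_mul(b));
            }
            Op::JumpIfNonZero(target) => {
                if stack.pop()? != 0 {
                    pc = target;
                }
            }
            Op::Halt => break,
        }
    }
    stack.last().copied()
}
```

With computed goto, each arm would end by jumping straight to the next instruction's handler instead of returning to the top of the loop; the comment above reports that LLVM can sometimes produce that shape from a simple enough loop.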

libskry_r

2 16 0.0 Rust

Lucky imaging library

To practise Rust, I rewrote my small C99 library in it [1]. Performance is more or less the same, I only had to use unchecked array access in one small hot loop (details in README.md). I haven't ported multithreading yet, but I expect Rust's Rayon parallel iterators will likewise be comparable to OpenMP.
[1] https://github.com/GreatAttractor/libskry_r
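As a rough illustration of what "unchecked array access in a hot loop" can look like, the sketch below sums a row-major pixel buffer twice: once with ordinary indexing and once with `get_unchecked` after a single up-front size check. The function names and the image layout are assumptions made for this example; they are not taken from libskry_r.

```rust
/// Sums a `width` x `height` row-major image using bounds-checked indexing.
pub fn sum_pixels_checked(pixels: &[u8], width: usize, height: usize) -> u64 {
    let mut total = 0u64;
    for y in 0..height {
        for x in 0..width {
            // Each access may carry a bounds check the optimizer cannot remove.
            total += u64::from(pixels[y * width + x]);
        }
    }
    total
}

/// Same computation with one size check up front and unchecked access inside.
pub fn sum_pixels_unchecked(pixels: &[u8], width: usize, height: usize) -> u64 {
    let needed = width.checked_mul(height).expect("image size overflows usize");
    assert!(pixels.len() >= needed, "buffer too small for the given size");
    let mut total = 0u64;
    for y in 0..height {
        for x in 0..width {
            // SAFETY: x < width and y < height, so
            // y * width + x < width * height <= pixels.len().
            total += u64::from(unsafe { *pixels.get_unchecked(y * width + x) });
        }
    }
    total
}
```

Whether the unchecked version is actually faster depends on the loop and the compiler version, so it is worth measuring before reaching for `unsafe`.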

min-sized-rust

101 7,448 6.2 Rust

🦀 How to minimize Rust binary size 📦

https://users.rust-lang.org/t/link-the-rust-standard-library... and https://github.com/johnthagen/min-sized-rust
I'd be interested in any up-to-date trick to do better than this.

ixy-languages

30 2,108 0.0 TeX

A high-speed network driver written in C, Rust, C++, Go, C#, Java, OCaml, Haskell, Swift, Javascript, and Python

We've implemented network drivers in C and Rust and did a performance comparison. Interestingly, the C-to-Rust-transpiled code ended up being faster than the original C implementation: https://github.com/ixy-languages/ixy-languages/blob/master/R...

ixy

2 1,122 0.0 C

A simple yet fast user space network driver for Intel 10 Gbit/s NICs written from scratch

https://github.com/emmericp/ixy/blob/0e00605be4153b06df06184...
Looks like you're compiling the C code with -O2. Does the Rust build set -O3 on clang? Did you try -O3 with C? I know it's not guaranteed to be faster, just curious.

fst

11 1,712 3.5 Rust

Represent large sets and maps compactly with finite state transducers.

No you don't. I've written multiple programs that load things instantly off the file system via memory maps. See the fst crate[1], for example, which is designed to work with memory maps.
Rust "works badly with memory mapped files" doesn't mean, "Rust can't use memory mapped files." It means, "it is difficult to reconcile Rust's safety story with memory maps." ripgrep for example uses memory maps because they are faster sometimes, and its safety contract[2] is a bit strained. But it works.
[1] - https://github.com/BurntSushi/fst/
[2] - https://docs.rs/grep-searcher/0.1.7/grep_searcher/struct.Mma...

smartstring

7 482 0.0 Rust

Compact inlined strings for Rust.

I’ve been using smartstring, which is both excellent and maintained. https://github.com/bodil/smartstring

CPython

1,314 59,658 10.0 Python

The Python programming language

You don't have to retain objects in an internal linked list when the refcount drops to zero, but Python does.
Type-specific free lists:
* https://github.com/python/cpython/blob/master/Objects/floato...
* https://github.com/python/cpython/blob/master/Objects/tupleo...
And just wrapping malloc in general; there's no refcounting reason for this, they just assume system malloc is slow (which might be true, for glibc) and wrap it in the default build configuration:
https://github.com/python/cpython/blob/master/Objects/obmall...
So many layers of wrapping malloc, just because system allocators were slow in 2000. Defeats free() poisoning and ASAN. obmalloc can be disabled by turning off PYMALLOC, but that doesn't disable the per-type freelists IIRC. And PYMALLOC is enabled by default.
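To make the idea of a type-specific free list concrete, here is a small sketch in Rust rather than C. `FloatPool` and its methods are hypothetical names for this illustration and do not mirror CPython's actual data structures; the point is only that released objects are parked in a per-type list and handed out again without going through the system allocator.

```rust
/// A capped free list of boxed floats (illustration only).
pub struct FloatPool {
    free: Vec<Box<f64>>,
    max_free: usize,
}

impl FloatPool {
    pub fn new(max_free: usize) -> Self {
        Self { free: Vec::new(), max_free }
    }

    /// Reuses a parked allocation if one exists, otherwise allocates.
    pub fn alloc(&mut self, value: f64) -> Box<f64> {
        match self.free.pop() {
            Some(mut slot) => {
                *slot = value;
                slot
            }
            None => Box::new(value),
        }
    }

    /// Parks the allocation for reuse, up to `max_free` entries.
    pub fn release(&mut self, slot: Box<f64>) {
        if self.free.len() < self.max_free {
            self.free.push(slot);
        }
        // Otherwise `slot` is dropped here and returns to the allocator.
    }

    /// Number of allocations currently parked.
    pub fn free_len(&self) -> usize {
        self.free.len()
    }
}
```

Because a parked allocation never reaches `free()`, tools that poison or track freed memory do not see it being recycled, which is the drawback raised in the comment above.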

ripgrep

348 45,040 9.3 Rust

ripgrep recursively searches directories for a regex pattern while respecting your gitignore

Why would you guess about how much C or Rust code that ripgrep contains when you could very quickly look? https://github.com/BurntSushi/ripgrep

dhall-lang

113 4,137 6.0 Dhall

Maintainable configuration files

> Languages like Idris and Agda are different because sometimes code isn’t executed at all. A proof may depend on knowing that some code will terminate without running it.
Yes. They are rather different in other respects as well. Though you can produce executable code from Idris and Agda, of course.
> With respect to deadlocks, there’s little practical difference between an infinite loop and a loop that holds the lock for a very long time.
Yes, that's true. Though as a practical matter, I have heard that it's much harder to produce the latter by accident, even though only the former is forbidden.
For perhaps a more practical example, have a look at https://dhall-lang.org/, which also guarantees termination but doesn't involve nearly as much proving.

redgrep

4 150 5.8 C++

♥ Janusz Brzozowski

I couldn't figure it out from looking through ripgrep's website: does ripgrep support intersection and complement of expressions? Like eg https://github.com/google/redgrep does.
Regular languages are closed under those operations after all.

barre

1 12 0.0 Rust

A Regular Expression Library and CFG parser for Rust using Brzozowski Derivatives

I've made some attempts, but nothing production grade.
About large character classes: how are those harder than in other approaches? If you build any FSM you have to deal with those, don't you?
One way to handle them that works well when the characters in your classes are mostly next to each other in Unicode is to express your state transition function as an 'interval map'.
What I mean is that eg a hash table or an array lets you build representations of mathematical functions that map points to values.
You want something that can model a step function.
You can either roll your own, or write something around a sorted-map data structure.
Eg in C++ you'd base the whole thing around https://en.cppreference.com/w/cpp/container/map/upper_bound (or https://hackage.haskell.org/package/containers-0.4.0.0/docs/... in Haskell.)
The keys in your sorted map are the 'edges' of your character classes (eg where they start and end).
Does that make sense? Or am I misunderstanding the problem?
> I personally always get stuck at how to handle things like captures [...]
Let me think about that one for a while. Some Googling suggests https://github.com/elfsternberg/barre though
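The 'interval map' idea described in this comment can be sketched in Rust with `BTreeMap::range`, which plays the role that `upper_bound` plays in the C++ suggestion. `IntervalMap` and `next_char` are hypothetical names for this illustration, and the sketch assumes that inserted ranges do not overlap and that `lo <= hi`.

```rust
use std::collections::BTreeMap;

/// A step function from `char` to `V`. Each key is the first character of an
/// interval; its value applies up to (but not including) the next key.
/// `None` marks a gap between character classes.
pub struct IntervalMap<V> {
    edges: BTreeMap<char, Option<V>>,
}

/// Successor of `c` among valid `char`s, skipping the surrogate gap.
fn next_char(c: char) -> Option<char> {
    match c {
        char::MAX => None,
        '\u{D7FF}' => Some('\u{E000}'),
        _ => char::from_u32(c as u32 + 1),
    }
}

impl<V> IntervalMap<V> {
    pub fn new() -> Self {
        Self { edges: BTreeMap::new() }
    }

    /// Maps the inclusive range `lo..=hi` to `value`.
    /// Assumes ranges never overlap (a simplification for this sketch).
    pub fn insert(&mut self, lo: char, hi: char, value: V) {
        self.edges.insert(lo, Some(value));
        // Record where the interval ends, unless another one starts there.
        if let Some(next) = next_char(hi) {
            self.edges.entry(next).or_insert(None);
        }
    }

    /// Looks up `c` by finding the greatest edge that is <= `c`.
    pub fn get(&self, c: char) -> Option<&V> {
        self.edges
            .range(..=c)
            .next_back()
            .and_then(|(_, v)| v.as_ref())
    }
}
```

A class such as `[0-9a-f]` then costs four map entries (two starts, two ends) no matter how many characters it covers, which is what makes this representation attractive for large Unicode classes.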

regex

91 3,355 8.9 Rust

An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.

> About large character classes: how are those harder than in other approaches? If you build any FSM you have to deal with those, don't you?
I mean specifically in the context of derivatives. IIRC, the formulation used in Turon's paper wasn't amenable to large classes.
Yes, interval sets work great: https://github.com/rust-lang/regex/blob/master/regex-syntax/...
This is why I asked if a production grade regex engine based on derivatives exists. Because I want to see how the engineering is actually done.
> What do you want your capture groups to do? Do you eg just want to return pointers to where you captured them (if any)?
Look at any production grade regex engine. It will implement captures. It should do what they do.
> I have an inkling that something inspired by https://en.wikipedia.org/wiki/Viterbi_algorithm might work.
Nothing about Viterbi is fast, in my experience implementing it in the past. :-)
> https://github.com/google/redgrep/blob/main/parser.yy mentions something about capture, but not sure if that has anything to do with capture groups.
It looks like it does, and in particular see: https://github.com/google/redgrep/blob/6b9d5b02753c4ece17e2f...
But that's only for parsing the regex itself. I don't see any match APIs that utilize them. I wouldn't expect to either, because you can't implement capturing inside a DFA. (You need a tagged DFA, which is a strictly more powerful thing. But in that case, the DFA size explodes. See the re2c project and their associated papers.)
If I'm remembering correctly, I think the problem with derivatives is that they jump straight to a DFA. You can't do that in a production regex engine because a DFA's worst case size is exponential in the size of the regex.
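For context on what "derivatives" means in this exchange, here is a minimal sketch of Brzozowski-derivative matching in Rust. The `Re` type and the functions are hypothetical and written for this illustration; they are not the API of redgrep, barre, or the regex crate. The sketch does no simplification or memoization, so it is nowhere near production grade.

```rust
/// A minimal regex AST (illustration only).
#[derive(Clone, Debug, PartialEq)]
pub enum Re {
    Empty,                 // matches nothing
    Eps,                   // matches only the empty string
    Char(char),            // matches one specific character
    Alt(Box<Re>, Box<Re>), // either side
    Cat(Box<Re>, Box<Re>), // left followed by right
    Star(Box<Re>),         // zero or more repetitions
}

/// True if `re` matches the empty string.
pub fn nullable(re: &Re) -> bool {
    match re {
        Re::Empty | Re::Char(_) => false,
        Re::Eps | Re::Star(_) => true,
        Re::Alt(a, b) => nullable(a) || nullable(b),
        Re::Cat(a, b) => nullable(a) && nullable(b),
    }
}

/// The derivative of `re` with respect to `c`: a regex matching every `w`
/// such that `re` matches `c` followed by `w`.
pub fn deriv(re: &Re, c: char) -> Re {
    match re {
        Re::Empty | Re::Eps => Re::Empty,
        Re::Char(d) => if *d == c { Re::Eps } else { Re::Empty },
        Re::Alt(a, b) => Re::Alt(Box::new(deriv(a, c)), Box::new(deriv(b, c))),
        Re::Cat(a, b) => {
            let left = Re::Cat(Box::new(deriv(a, c)), b.clone());
            if nullable(a) {
                Re::Alt(Box::new(left), Box::new(deriv(b, c)))
            } else {
                left
            }
        }
        Re::Star(a) => Re::Cat(Box::new(deriv(a, c)), Box::new(Re::Star(a.clone()))),
    }
}

/// Matches by taking one derivative per input character.
pub fn matches(re: &Re, input: &str) -> bool {
    let mut current = re.clone();
    for c in input.chars() {
        current = deriv(&current, c);
    }
    nullable(&current)
}
```

Each distinct derivative corresponds to a DFA state, so an engine that precomputes all of them is building the full DFA, which is where the exponential worst case mentioned above comes from.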

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

Common Rust Lifetime Misconceptions
4 projects | news.ycombinator.com | 4 Dec 2023

Contention on multi-threaded regex matching
3 projects | /r/rust | 22 Oct 2022

Looking for recommendations of well maintained open source rust codebases that I can look through/contribute to
5 projects | /r/rust | 13 Dec 2021

Rust Moderation Team Resigns
17 projects | news.ycombinator.com | 22 Nov 2021

Rust 1.51.0 can't be built on 32-bit ARM any more
3 projects | /r/rust | 4 May 2021

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Rust Compiler Text processing Regex Applications written in Rust
Post date: 12 Mar 2021
