A Guide to Undefined Behavior in C and C++

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • rust

    Empowering everyone to build reliable and efficient software.

  • > Zig is similar. Yes, they are going to replace LLVM by default, but they're not getting rid of their LLVM backend entirely.

    To be clear, the article I linked did not say they were replacing LLVM by default. It said the replacement backend would become the default for debug builds, because it compiles faster.

    > > Meanwhile, to truly understand Rust, one must be an expert in C and learn the `rustc` code base.

    > Are you under the impression that the "rustc" codebase is written in C/C++? It is not... It uses LLVM, yes, but it's written in Rust.

    I am not under that impression, but I can see how my phrasing leads to that conclusion.

    After reviewing Rust's bootstrap on GitHub [0], I can now state more precisely that one's understanding of low-level Rust will be enhanced by knowing C/C++ (for the LLVM portions) as well as Python (for the case where Rust does not exist on the system and the stage0 Cargo and Rust compiler binaries are downloaded from somewhere else).

    > Cranelift backend which is written in Rust

    When this happens, it seems like it'll be possible to get the LLVM bits out of the bootstrap process and lead to a fully self-hosted Rust.

    So while you may not personally value that, it seems like some people in the Rust community do.

    [0] https://github.com/rust-lang/rust/tree/master/src/bootstrap

  • JDK

    JDK main-line development https://openjdk.org/projects/jdk

  • code sequence as you can see in its source ([0][1]) instead of the simplistic "cdq; idiv $reg": because it does not want trapping behaviour in this particular case; but e.g. AArch64 traps on neither division by zero nor INT_MIN / -1. That's why accurately implementing your language's semantics on different platforms is so annoying and why the C standard left itself a nice shortcut.

    [0] https://github.com/openjdk/jdk/blob/d27daf01d6361513a815e783...

    [1] https://github.com/openjdk/jdk/blob/d27daf01d6361513a815e783...
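
To make the trade-off concrete, here is a minimal C sketch of a division helper whose result is defined for every input, assuming Java-like semantics: division by zero is reported to the caller, and INT_MIN / -1 wraps to INT_MIN. The name `defined_div` and its signature are hypothetical. This is not the JDK's code sequence, only an illustration of the checks that a plain `a / b` lets a C compiler skip.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not from the JDK): divides a by b without ever
 * executing a division whose behavior is undefined in C.
 * Returns false when b == 0 and leaves *quotient untouched. */
bool defined_div(int a, int b, int *quotient)
{
    if (b == 0) {
        return false;            /* a / 0 is undefined behavior in C */
    }
    if (a == INT_MIN && b == -1) {
        *quotient = INT_MIN;     /* true result does not fit in int; wrap */
        return true;
    }
    *quotient = a / b;           /* all remaining cases are well defined */
    return true;
}
```

The two extra comparisons are the cost of uniform semantics. On a target whose divide instruction already behaves this way, a compiler could in principle drop them, but the source has to spell them out because C itself promises nothing for those inputs.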

  • rustc_codegen_cranelift

    Cranelift based backend for rustc

  • > When this happens, it seems like it'll be possible to get the LLVM bits out of the bootstrap process and lead to a fully self-hosted Rust.

    What do you mean by "when this happens"? GP's point is that this has already happened: the Cranelift backend is feature-complete from the perspective of the language [0], except for inline assembly and unwinding on panic. It was merged into the upstream compiler in 2020 [1], and a compiler built with only the Cranelift backend is perfectly capable of building another compiler. LLVM hasn't been a necessary component of the Rust compiler for quite some time.

    [0] https://github.com/bjorn3/rustc_codegen_cranelift

    [1] https://github.com/rust-lang/rust/pull/77975

  • tiger-compiler

    Compiler for the Tiger programming language from Andrew W. Appel's book, Modern Compiler Implementation in C.

  • I took a compilers class as an undergraduate. We read Appel's Modern Compiler Implementation in ML (https://www.cs.princeton.edu/~appel/modern/ml/) and built a compiler for the Tiger language (something like https://github.com/FlexW/tiger-compiler but in SML instead of C).

    We covered the main chapters but not the advanced topics. If you look at the TOC (https://www.cs.princeton.edu/~appel/modern/toc.html) you can get an idea of the necessary steps:

    - Lexing

    - Parsing

    - Symbol table generation

    - Type checking

    - IR

    - Some static analyses (e.g. liveness)

    - Some optimizations (e.g. constant folding)

    - Assembly generation

    - Register allocation

    My one group-mate and I had a working compiler at the end. It took a whole semester. Sure, I was taking other classes and doing other things, but it certainly takes more than 2 weekends.
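
As a rough illustration of one step from the list above, constant folding, here is a minimal sketch in C. The `Expr` type and every function name are invented for this example; they are not taken from Appel's book or from the linked repository. Values are stored as `unsigned long` so that the folding arithmetic wraps around instead of running into signed overflow.

```c
#include <stdlib.h>

/* Hypothetical expression tree, invented for this sketch. */
typedef enum { EXPR_CONST, EXPR_VAR, EXPR_ADD, EXPR_MUL } ExprKind;

typedef struct Expr {
    ExprKind kind;
    unsigned long value;   /* used by EXPR_CONST; unsigned so folding wraps */
    const char *name;      /* used by EXPR_VAR */
    struct Expr *lhs;      /* used by EXPR_ADD and EXPR_MUL */
    struct Expr *rhs;
} Expr;

static Expr *expr_new(ExprKind kind, unsigned long value, const char *name,
                      Expr *lhs, Expr *rhs)
{
    Expr *e = malloc(sizeof *e);
    if (e == NULL) {
        abort();           /* keep the sketch simple: no error recovery */
    }
    e->kind = kind;
    e->value = value;
    e->name = name;
    e->lhs = lhs;
    e->rhs = rhs;
    return e;
}

Expr *expr_const(unsigned long value)
{
    return expr_new(EXPR_CONST, value, NULL, NULL, NULL);
}

Expr *expr_var(const char *name)
{
    return expr_new(EXPR_VAR, 0, name, NULL, NULL);
}

Expr *expr_binary(ExprKind kind, Expr *lhs, Expr *rhs)
{
    return expr_new(kind, 0, NULL, lhs, rhs);
}

/* Folds bottom-up and in place: an operator node whose operands are both
 * constants after folding is rewritten into a single constant node. */
Expr *fold_constants(Expr *e)
{
    if (e->kind != EXPR_ADD && e->kind != EXPR_MUL) {
        return e;          /* constants and variables are already minimal */
    }
    e->lhs = fold_constants(e->lhs);
    e->rhs = fold_constants(e->rhs);
    if (e->lhs->kind == EXPR_CONST && e->rhs->kind == EXPR_CONST) {
        unsigned long a = e->lhs->value;
        unsigned long b = e->rhs->value;
        unsigned long result = (e->kind == EXPR_ADD) ? a + b : a * b;
        free(e->lhs);
        free(e->rhs);
        e->kind = EXPR_CONST;
        e->value = result;
        e->lhs = NULL;
        e->rhs = NULL;
    }
    return e;
}
```

For example, folding the tree for 2 + 3 * 4 leaves a single constant node holding 14, while x + 3 * 4 keeps the addition and only folds its right operand to 12.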

  • llvm-project

    The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

  • No, LLVM definitely still has big problems. https://github.com/llvm/llvm-project/issues/45725 is an example. The symptom in Rust is that you can write what is in effect a pointer comparison in which LLVM ends up claiming that two things are different, although they are also identical...

  • specification

    Ferrocene Language Specification (by ferrocene)

  • >> The spec does not define the software. The software is as the software does. Having or not having a spec doesn't protect from bugs - people do.

    >> What you're talking about is covering one's ass, not specification.

    They are related.

    In safety-critical software, bugs can cause people to die. Without a spec, no one will use Rust for safety-critical software. It would be too risky and no company would accept that level of risk.

    For example if software that controls an airplane is written in Rust and an error occurs during flight, what happens? The software can't just panic and crash or the airplane might crash.

    The Ferrocene project (https://ferrous-systems.com/ferrocene/) is working on producing a safety-critical Rust specification (https://github.com/ferrocene/specification) because having a language specification matters for safety-critical work.

  • CompCert

    The CompCert formally-verified C compiler

  • From my experience, while many MCUs have settled for the big compilers (GCC and Clang), DSPs and some FPGAs (not Intel and Xilinx, those have lately settled for Clang and a combination of Clang and GCC respectively) use some pretty bespoke compilers (just running ./ --version is enough to verify this, if the compiler even offers that option). That's not necessarily bad, since many of them offer some really useful features, but error messages can be really cryptic in some cases. Also some industries require use of verified compilers, like CompCert[1], and in such cases GCC and Clang just don't cut it.

    [1]: https://compcert.org/

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
