Boehm Garbage Collector

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • jank

    A Clojure dialect hosted on LLVM with native C++ interop

  • I have been using the Boehm-Demers-Weiser GC for jank [1], the native Clojure dialect on LLVM with C++ interop. Since jank is a Clojure dialect, and Clojure is built primarily on persistent, immutable data structures, there are potentially a lot of references to objects, across many threads, and there's a lot of garbage being churned. I originally started with reference counting and RAII, using boost::intrusive_ptr and an atomic count. The GC was actually 2x faster in the sequence benchmark I was trying to optimize.

    At this point, jank is generally several times faster than the equivalent Clojure JVM code. I'm sure there are better GCs out there in terms of performance, and I have my eye on MMTk [2] for a future upgrade, but the fact remains: the Boehm GC is stupid simple to integrate and it's surprisingly fast. Compare it to MPS, MMTk, and others, and both the documentation and the actual dev work required are worlds apart.

    For a project which needs a GC but doesn't need to pick the absolute best one first, it seems like the best option, based on my research.

    1: https://jank-lang.org/

    2: https://www.mmtk.io/code
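
To make the reference-counting baseline described above concrete, here is a minimal sketch of an intrusive, atomically counted handle applied to a persistent list with a shared tail. It is not jank's code and does not use Boost: `RefCounted`, `Ref`, `Cons`, and `shared_tail_demo` are hypothetical names, and the handle only loosely imitates what boost::intrusive_ptr does.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical base class: the reference count lives inside the object.
struct RefCounted {
    std::atomic<std::size_t> refs{0};
    virtual ~RefCounted() = default;
};

// Minimal intrusive handle (an assumption for illustration, not Boost's API).
template <typename T>
class Ref {
public:
    Ref() = default;
    explicit Ref(T* p) : ptr_(p) { retain(); }
    Ref(const Ref& other) : ptr_(other.ptr_) { retain(); }
    Ref(Ref&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    Ref& operator=(Ref other) noexcept { std::swap(ptr_, other.ptr_); return *this; }
    ~Ref() { release(); }

    T* get() const { return ptr_; }
    T* operator->() const { return ptr_; }

private:
    // Every copy of a handle pays for an atomic increment...
    void retain() {
        if (ptr_) ptr_->refs.fetch_add(1, std::memory_order_relaxed);
    }
    // ...and every drop pays for an atomic decrement; the last owner deletes.
    void release() {
        if (ptr_ && ptr_->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) delete ptr_;
    }
    T* ptr_ = nullptr;
};

// Immutable cons cell; `live` counts cells that have not been destroyed yet.
struct Cons : RefCounted {
    static int live;
    int head;
    Ref<Cons> tail;
    Cons(int h, Ref<Cons> t) : head(h), tail(std::move(t)) { ++live; }
    ~Cons() override { --live; }
};
int Cons::live = 0;

// Two lists share one tail; dropping one list must not free the shared cells.
int shared_tail_demo() {
    Ref<Cons> shared(new Cons(2, Ref<Cons>(new Cons(3, Ref<Cons>()))));
    Ref<Cons> a(new Cons(1, shared));
    {
        Ref<Cons> b(new Cons(0, shared));
        assert(Cons::live == 4);
    }  // only b's own head cell is freed here
    return Cons::live;
}
```

With persistent data structures, nearly every operation copies handles like these, so the atomic traffic adds up quickly. A tracing collector does no per-copy bookkeeping at all, which is one plausible reading of the 2x difference reported above.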

  • V8

    The official mirror of the V8 Git repository

  • https://chromium.googlesource.com/v8/v8.git/+/HEAD/include/c...

    Due to the nature of web engine workloads, migrating objects to being GC'd isn't performance negative (as most people would expect). With care it can often end up performance positive.

    There are a few tricks that Oilpan can apply. Concurrent tracing helps a lot (e.g. instead of incrementing/decrementing refs, you can trace on a different thread). In addition, when destructing objects, the destructors typically become trivial, meaning the object can just be dropped from memory. Both of these free up main thread time. (The tradeoff with concurrent tracing is that you need atomic barriers when assigning pointers, which needs care.)

    This is on top of the safety improvements you gain from being GC'd vs. smart pointers, etc.

    One major tradeoff is that UAF bugs become more difficult to fix, as you are just accessing objects which "should" be dead.
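
The "atomic barriers when assigning pointers" mentioned above can be sketched roughly as follows. This is not Oilpan's API: `Object`, `Marker`, `Member`, and `barrier_demo` are hypothetical names, and the scheme shown is a plain Dijkstra-style insertion barrier (shade the newly stored target while marking is active), chosen here only because it is the simplest barrier to illustrate.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical managed object with a mark bit the marker thread can read.
struct Object {
    std::atomic<bool> marked{false};
};

// Hypothetical collector state shared by the mutator and the marker thread.
struct Marker {
    std::atomic<bool> active{false};
    std::mutex mu;
    std::vector<Object*> worklist;

    // Mark the object once and queue it so the marker will trace it.
    void shade(Object* o) {
        if (o && !o->marked.exchange(true, std::memory_order_acq_rel)) {
            std::lock_guard<std::mutex> lock(mu);
            worklist.push_back(o);
        }
    }
};

// A traced pointer field: every assignment goes through the write barrier.
template <typename T>
class Member {
public:
    explicit Member(Marker& m) : marker_(&m) {}

    void set(T* p) {
        ptr_.store(p, std::memory_order_release);
        // Barrier: while marking runs, the new target must not be missed.
        if (marker_->active.load(std::memory_order_acquire)) marker_->shade(p);
    }
    T* get() const { return ptr_.load(std::memory_order_acquire); }

private:
    Marker* marker_;
    std::atomic<T*> ptr_{nullptr};
};

// Single-threaded walk-through of the two cases.
bool barrier_demo() {
    Marker marker;
    Member<Object> field(marker);
    Object before, during;

    field.set(&before);  // marking idle: just a store
    assert(!before.marked.load());

    marker.active.store(true);
    field.set(&during);  // marking active: the target is shaded and queued
    return during.marked.load() && marker.worklist.size() == 1 &&
           marker.worklist[0] == &during;
}
```

The cost is visible in `set`: an atomic store plus a check on every pointer assignment, which is the part that "needs care" on hot paths.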

  • mps

    The Memory Pool System

  • I have a library which has an extremely slow free, around 2m for large files, because of unnaturally scattered allocation patterns, but this old conservative GC didn't help at all. It was about 40% slower with libgc. mimalloc was a bit better. Best would be a properly fast GC, like mps https://github.com/Ravenbrook/mps, but this would be too much work.

  • samsara

    a reference-counting cycle collection library in rust

  • The compiler support you need is quite limited. Here's an implementation of cycle collection in Rust: https://github.com/chc4/samsara It's made possible because Rust can tell apart read-only and read-write references (except for interior mutable objects, but these are known to the compiler and references to them can be treated as read-write). This avoids a global stop-the-world for the entire program.

    Cascading deletes are rare in practice, and if anything they are inherent to deterministic deletion, which is often a desirable property. When they're possible, one can often use arena allocation to avoid the issue altogether, since arenas are managed as a single object.
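
As a rough illustration of the arena point above, here is a minimal bump-allocator sketch. `Arena`, `ListNode`, and `sum_list_in_arena` are hypothetical names invented for this example, and the arena assumes trivially destructible objects so that nothing has to run per node on teardown.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical bump arena: objects are carved out of large chunks and the
// chunks are released together when the arena itself is destroyed.
class Arena {
public:
    explicit Arena(std::size_t chunk_size = 64 * 1024) : chunk_size_(chunk_size) {}

    template <typename T, typename... Args>
    T* make(Args&&... args) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "this arena never runs destructors");
        static_assert(alignof(T) <= alignof(std::max_align_t),
                      "over-aligned types are not supported");
        void* p = allocate(sizeof(T), alignof(T));
        return new (p) T{std::forward<Args>(args)...};
    }

    std::size_t chunk_count() const { return chunks_.size(); }

private:
    void* allocate(std::size_t size, std::size_t align) {
        // Round the current offset up to the requested alignment.
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (chunks_.empty() || start + size > chunk_size_) {
            assert(size <= chunk_size_);
            chunks_.push_back(std::unique_ptr<unsigned char[]>(new unsigned char[chunk_size_]));
            start = 0;
        }
        used_ = start + size;
        return chunks_.back().get() + start;
    }

    std::size_t chunk_size_;
    std::size_t used_ = 0;
    std::vector<std::unique_ptr<unsigned char[]>> chunks_;
};

struct ListNode {
    int value;
    ListNode* next;
};

// Builds an n-element linked list inside an arena and sums it.
long long sum_list_in_arena(int n) {
    Arena arena;
    ListNode* head = nullptr;
    for (int i = 1; i <= n; ++i) head = arena.make<ListNode>(i, head);

    long long total = 0;
    for (ListNode* it = head; it != nullptr; it = it->next) total += it->value;
    return total;
}  // the arena frees a few dozen chunks here instead of n individual nodes
```

Freeing the list costs one deallocation per chunk, regardless of how the nodes point at each other, so there is no chain of per-node deletes to cascade through.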

  • are-we-fast-yet

    Are We Fast Yet? Comparing Language Implementations with Objects, Closures, and Arrays

  • > Sure there's a small overhead to smart pointers

    Not so small, and it has the potential to significantly slow down an application when not used wisely. Here are e.g. some measurements where the programmer used C++11 and did everything with smart pointers: https://github.com/smarr/are-we-fast-yet/issues/80#issuecomm.... There was a slowdown by a factor of 2 to 10 compared with the C++98 implementation. Also remember that reference-counted smart pointers leak memory when used with circular references, and there is an additional memory allocation involved with each smart pointer.

    > Garbage collection has an overhead too of course

    The Boehm GC is surprisingly efficient. See e.g. these measurements: https://github.com/rochus-keller/Oberon/blob/master/testcase.... The same benchmark suite as above is compared with different versions of Mono (using the generational GC) and the C code (using Boehm GC) generated with my Oberon compiler. The latter is only 20% slower than the native C++98 version, and still twice as fast as Mono 5.
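
The circular-reference leak mentioned above is easy to reproduce with the standard library alone. The sketch below uses a hypothetical `Link` type with a live-object counter; it is an illustration, not code from the linked benchmarks.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type; `live` counts objects that have not been destroyed.
struct Link {
    static int live;
    std::shared_ptr<Link> next;  // owning link
    std::weak_ptr<Link> prev;    // non-owning back link
    Link() { ++live; }
    ~Link() { --live; }
};
int Link::live = 0;

// Returns how many nodes are leaked by a cycle of owning pointers.
int leak_with_strong_cycle() {
    int before = Link::live;
    {
        auto a = std::make_shared<Link>();
        auto b = std::make_shared<Link>();
        a->next = b;
        b->next = a;  // each node keeps the other's count above zero
    }
    return Link::live - before;
}

// Same shape, but the back link is weak, so both nodes are destroyed.
int no_leak_with_weak_back_link() {
    int before = Link::live;
    {
        auto a = std::make_shared<Link>();
        auto b = std::make_shared<Link>();
        a->next = b;
        b->prev = a;
    }
    return Link::live - before;
}
```

On the allocation cost: `std::shared_ptr<T>(new T)` allocates the object and its control block separately, while `std::make_shared<T>()` is specified so that implementations can combine them into one allocation. A tracing collector needs neither the control block nor the `weak_ptr` discipline.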

  • Oberon

    Oberon parser, code model & browser, compiler and IDE with debugger

  • lone

    The standalone Linux Lisp

  • > register scanning isn't portable

    Certainly not but it wasn't particularly hard to implement either. I just wrote some inline assembly for every architecture. Here's my programming language's x86_64 and aarch64 implementations:

    https://github.com/lone-lang/lone/blob/master/architecture/x...

    https://github.com/lone-lang/lone/blob/master/architecture/a...

  • sgcl

    Smart Garbage Collection Library for C++

  • You can look at the SGCL garbage collector for C++: https://github.com/pebal/sgcl. It works in a separate thread, is lock-free and never stops the world.
