How a Single Line of Code Made a 24-Core Server Slower Than a Laptop

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • rune

    An embeddable dynamic programming language for Rust. (by rune-rs)

  • pure-data

    Pure Data - tracking Miller's SourceForge git repository (also used by libpd) (by Spacechild1)

    Great write up!

    What I like about Pd is that you can freely reblock and resample any subpatch. Want some section with single-sample-feedback? Just put a [block~ 1]. You can also increase the blocksize. Usually, this is done for upsampling and FFT processing. Finally, reblocking can be nested, meaning that you can reblock to 1024 samples and inside have another subpatch running at 1 sample blocksize.

    SuperCollider, on the other hand, has a fixed global blocksize and samplerate, which I think is one of its biggest limitations. (Needless to say, there are many things it does better than Pd!)

    ---

    In the last few days I have been experimenting with adding multi-threading support to Pd (https://github.com/Spacechild1/pure-data/tree/multi-threadin...). With the usual blocksize of 64 sample, you can definitely observe the scheduling overhead in the CPU meter. If you have a few (heavy-weight) subpatches running in parallel, the overhead is neglible. But for [clone] with a high number of (light-weight) copies, the overhead becomes rather noticable. In my quick tests, reblocking to 256 samples already reduces the overhead significantly, at the cost of increased latency, of course.

    ---

    Also, in my plugin host for Pd/Supercollider (https://git.iem.at/pd/vstplugin/) I have a multi-threading and bridging/sandboxing option. If the plugin itself is rather lightweight and the blocksize is small, the scheduling overhead becomes quite noticable. In Pd you can just put [vstplugin~] in a subpatch + [block~]. For the SuperCollider version I have added a "reblock" argument to process the plugin at a higher blocksize, at the cost of increased latency.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

  • nheko

    Desktop client for Matrix using Qt and C++20.

    > So forcing everyone to think about ownership because maybe they are writing concurrent code (then again maybe they aren't) so that "congrats your memory management problems are solved" seems like a Pyrrhic victory--you've already blown their brain cells on the wrong problem.

    https://manishearth.github.io/blog/2015/05/17/the-problem-wi... argues that "[a]liasing with mutability in a sufficiently complex, single-threaded program is effectively the same thing as accessing data shared across multiple threads without a lock". This is especially true in Qt apps which launch nested event loops, which can do anything and mutate data behind your back, and C++ turns it into use-after-free UB and crashing (https://github.com/Nheko-Reborn/nheko/issues/656, https://github.com/Nheko-Reborn/nheko/commit/570d00b000bd558...). I find Rust code easier to reason about than C++, since I know that unrelated function calls will never modify the target of a &mut T, and can only change the target of a &T if T has interior mutability.

    Nonetheless the increased complexity of Rust is a definite downside for simple/CRUD application code.

    On the other hand, when a programmer does write concurrent code with shared mutability (in any language), in my experience, the only way they'll write correct and understandable code is if they've either learned Rust, or were tutored by someone at the skill level of a Solaris kernel architect. And learning Rust is infinitely more scalable.

    Rust taught me to make concurrency tractable in C++. In Rust, it's standard practice to designate each piece of data as single-threaded, shared but immutable, atomic, or protected by a mutex, and separate single-threaded data and shared data into separate structs. The average C++ programmer who hasn't studied Rust (eg. the developers behind FamiTracker, BambooTracker, RtAudio, and RSS Guard) will write wrong and incomprehensible threading code which mixes atomic fields, data-raced fields, and accessing fields while holding a mutex, sometimes only holding a mutex on the writer but not reader, sometimes switching back and forth between these modes ad-hoc. Sometimes it only races on integer/flag fields and works most of the time on x86 (FamiTracker, BambooTracker, RtAudio), and sometimes it crashes due to a data race on collections (https://github.com/martinrotter/rssguard/issues/362).

  • rssguard

    Feed reader (and podcast player) which supports RSS/ATOM/JSON and many web-based feed services.

    > So forcing everyone to think about ownership because maybe they are writing concurrent code (then again maybe they aren't) so that "congrats your memory management problems are solved" seems like a Pyrrhic victory--you've already blown their brain cells on the wrong problem.

    https://manishearth.github.io/blog/2015/05/17/the-problem-wi... argues that "[a]liasing with mutability in a sufficiently complex, single-threaded program is effectively the same thing as accessing data shared across multiple threads without a lock". This is especially true in Qt apps which launch nested event loops, which can do anything and mutate data behind your back, and C++ turns it into use-after-free UB and crashing (https://github.com/Nheko-Reborn/nheko/issues/656, https://github.com/Nheko-Reborn/nheko/commit/570d00b000bd558...). I find Rust code easier to reason about than C++, since I know that unrelated function calls will never modify the target of a &mut T, and can only change the target of a &T if T has interior mutability.

    Nonetheless the increased complexity of Rust is a definite downside for simple/CRUD application code.

    On the other hand, when a programmer does write concurrent code with shared mutability (in any language), in my experience, the only way they'll write correct and understandable code is if they've either learned Rust, or were tutored by someone at the skill level of a Solaris kernel architect. And learning Rust is infinitely more scalable.

    Rust taught me to make concurrency tractable in C++. In Rust, it's standard practice to designate each piece of data as single-threaded, shared but immutable, atomic, or protected by a mutex, and separate single-threaded data and shared data into separate structs. The average C++ programmer who hasn't studied Rust (eg. the developers behind FamiTracker, BambooTracker, RtAudio, and RSS Guard) will write wrong and incomprehensible threading code which mixes atomic fields, data-raced fields, and accessing fields while holding a mutex, sometimes only holding a mutex on the writer but not reader, sometimes switching back and forth between these modes ad-hoc. Sometimes it only races on integer/flag fields and works most of the time on x86 (FamiTracker, BambooTracker, RtAudio), and sometimes it crashes due to a data race on collections (https://github.com/martinrotter/rssguard/issues/362).

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts