Peredvizhnikov Engine is a fully lock-free game engine written in C++20

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • peredvizhnikov-engine

    A fully lock-free game engine written in C++20

  • The paper goes into more detail without forcing you to read the implementation, though it does seem to be a fair bit more advanced than that.

    https://github.com/eduard-permyakov/peredvizhnikov-engine/bl...

  • Medo

    Haiku Media Editor

  • coroactors

    Experimental actors with C++ coroutines

  • When you protect a std::deque with a mutex, you need at least two atomic operations: one to lock the queue before pushing, and one to unlock it after pushing. Because you're using a std::deque, a push may need to allocate memory, and that allocation happens under the lock, which makes it more likely for a thread to be suspended while holding the lock. While the queue is locked, other threads have to wait, possibly even suspend on a futex, and the unlocking thread then has to wake another thread up.

    The most expensive part of any mutex/futex is not locking, it's waking other threads up when the lock is contended. I'm actually surprised you only get 10 million messages per second. Is that for a contended or an uncontended case? I would expect more, but it probably depends a lot on the hardware, so these numbers are hard to compare.

    My actor framework currently uses a lock-free intrusive mailbox [1]_, whose push consists of exactly two atomic exchange operations, so pushing a node is probably cheaper than with a mutex. But the nicest part about it is how I found a way to make it "edge triggered". A currently unowned (empty) queue is locked by the first push. This is almost free: compared to a classic intrusive MPSC queue [2]_, the second part of push uses an exchange instead of a store. The push that takes the lock may then start dequeueing nodes or schedule it to an executor. The mailbox stays locked until it is drained completely, after which it is guaranteed that a concurrent (or some future) push will lock it. This enables very efficient wakeups (or even eliding them completely when performing symmetric transfer between actors).

    I actually get ~10 million requests/s in a single-threaded uncontended case (that's at least one allocation per request and two actor context switches: a push into the target mailbox, and a push into the requester mailbox on the way back, plus a couple of steady_clock::now() calls when measuring latency of each request and checking for soft preemption during context switches). Even when heavily contended (thousands of actors call the same actor from multiple threads) I still get ~3 million requests/s. These numbers may vary depending on hardware though, so like I said it's hard to compare.

    In conclusion, it very much depends on how lock-free queues are actually used and how they are implemented: they can be faster and more scalable than a mutex (a mutex is a lock-free data structure underneath anyway).

    I'd agree with you, however, that mutexes are better for protecting complex logic or data structures, because using lock-free interactions to make such code "scalable" often makes the base performance so low that you might need thousands of cores to justify the resulting overhead.

    .. [1] https://github.com/snaury/coroactors/blob/a599cc061d754eefea...
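
The two designs contrasted in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the coroactors implementation: the names `LockedQueue`, `Mailbox`, `unlocked()` and `send` are hypothetical, the nodes are heap-allocated rather than intrusive, and the "unlocked" state is encoded as a sentinel pointer in the last node's `next` field. What it tries to show is the edge-triggered idea: `push` performs two atomic exchanges, and its return value tells the caller whether that push locked a previously idle mailbox.

```cpp
#include <atomic>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// Baseline: a std::deque guarded by a mutex. Every push locks and unlocks,
// and any allocation done by push_back happens while the lock is held.
template <class T>
class LockedQueue {
public:
    void push(T value) {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.push_back(std::move(value));
    }
    std::optional<T> pop() {
        std::lock_guard<std::mutex> guard(mutex_);
        if (items_.empty()) return std::nullopt;
        T value = std::move(items_.front());
        items_.pop_front();
        return value;
    }
private:
    std::mutex mutex_;
    std::deque<T> items_;
};

// Sketch of an "edge-triggered" MPSC mailbox (assumption: non-intrusive,
// heap-allocated nodes). T must be default-constructible because the queue
// always keeps one dummy node.
template <class T>
class Mailbox {
    struct Node {
        std::atomic<Node*> next;
        T value;
        Node(Node* n, T v) : next(n), value(std::move(v)) {}
    };
    // Sentinel stored in the last node's `next` while the mailbox is unlocked.
    // It is only ever compared, never dereferenced.
    static Node* unlocked() { return reinterpret_cast<Node*>(std::uintptr_t{1}); }
public:
    Mailbox() : head_(new Node(unlocked(), T{})), tail_(head_) {}
    Mailbox(const Mailbox&) = delete;
    Mailbox& operator=(const Mailbox&) = delete;
    ~Mailbox() {
        for (Node* n = head_; n != nullptr && n != unlocked();) {
            Node* next = n->next.load(std::memory_order_relaxed);
            delete n;
            n = next;
        }
    }
    // Two exchanges per push. Returns true if this push found the mailbox
    // unlocked and locked it; the caller must then drain it with pop().
    bool push(T value) {
        Node* node = new Node(nullptr, std::move(value));
        Node* prev = tail_.exchange(node, std::memory_order_acq_rel);
        return prev->next.exchange(node, std::memory_order_acq_rel) == unlocked();
    }
    // Only the current lock owner may call this. An empty optional means the
    // mailbox was drained and is now unlocked; the caller must stop popping.
    std::optional<T> pop() {
        Node* next = head_->next.load(std::memory_order_acquire);
        if (next == nullptr &&
            head_->next.compare_exchange_strong(next, unlocked(),
                                                std::memory_order_acq_rel,
                                                std::memory_order_acquire)) {
            return std::nullopt;
        }
        // `next` is a real node: loaded directly or reported by the failed CAS.
        T value = std::move(next->value);
        delete head_;
        head_ = next;
        return value;
    }
private:
    Node* head_;               // dummy node, touched only by the lock owner
    std::atomic<Node*> tail_;  // last pushed node, shared by all producers
};

// Hypothetical helper: push a message and, if that push locked the mailbox,
// drain it on the calling thread. A real actor runtime would more likely
// schedule the drain on an executor instead.
template <class T, class Handler>
void send(Mailbox<T>& mailbox, T message, Handler&& handle) {
    if (!mailbox.push(std::move(message))) return;  // another owner will drain
    while (std::optional<T> m = mailbox.pop()) handle(*m);
}
```

In this sketch a producer whose `push` returns false never has to wake anyone up, because whoever holds the lock is guaranteed to see the new node before it can unlock. Only the push that returns true needs to start or schedule a consumer, which is the property the comment describes as making wakeups cheap.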

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
