zap
libuv
zap
-
Resource efficient Thread Pools (with Zig)
It is a well-written post on an effective implementation of thread pools — an active area of Rust development — including benchmarks that suggest it may be faster than some of our most popular crates in the area.
-
Lock-free, allocation-free, efficient thread pool
This actually can be at the level of a missed optimization. A single lock-protected run queue shared amongst all the threads scales even worse than the tokio version. Sharding the run queues and changing the notification algorithm, even while keeping locks on the sharded queues, improves throughput drastically.
Tokio is an async runtime, but I don't see why being an async runtime should make it worse from a throughput perspective for a thread pool. I actually started on a Rust version [0] to test out this theory of whether async-rust was the culprit, but realized that I was being nerd-sniped [1] at this point and I should continue my Zig work instead. If you're still interested, I'm open to receiving PRs and questions on that if you want to see that in action.
It's still correct to benchmark and compare tokio here, given that the scheduler I was designing was meant to be used with async tasks: a bunch of concurrent and small-executing work units. I mention this in the second paragraph of "Why Build Your Own?".
The thread pool in the post is meant to be used to distribute I/O bound work. A friend of mine hooked up cross-platform I/O abstractions to the thread pool [2] and benchmarked it against tokio, finding greater throughput and slightly worse tail latency under a local load [3]. The thread pool serves its purpose, and the quicksort benchmark is there to show how schedulers behave under relatively concurrent workloads. I could've used a benchmark with smaller tasks than the cpu-bound partition()/insertion_sort(), but this worked as a common example.
I've already mentioned why rayon isn't a good comparison: 1. It doesn't support async root concurrency. 2. scoped() waits for tasks to complete by either blocking the OS thread or using similar inline-scheduler-loop optimizations. This risks stack overflow and isn't available as a use case in other async runtimes due to primarily being a fork-join optimization.
[0]: https://github.com/kprotty/zap/blob/blog-rust/src/thread_poo...
[2]: https://github.com/lithdew/hyperia
[3]: https://gist.github.com/kprotty/5a41e9612657de00788478a7dde4...
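The sharding idea described in this comment can be sketched roughly as follows. This is a minimal illustration, not the zap or tokio implementation: `ShardedQueues` and `run_sharded` are hypothetical names, every queue still sits behind a lock, and the notification algorithm the comment mentions is left out because all tasks are queued before the workers start.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

type Task = Box<dyn FnOnce() + Send + 'static>;

/// One locked run queue per worker instead of a single queue shared by all.
/// (Hypothetical type, for illustration only.)
struct ShardedQueues {
    shards: Vec<Mutex<VecDeque<Task>>>,
}

impl ShardedQueues {
    fn new(n: usize) -> Self {
        ShardedQueues {
            shards: (0..n).map(|_| Mutex::new(VecDeque::new())).collect(),
        }
    }

    /// Push a task onto the shard selected by `shard` (wrapped to the shard count).
    fn push(&self, shard: usize, task: Task) {
        let idx = shard % self.shards.len();
        self.shards[idx].lock().unwrap().push_back(task);
    }

    /// Try our own shard first, then scan the others (a simple form of stealing).
    fn pop(&self, me: usize) -> Option<Task> {
        let n = self.shards.len();
        for offset in 0..n {
            let idx = (me + offset) % n;
            // The lock guard is released at the end of this statement.
            let task = self.shards[idx].lock().unwrap().pop_front();
            if task.is_some() {
                return task;
            }
        }
        None
    }
}

/// Queue `tasks` counter increments round-robin, run them on `workers`
/// threads, and return how many tasks completed.
fn run_sharded(workers: usize, tasks: usize) -> usize {
    let queues = Arc::new(ShardedQueues::new(workers));
    let done = Arc::new(AtomicUsize::new(0));

    for i in 0..tasks {
        let done = Arc::clone(&done);
        queues.push(i, Box::new(move || {
            done.fetch_add(1, Ordering::Relaxed);
        }));
    }

    let handles: Vec<_> = (0..workers)
        .map(|me| {
            let queues = Arc::clone(&queues);
            thread::spawn(move || {
                // Workers exit once every shard is empty; a real pool would
                // park and wait for a notification instead.
                while let Some(task) = queues.pop(me) {
                    task();
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    done.load(Ordering::Relaxed)
}

fn main() {
    println!("completed {} tasks", run_sharded(4, 1000));
}
```

With one queue per worker, threads mostly contend on their own lock and only touch the others when their shard runs dry, which is the effect the comment attributes to sharding.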
This is not the case: it measures the time between two points in the code at runtime and prints that [1].
[1] https://github.com/kprotty/zap/blob/blog/benchmarks/rust/ray...
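Measuring "the time between two points in the code" in Rust is usually done with `std::time::Instant`. The sketch below shows that pattern only; `time_it` is a hypothetical helper and the sort is a stand-in workload, not the code from the linked benchmark.

```rust
use std::time::{Duration, Instant};

/// Runs `work` once and returns its result together with the elapsed
/// wall-clock time. (Hypothetical helper, for illustration only.)
fn time_it<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now(); // first point in the code
    let result = work();
    (result, start.elapsed()) // second point: time since `start`
}

fn main() {
    let (sorted, elapsed) = time_it(|| {
        // Stand-in workload: sort a reversed vector.
        let mut v: Vec<u32> = (0..100_000).rev().collect();
        v.sort_unstable();
        v
    });
    println!("sorted {} items in {:?}", sorted.len(), elapsed);
}
```

Because the timer starts and stops inside the program, process startup and teardown are not included in the reported figure.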
-
Question: Does Zig have a work-stealing/sharing algorithm in the M:N concurrency model?
You can implement one: https://github.com/kprotty/zap/blob/lifo/src/runtime/Pool.zig
-
Tokio-uring design proposal
BTW, if you're interested in work stealing, I'm writing my own which has a bundle of optimizations for minimal task dispatch overhead and memory efficiency. To appease some of your criteria: yes, it's currently being used in "real world production" for an http server (although not that specific version).
-
MEIO: async actors framework
This is a logical fallacy. Specifically either a "Slippery Slope" or "Either/Or". You assume that fast channel implementations must have originated in or been ported to Rust and also be popular. Things like Stakker and zap are anecdotal examples of where this already isn't the case. Even so, there exist fast synchronized channels both inside and outside of async Rust. That they aren't popular or aren't tuned to efficient runtimes doesn't mean they don't exist, which was my argument.
libuv
- Epoll: The API that powers the modern internet (2022)
-
APIs in Go with Huma 2.0
I wound up on a different team with pre-existing Python code so temporarily shelved my use of Go for a bit, and we used Sanic (an async Python framework built on top of the excellent uvloop & libuv that also powers Node.js) to build some APIs for live channel management & operations. We hand-wrote our OpenAPI and used it to generate documentation and a CLI, which was an improvement over what was there (or not) before. Other teams used the OpenAPI document to generate SDKs to interact with our service.
- Python Is Easy. Go Is Simple. Simple = Easy
-
Notes: Advanced Node.js Concepts by Stephen Grider
In the source code of the Node.js open-source project, the lib folder contains JavaScript code, mostly wrappers over C++ and function definitions. In contrast, the src folder contains C++ implementations of the functions, which pull dependencies from the V8 project, the libuv project, the zlib project, the llhttp project, and many more, all of which are placed in the deps folder.
- A Magia do Event Loop
-
What is Node.js?: A Complete Guide
Node.js is written in C, C++, and JavaScript. The core components of Node.js - the V8 engine and the libuv library - are written in C++ and C, respectively, since these languages provide low-level access to system resources, making them well-suited for building high-performance and efficient applications. JavaScript is mainly used to write the application logic.
-
Using Parallel Processing in Node.js and its Limitations
Well, the single-threaded nature ultimately leads to its biggest downfall. Node.js utilizes a synchronous event loop engineered using libuv that takes in code from the call stack and executes it.
- io_uring support for libuv – 8x increase in throughput
-
7 Tips to Build Scalable Node.js Applications
Node.js executes JavaScript code in a single-threaded model. However, Node.js can function as a multithreaded framework by utilizing the libuv C library to create hidden threads (see the event loop) which handle I/O operations and network requests asynchronously. But CPU-intensive tasks such as image or video processing can block the event loop and prevent subsequent requests from executing, increasing the application's latency.
-
Use io_uring for network I/O
Hats off for posting this 2 hours after it dropped!
I've been tracking the nest of issues with anticipation! This wasn't linked to https://github.com/libuv/libuv/pull/1947 when it posted, so I didn't see it. Very glad you linked it, thanks!
What are some alternatives?
libevent - Event notification library
Boost.Asio - Asio C++ Library
libev - Full-featured high-performance event loop loosely modelled after libevent
tokio-uring - An io_uring backed runtime for Rust
uvw - Header-only, event based, tiny and easy to use libuv wrapper in modern C++ - now also available as a shared/static library!
C++ Actor Framework - An Open Source Implementation of the Actor Model in C++
benchmarks - Some benchmarks of different languages
asyncio - asyncio is a c++20 library to write concurrent code using the async/await syntax.
librespot - Open Source Spotify client library
Dasynq - Thread-safe cross-platform event loop library in C++
liburing
alteza - 📔 Super-flexible Static Site Generator