Our great sponsors
-
WorkOS
The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
Hi talawahtech. Thanks for the exhaustive article.
I took a short look at the benchmark setup (https://github.com/talawahtech/seastar/blob/http-performance...), and wonder if some simplifications there lead to overinflated performance numbers. The server here executes a single read() on the connection - and as soon as it receives any data it sends back headers. A real world HTTP server needs to read data until all header and body data is consumed before responding.
Now given the benchmark probably sends tiny requests, the server might get everything in a single buffer. However every time it does not, the server will send back two responses to the server - and at that time the client will already have a response for the follow-up request before actually sending it - which overinflates numbers. Might be interesting to re-test with a proper HTTP implementation (at least read until the last 4 bytes received are \r\n\r\n, and assume the benchmark client will never send a body).
Yea, it is definitely a fake HTTP server which I acknowledge in the article [1]. However based on the size of the requests, and my observation of the number of packets per second being symmetrical at the network interface level, I didn't have a concern about doubled responses.
Skipping the parsing of the HTTP requests definitely gives a performance boost, but for this comparison both sides got the same boost, so I didn't mind being less strict. Seastar's HTTP parser was being finicky, so I chose the easy route and just removed it from the equation.
For reference though, in my previous post[2] libreactor was able to hit 1.2M req/s while fully parsing the HTTP requests using picohttpparser[3]. But that is still a very simple and highly optimized implementation. From what I recall when I played with disabling HTTP parsing in libreactor I got a performance boost of about 5%.
1. https://talawah.io/blog/linux-kernel-vs-dpdk-http-performanc...
2. https://talawah.io/blog/extreme-http-performance-tuning-one-...
3. https://github.com/h2o/picohttpparser
IOPOLL is for disks. I think that with very recent kernels you will get busy polling of the socket with just SQPOLL. See here: https://github.com/axboe/liburing/issues/345#issuecomment-10...