Queues Don't Fix Overload

Our great sponsors

WorkOS - The modern identity platform for B2B SaaS

InfluxDB - Power Real-Time Data Analytics at Scale

SaaSHub - Software Alternatives and Reviews

Our great sponsors

aperture

28 586 9.8 Go

Rate limiting, caching, and request prioritization for modern workloads

I agree that queues can problem especially when misconfigured. But some amount of queuing is necessary, to absorb short spikes in demand vs capacity. Also, queues can be helpful to re-order requests based on criticality which won't be possible with zero queue size - in which case we have to immediately drop a request or admit it without considering it's priority.
I think it is beneficial to re-think how we tune queues. Instead of setting a queue size, we should be tuning the max permissible latency in the queue which is what a request timeout actually is. That way, you stay within the acceptable response time SLA while keeping only the serve-able requests in the queue.
Aperture, an open-source load management platform took this approach. Each request specifies a timeout for which it is willing to stay in the queue. And weighted fair queuing scheduler then allocates the capacity (a request quota or max number of in-flight request) across requests based on the priority and tokens (request heaviness) of each request.
Read more about the WFQ scheduler in Aperture: https://docs.fluxninja.com/concepts/scheduler
Link to Aperture's GitHub: https://github.com/fluxninja/aperture
Would love to hear your thoughts on our approach!

WorkOS

workos.com sponsored

The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project