Like many applications, our infrastructure relies on queues to decouple various components. We use AWS Kinesis as a data stream, consumed by Broadway consumers, for some critical parts of our infrastructure. We have found that our Broadway consumers for AWS Kinesis sometimes crash in ways they do not gracefully recover from. For example, each Kinesis shard has its own supervision tree managed by the Kinesis Broadway consumer. We found that if a shard consumer hit a crash-inducing error, the shard would not restart and the crash would not cascade up to the Broadway producer. While we have worked on contributing to this consumer library, we decided it was important to have runtime control over stopping and starting consumers so that we could respond to such failures ourselves.
We knew that we needed a way to start and stop these queue consumers at runtime. Although we could have reached for other configuration management tools like HashiCorp Consul or AWS AppConfig, we already use LaunchDarkly at Knock to control the runtime behavior of our frontend and backend applications. LaunchDarkly feature flags seemed like a great way to control this starting and stopping process without adding new dependencies or complexity to our stack.
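To make the idea concrete, here is a minimal, language-agnostic sketch in Python of a consumer whose running state follows a boolean flag. It is not our production code: `FlagControlledConsumer`, `flag_enabled`, `start`, and `stop` are all hypothetical names, and the flag lookup is a plain callable standing in for whatever feature flag client you use.

```python
from typing import Callable


class FlagControlledConsumer:
    """Starts or stops a consumer so that it matches a boolean feature flag."""

    def __init__(
        self,
        flag_enabled: Callable[[], bool],  # hypothetical flag lookup
        start: Callable[[], None],         # hypothetical: starts the consumer
        stop: Callable[[], None],          # hypothetical: stops the consumer
    ) -> None:
        self._flag_enabled = flag_enabled
        self._start = start
        self._stop = stop
        self.running = False

    def reconcile(self) -> bool:
        """Compare the desired state (the flag) with the actual state and
        act only on the difference. Returns whether the consumer is running."""
        desired = self._flag_enabled()
        if desired and not self.running:
            self._start()
            self.running = True
        elif not desired and self.running:
            self._stop()
            self.running = False
        return self.running
```

Calling `reconcile()` on a timer, or whenever the flag changes, is enough to keep the consumer in line with the flag. Turning the flag off and then on again forces a clean stop and start, which is the manual recovery path we wanted for a consumer that has stopped making progress.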
At Knock, we use LaunchDarkly to power feature flags. Feature flags are powerful because they enable us to control, at runtime, the behavior of different parts of the system. Most of the time, this means controlling access to features on our client-side application, or controlling the rollout of new features across our application. However, we recently adopted a circuit breaker pattern built around feature flags. This pattern helps our services be more reliable when things fail.