This plugin guarantees exactly-once delivery between Kafka topics and ClickHouse tables. That guarantee is critical because Kafka Connect tasks are only aware of the latest topic offset acknowledged by the consumer, and a consumer can fail to acknowledge an offset it has already processed, for example due to a network error or an exception. Exactly-once inserts prevent usage from being dropped or inserted twice, either of which would lead to incorrect billing.
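To make the redelivery problem concrete, here is a toy sketch in Python. It is not the plugin's actual mechanism; `IdempotentSink` and its fields are hypothetical names, and the sketch assumes the sink can record the last inserted offset together with the rows. Under that assumption, a batch that Kafka redelivers after a missed acknowledgement is inserted only once.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentSink:
    """Toy sink that remembers the last offset it inserted per (topic, partition)."""

    rows: list = field(default_factory=list)
    committed: dict = field(default_factory=dict)  # (topic, partition) -> last inserted offset

    def insert_batch(self, topic, partition, records):
        """Insert (offset, value) pairs, skipping offsets that were already inserted.

        Returns the number of rows actually written.
        """
        last = self.committed.get((topic, partition), -1)
        fresh = [(offset, value) for offset, value in records if offset > last]
        if not fresh:
            return 0
        # Assumption: in a real system these two updates would have to be atomic.
        self.rows.extend(value for _, value in fresh)
        self.committed[(topic, partition)] = max(offset for offset, _ in fresh)
        return len(fresh)
```

If the first call below succeeds but its acknowledgement is lost, the second call overlaps with it, and only the new offset is written:

```python
sink = IdempotentSink()
sink.insert_batch("events", 0, [(0, "a"), (1, "b"), (2, "c")])
sink.insert_batch("events", 0, [(1, "b"), (2, "c"), (3, "d")])  # redelivery of 1 and 2
```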
In OpenMeter, we pre-aggregate usage events into one-minute tumbling windows to reduce the number of rows we need to scan at query time. To do this in ClickHouse, we use the AggregatingMergeTree table engine, which enables incremental data aggregation when combined with a materialized view. In ClickHouse, materialized views are trigger-based: they update when new records are inserted into the source table. Consequently, the corresponding materialized views are updated whenever Kafka Connect transfers a batch of events to ClickHouse. This also means an insert can fail when the view cannot process a record at trigger time. We send failed events to a Dead Letter Queue topic for later processing.
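The idea of one-minute tumbling windows with incremental aggregation can be sketched in plain Python. This is only an illustration of the concept, not ClickHouse's implementation. The event shape (`subject`, `time`, `value`) and the function names are assumptions made for the example, and unparsable records are simply set aside in a list, loosely mirroring the Dead Letter Queue.

```python
from collections import defaultdict
from datetime import datetime, timezone


def window_start(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its one-minute tumbling window."""
    return ts.replace(second=0, microsecond=0)


def aggregate_batch(events):
    """Sum values per (subject, window); return (totals, dead_letters)."""
    totals = defaultdict(float)
    dead_letters = []
    for event in events:
        try:
            # Parse everything before touching totals so a bad record leaves no trace.
            value = float(event["value"])
            key = (event["subject"], window_start(event["time"]))
        except (KeyError, TypeError, ValueError, AttributeError):
            dead_letters.append(event)
            continue
        totals[key] += value
    return dict(totals), dead_letters


def merge_states(a, b):
    """Combine two partial aggregates, as happens when a new batch arrives."""
    merged = defaultdict(float, a)
    for key, value in b.items():
        merged[key] += value
    return dict(merged)
```

Each inserted batch yields a partial aggregate, and `merge_states` folds it into what is already stored, so a query reads one row per subject and minute instead of every raw event.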
To help ClickHouse with hot topics, we are considering an extra streaming aggregation step for high-volume producers, this time with a more horizontally scalable stream processor such as Arroyo. This would reduce ClickHouse insert batch sizes. Based on our tests, ClickHouse works best when batch sizes are 50-100k and inserts happen less often than once per second.
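As a rough illustration of the batching guidance above, the following Python sketch buffers rows and flushes them either when a row cap is reached or when a minimum interval has elapsed. `InsertBatcher` and its parameters are hypothetical. The defaults only echo the numbers quoted above and are not OpenMeter or Kafka Connect settings.

```python
import time


class InsertBatcher:
    """Buffer rows and hand them to `flush` in large, infrequent batches."""

    def __init__(self, flush, max_rows=100_000, min_interval_s=1.0, clock=time.monotonic):
        self._flush = flush                  # callable that receives a list of rows
        self._max_rows = max_rows            # hard cap on buffered rows
        self._min_interval_s = min_interval_s
        self._clock = clock                  # injectable for testing
        self._buffer = []
        self._last_flush = clock()

    def add(self, row):
        self._buffer.append(row)
        elapsed = self._clock() - self._last_flush
        if len(self._buffer) >= self._max_rows or elapsed >= self._min_interval_s:
            self.flush_now()

    def flush_now(self):
        if self._buffer:
            self._flush(self._buffer)
            self._buffer = []
        self._last_flush = self._clock()
```

Under a hot topic, the row cap can trigger more than once per second. Aggregating upstream addresses exactly that case, because fewer rows reach the buffer in the first place.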
To see it in action, check out our open-source repo: https://github.com/openmeterio/openmeter