thanos
Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
I understand that Thanos (https://github.com/thanos-io/thanos) was built to improve Prometheus's scalability and availability, but I would love to hear from others who have tried various approaches to solving this.
I would also recommend Grafana Agent or Prometheus Agent in this case, since you probably don't need the Prometheus UI in each cluster, nor the alerting features built into Prometheus (Mimir will handle rule evaluation for you). Grafana Agent also has an Operator mode if you want to use the ServiceMonitor and PodMonitor custom resources.
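To make the agent setup a bit more concrete, here is a rough sketch of what a per-cluster config could look like: scrape locally, attach a cluster label, and remote-write everything to Mimir, with no rule files or Alertmanager settings on the agent side. The Mimir hostname, the `node` job, the target address, and the helper names `agent_config` and `render` are all made up for illustration; `/api/v1/push` is Mimir's remote-write path. The output is JSON, which YAML parsers generally accept, so it should be usable as a starting point for a config file, but treat it as an untested sketch rather than a reference setup.

```python
import json


def agent_config(cluster: str, mimir_url: str) -> dict:
    """Build a minimal Prometheus-style config for an agent-only deployment.

    Assumptions (hypothetical, adjust to your environment):
    - one static scrape target on localhost:9100
    - Mimir reachable at `mimir_url`, accepting remote write on /api/v1/push
    """
    return {
        "global": {
            "scrape_interval": "30s",
            # Lets you tell clusters apart once everything lands in Mimir.
            "external_labels": {"cluster": cluster},
        },
        "scrape_configs": [
            {
                "job_name": "node",
                "static_configs": [{"targets": ["localhost:9100"]}],
            }
        ],
        # No rule or alerting sections here: rules are evaluated centrally.
        "remote_write": [{"url": mimir_url.rstrip("/") + "/api/v1/push"}],
    }


def render(config: dict) -> str:
    """Serialize the config as JSON so it can be written to a config file."""
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render(agent_config("cluster-a", "http://mimir.example.internal")))
```

If you go the Prometheus route, agent mode is switched on with a command-line flag (`--enable-feature=agent` on 2.x releases; check the docs for the version you run), and the same scrape and remote-write sections apply.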