-
alertmanager-status
A small app to let an external monitoring service know whether or not your Alertmanager instance is working
I dunno, I don't really mind self-hosting monitoring infrastructure. I basically pay for a website uptime checker to check that Alertmanager is working. If Alertmanager is down, obviously you have to manually check to see what else is down, but it doesn't fail open.
I wrote a little glue to make this straightforward for anyone else who uses Prometheus/Alertmanager: https://github.com/jrockway/alertmanager-status
This ensures that the website check covers the health of the whole alerting pipeline: Prometheus has an always-firing alert, Alertmanager is set to send that alert to alertmanager-status, and alertmanager-status starts failing its external health check if it isn't seeing that alert firing at the configured interval. If any one of Prometheus, Alertmanager, or alertmanager-status fails, then your website health check fails.