Ask HN: How do you monitor your systemd services?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • Healthchecks

    Open-source cron job and background task monitoring service, written in Python & Django

  • If you are OK with a SaaS and you are only monitoring scheduled jobs, there are a number of monitoring tools where you report each job completion (with an HTTP request), and a missing ping (after a grace period) means that the job failed.

    I think https://deadmanssnitch.com/ may have been the original service for this.

    https://healthchecks.io/ has a fairly generous free tier that I use now.

    There are others that do the same thing: Sentry, Uptime Robot, ...
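
A minimal sketch of the ping pattern described above, assuming a Healthchecks-style check URL where a plain request signals success and a `/fail` suffix signals failure. The URL, the function names, and the job command are placeholders for illustration, not part of any of the services mentioned.

```python
import subprocess
import urllib.request

# Hypothetical check URL; a real one comes from the monitoring service's dashboard.
PING_URL = "https://hc-ping.com/your-uuid-here"


def ping_url(base: str, exit_code: int) -> str:
    """Return the base URL on success, or base + '/fail' when the job failed."""
    return base if exit_code == 0 else base.rstrip("/") + "/fail"


def http_get(url: str) -> None:
    """Send the ping; the timeout keeps a network outage from hanging the job."""
    urllib.request.urlopen(url, timeout=10).close()


def run_and_ping(command: list[str], base: str = PING_URL, send=http_get) -> int:
    """Run the job, report the result, and return the job's exit code."""
    exit_code = subprocess.run(command).returncode
    try:
        send(ping_url(base, exit_code))
    except OSError:
        # A failed ping must not mask the job's own result; the service
        # will flag the missing ping after its grace period anyway.
        pass
    return exit_code
```

Calling `run_and_ping(["/usr/local/bin/backup.sh"])` (a hypothetical script path) from a cron job or timer would then cover both the "job failed" and the "job never ran" cases.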

  • ntfy

    Send push notifications to your phone or desktop using PUT/POST

  • Uptime-Kuma [1] with ntfy [2]. Most of my services expose HTTP, so I just have Uptime-Kuma monitor that. But if you have something that is not exposed to the public, you can still use a "push" type monitor and, in a cron job on your server(s), send a heartbeat to it when everything is working.

    [1] https://github.com/louislam/uptime-kuma

    [2] https://ntfy.sh/
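
One way the cron-driven heartbeat could look, as a sketch only: it assumes an Uptime Kuma "Push" monitor whose URL has the form `/api/push/<token>` and accepts `status` and `msg` query parameters. The host, token, and function names are placeholders. The heartbeat is sent only while the unit is active, so the monitor goes down (and notifies through whatever channel is configured, such as ntfy) once heartbeats stop.

```python
import subprocess
import urllib.parse
import urllib.request

# Hypothetical push URL; the real one is shown when creating a "Push" monitor.
PUSH_URL = "https://kuma.example.com/api/push/your-token"


def unit_is_active(unit: str) -> bool:
    """`systemctl is-active --quiet` exits 0 only when the unit is active."""
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit])
    return result.returncode == 0


def heartbeat_url(push_url: str, msg: str) -> str:
    """Build the push URL with an 'up' status and a short message."""
    query = urllib.parse.urlencode({"status": "up", "msg": msg})
    return f"{push_url}?{query}"


def http_get(url: str) -> None:
    urllib.request.urlopen(url, timeout=10).close()


def heartbeat_if_active(unit: str, push_url: str = PUSH_URL,
                        is_active=unit_is_active, send=http_get) -> bool:
    """Send a heartbeat only when the unit is healthy; return whether one was sent."""
    if not is_active(unit):
        return False  # stay silent so the push monitor times out and alerts
    send(heartbeat_url(push_url, f"{unit} active"))
    return True
```

A crontab entry would run a small script that calls, for example, `heartbeat_if_active("nginx.service")` once a minute, with the monitor's heartbeat interval set slightly longer than that.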

  • rkvdns_examples

    Examples for RKVDNS under a more permissive license.

  • In general this evolves to a SIEM-like solution in IT or gets added to the tag menagerie in OT.

    If you're focused on "notifications are bad", note that notifications are push, and pull solutions are possible. For example, tail logs (or journalctl) and post significant events to Redis (https://github.com/m3047/rkvdns_examples/tree/main/totalizer...).
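
A rough sketch of the "tail the journal, count significant events" idea, not taken from the linked repository. It assumes `journalctl -f -o json` output, treats syslog priorities 0 to 3 (err and worse) as significant, and uses a made-up key scheme. The `store` argument is anything with an `incr(key)` method, which a redis-py client provides.

```python
import json
import subprocess
from typing import Iterable, Iterator, Optional

MAX_PRIORITY = 3  # syslog levels 0-3: emerg, alert, crit, err


def significant_key(line: str, max_priority: int = MAX_PRIORITY) -> Optional[str]:
    """Map one JSON journal line to a counter key, or None to skip it."""
    try:
        entry = json.loads(line)
        if not isinstance(entry, dict):
            return None
        priority = int(entry.get("PRIORITY", 6))  # default to "info"
    except (ValueError, TypeError):
        return None  # malformed line or unexpected field type
    if priority > max_priority:
        return None
    unit = entry.get("_SYSTEMD_UNIT", "unknown")
    return f"journal:errors:{unit}"  # hypothetical key scheme


def follow_journal() -> Iterator[str]:
    """Yield new journal entries as JSON lines (-n 0 skips the backlog)."""
    proc = subprocess.Popen(
        ["journalctl", "-f", "-n", "0", "-o", "json"],
        stdout=subprocess.PIPE, text=True,
    )
    assert proc.stdout is not None
    yield from proc.stdout


def record(lines: Iterable[str], store) -> int:
    """Increment one counter per significant entry; return how many were counted."""
    count = 0
    for line in lines:
        key = significant_key(line)
        if key is not None:
            store.incr(key)
            count += 1
    return count
```

With the third-party `redis` package installed, `record(follow_journal(), redis.Redis())` would keep per-unit error counters that a dashboard or script can pull whenever it wants, instead of being pushed a notification for every event.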

  • collectd-systemd

    collectd plugin to monitor systemd services

  • This combo does the job for me: grafana + riemann + influxdb and collectd as the main agent. collectd bundles many plugins so you can watch logs, monitor running processes or have something custom [1]. This setup is very light to start with and can scale well (up until you hit influxdb limits :D).

    [1] https://github.com/mbachry/collectd-systemd

  • systemd-utils

    Random systemd utilities

  • I use the `OnFailure` property to trigger a service that emails me when a service fails, for example backups, which run as system timers + services.

    I also use `failure-monitor`, which is a Python service that monitors `journald`.

    Files on GitHub for those interested:

    https://github.com/kylemanna/systemd-utils
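
As an illustration of the `OnFailure` approach (a sketch, not the code from the linked repository): a monitored unit could declare something like `OnFailure=notify-email@%n.service`, and that hypothetical template unit would run a script such as the one below with the failed unit's name as its argument. The addresses, the unit name, and a local mail server on `localhost` are all assumptions.

```python
import smtplib
import subprocess
from email.message import EmailMessage

# Hypothetical addresses; a local MTA listening on localhost:25 is assumed.
SENDER = "systemd@localhost"
RECIPIENT = "admin@example.com"


def unit_status(unit: str) -> str:
    """Capture `systemctl status` output for the mail body.

    systemctl exits non-zero for failed units, so the return code is ignored.
    """
    result = subprocess.run(
        ["systemctl", "status", "--no-pager", "--full", unit],
        capture_output=True, text=True,
    )
    return result.stdout or result.stderr


def build_message(unit: str, status: str) -> EmailMessage:
    """Compose the failure report for one unit."""
    msg = EmailMessage()
    msg["Subject"] = f"[systemd] {unit} failed"
    msg["From"] = SENDER
    msg["To"] = RECIPIENT
    msg.set_content(status)
    return msg


def notify(unit: str) -> None:
    """Send the report through the local mail server."""
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(build_message(unit, unit_status(unit)))
```

The handler unit's `ExecStart=` line would invoke a wrapper that calls `notify()` with the instance name (`%i` in a template unit), so one handler can serve every unit that references it.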

  • uptime-kuma

    A fancy self-hosted monitoring tool

  • Netdata

    The open-source observability platform everyone needs

  • > So I turned to Netdata. A one-liner on each server and we had a super sexy and fast dashboard for each server. No bird's-eye view, but fine. I then spent maybe 3-4 days trying to figure out how to get alerting to work (just email, but fine) and get temperature readings (or something like that).

    I work at Netdata. Just wanted to mention that as of the last release, a parent node shows all of its children in the agent dashboard, so if you were doing this again today, a parent Netdata might have given you the bird's-eye view as a starting point: https://github.com/netdata/netdata/releases/tag/v1.41.0
