Ask HN: Prometheus vs. StatsD / Telegraf

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • prometheus

    The Prometheus monitoring system and time series database.

  • I see a lot of people talk about Prometheus on here and speak about it as though it's the only metrics gathering solution. In this way, it really does seem like it has become the poster child of Hacker News and metrics gathering.

    I've used both Prometheus and the Telegraf / StatsD solutions, and for a very long time I've disliked everything about Prometheus, from its standard "bugs"[0] to the entire design philosophy of its pull model versus the push model of Telegraf and similar tools.

    What is the collective's general stance on Prometheus vs Telegraf; and why does the collective tend to end up preferring one over the other?

    [0] For example, Prometheus clients tend not to consider a counter that hasn't been incremented to exist, so if you have an error counter, the sudden existence of the error counter is how you find an error. The 'increase' is 0, though, because it went from not existing to a value of 1. Citation: https://github.com/prometheus/prometheus/issues/1673

    No, it's not technically a "bug"; it's how it's designed. But it speaks to how it's used, and the work-arounds are unsatisfactory, in my opinion.
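
To make the footnote concrete, here is a toy model of why a counter that first appears with a value of 1 shows no increase, while one exported as 0 from the start does. This is only a sketch: `simple_increase` is a hypothetical helper, not the PromQL `increase()` implementation, and it deliberately ignores extrapolation and counter resets.

```python
from typing import Optional, Sequence


def simple_increase(samples: Sequence[Optional[float]]) -> float:
    """Toy model of increase(): last sample minus first sample in the window.

    None marks a scrape where the series did not exist yet. Real Prometheus
    also extrapolates to the window edges and handles counter resets; both
    are left out here on purpose.
    """
    present = [s for s in samples if s is not None]
    if len(present) < 2:
        # A single sample gives nothing to take a difference of. Real
        # Prometheus returns no result in this case; 0.0 stands in for that.
        return 0.0
    return present[-1] - present[0]


# Counter that only starts to exist when the first error is recorded.
lazy_counter = [None, None, None, 1.0]
# Same counter, but exported with a value of 0 from process start.
initialized_counter = [0.0, 0.0, 0.0, 1.0]

print(simple_increase(lazy_counter))         # 0.0
print(simple_increase(initialized_counter))  # 1.0
```

The second series is the usual work-around: make the counter exist at 0 before the first error happens, so the first increment is visible as a change.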

  • pushgateway

    Push acceptor for ephemeral and batch jobs.

  • While I somewhat understand the Prometheus idea that pull is easier to scale than push, I've had bad luck with it.

    First of all, Prometheus doesn't even consider monitoring long-running jobs any way other than by pull (which didn't make sense to me). There is the Pushgateway [0], but the client libraries seem to consider it only for short-lived jobs where you can send the metrics at the end [1]. It seems I couldn't trivially "push" from long-lived jobs.

    Second, when using it with Django, for example, you have to be careful with how you handle the multiprocessing that uWSGI/gunicorn does, see [2]. It has bitten me at least once.

    Compare that to the push model, where I can just push metrics to statsd_exporter [3] directly and be done with it. But support for StatsD is lacking both in terms of frameworks (everyone seems to be migrating to native clients...) and functionality (you have to do labeling basically manually [4]).

    To sum up: Prometheus is really great when it works, until you go off-track (intentionally or not, see Django [2]). Then you see that it's all an undiscovered and immature landscape.

    [0] https://github.com/prometheus/pushgateway
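
For the long-running-job case in this comment, one possible approach is to push periodically over the Pushgateway's plain HTTP interface instead of going through a client library. The sketch below uses only the Python standard library. The gateway address, job name, metric name, and the helpers `build_push_request` and `push_forever` are all assumptions made up for illustration; note also that the Pushgateway project itself positions the gateway mainly for batch jobs, so this may not be a recommended setup.

```python
import threading
import urllib.parse
import urllib.request


def build_push_request(gateway: str, job: str, metrics: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT request carrying metrics in text format."""
    # One "name value" pair per line; the body must end with a newline.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    return urllib.request.Request(
        url=f"{gateway}/metrics/job/{urllib.parse.quote(job, safe='')}",
        data=body.encode("utf-8"),
        method="PUT",  # PUT replaces every metric previously pushed for this job
    )


def push_forever(gateway, job, collect, interval_s, stop: threading.Event):
    """Push the result of collect() every interval_s seconds until stop is set."""
    while not stop.is_set():
        try:
            request = build_push_request(gateway, job, collect())
            urllib.request.urlopen(request, timeout=5)
        except OSError as exc:  # URLError and socket errors are OSError subclasses
            print(f"push failed: {exc}")
        stop.wait(interval_s)


# Hypothetical values; nothing is sent over the network here.
req = build_push_request(
    "http://localhost:9091", "long_running_job", {"items_processed_total": 42}
)
print(req.get_method(), req.full_url)
print(req.data.decode())
```

In a real job, `push_forever` would run in a background thread (for example `threading.Thread(target=push_forever, args=(...), daemon=True)`), with `collect` returning the current metric values.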

  • client_python

    Prometheus instrumentation library for Python applications

  • django-prometheus

    Export Django monitoring metrics for Prometheus.io

  • [2] https://github.com/korfuri/django-prometheus/blob/master/doc...

  • statsd_exporter

    StatsD to Prometheus metrics exporter
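
The "just push and be done with it" path from the comment above can be sketched as a single UDP datagram in the StatsD line format. This is a minimal sketch, not a full client: the host, the port 9125, the metric name, and the helpers `statsd_line` and `send_statsd` are assumptions for illustration, and the `|#key:value` suffix is the DogStatsD-style tag extension, which only helps if the receiving exporter is set up to parse it.

```python
import socket


def statsd_line(name, value, metric_type="c", tags=None):
    """Format one StatsD line: <name>:<value>|<type>, with optional tags."""
    line = f"{name}:{value}|{metric_type}"
    if tags:
        # DogStatsD-style tags, sorted so the output is deterministic.
        tag_str = ",".join(f"{k}:{v}" for k, v in sorted(tags.items()))
        line += f"|#{tag_str}"
    return line


def send_statsd(line, host="localhost", port=9125):
    """Fire-and-forget: UDP send, no response and no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


print(statsd_line("app.errors", 1))
print(statsd_line("app.errors", 1, tags={"view": "checkout"}))
```

Without tags, the metric name itself has to carry every dimension (for example `app.checkout.errors`), which is the manual labeling work the comment complains about.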
