Monitoring Microservices with Prometheus and Grafana

thanos

66 12,585 9.6 Go

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.

I really don't get why "scrape the Prometheus endpoint" is the go-to now; a push model seems to be way less of a PITA to manage at scale.
> If you get serious about Prometheus, eventually you will want longer data retention, check out https://thanos.io/
Any idea how it compares with https://victoriametrics.com/?
We're slowly looking for a replacement for InfluxDB (as 1.8 is essentially on life support), so the low disk footprint is a pretty big advantage here.

skywalking

23 23,269 9.5 Java

APM, Application Performance Monitoring System

Personally, I've also used Apache SkyWalking for a decent out-of-the-box experience: https://skywalking.apache.org/
I've also heard good things about Sentry, though if you need to self-host it, there's a bit of complexity to deal with: https://sentry.io/welcome/

VictoriaMetrics

97 10,868 9.9 Go

VictoriaMetrics: fast, cost-effective monitoring solution and time series database

compliance

2 120 0.0 Go

A set of tests to check compliance with various Prometheus interfaces

Scraping is typically just how people get started and works well for small and medium setups; it gets you a long way before you need to consider anything else.
Prometheus remote_write is what people graduate to. It gets you the rest of the way, and you are correct that it's less of a PITA at scale.
If you're looking for retention, you have plenty of choices: Cortex (CNCF), Mimir (most Cortex work moved here), Thanos, VictoriaMetrics, TimeScale, Chronosphere, and many others.
From a distance they all seek to do a similar thing: they store metrics (likely from Prometheus), provide retention, and differ in how you query them (if you want SQL you got it, if you want non-standard functions you got it, if your reads are more important than your writes you got it, if you need a billion active series you got it, etc.).
If what you want is "Prometheus but bigger", the Prometheus project provides a compliance suite that you can run to help you evaluate your options: https://github.com/prometheus/compliance
I work for Grafana Labs, and we have maintainers working for us who have touched Prometheus, Thanos, Cortex and Mimir. Mimir is currently the largest investment we have (https://github.com/grafana/mimir) and it is 100% compliant with Prometheus, though that is about to be temporarily untrue: Native Histograms are landing in Prometheus soon (https://github.com/prometheus/prometheus/milestone/10) and we'll need to add perfectly compliant support to Mimir to get back to being compliant.

mimir

36 3,719 9.9 Go

Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.

prometheus

381 52,748 9.9 Go

The Prometheus monitoring system and time series database.

self-hosted

29 7,284 9.1 Shell

Sentry, feature-complete and packaged up for low-volume deployments and proofs-of-concept

> E.g. does not allow you to define custom metrics, for example to monitor resource utilization
I think that might not quite be the case in the latest versions: https://docs.sentry.io/product/performance/metrics/#custom-p...
> In addition to the automatic performance metrics described above, Sentry supports setting custom performance metrics on transactions. Custom performance metrics allow you to define metrics (beyond the ones mentioned above) that are important to your application and send them to Sentry.
> For example, you might want to set a custom metric to track:
> - Total memory usage during a transaction
> - The amount of time being queried
> - Number of times a user performed an action during a transaction
> You define and configure custom metrics in the SDK.
Though for my use cases, Sentry's technical complexity is more of a stumbling block, were I to self-host it: https://github.com/getsentry/self-hosted/blob/master/docker-...

coroot

33 3,771 9.2 Go

Coroot is an open-source APM & Observability tool, a DataDog and NewRelic alternative 📊, 🖥️, 👉. Powered by eBPF for rapid insights into system performance. Monitor, analyze, and optimize your infrastructure effortlessly for peak reliability at any scale.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

Tools for frontend monitoring with Prometheus
6 projects | dev.to | 9 Apr 2024

Show HN: OneUptime – open-source Datadog Alternative
7 projects | news.ycombinator.com | 2 Apr 2024

4 facets of API monitoring you should implement
3 projects | dev.to | 2 Mar 2024

Root Cause Chronicles: Quivering Queue
5 projects | dev.to | 16 Jan 2024

Start your server remotely
2 projects | /r/selfhosted | 11 Dec 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Prometheus Metrics Monitoring Observability HacktoberFest
Post date: 9 Dec 2022
