kube-prometheus
sloth
 | kube-prometheus | sloth
---|---|---
Mentions | 41 | 11
Stars | 6,248 | 1,947
Growth | 2.4% | -
Activity | 8.8 | 0.0
Last commit | 4 days ago | about 2 months ago
Language | Jsonnet | Go
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kube-prometheus
- Upgrading Hundreds of Kubernetes Clusters
The last one is mostly an observability stack with Prometheus, Metrics Server, and Prometheus Adapter to give excellent insight into what is happening on the cluster. You can reuse the same stack for autoscaling by repurposing all the data collected for monitoring.
- Unfork with ArgoCD
kustomize Kube Prometheus
- Smart-Cash project - Adding monitoring to EKS using Prometheus operator
On the other hand, the kube-prometheus project provides documentation and scripts to operate end-to-end Kubernetes cluster monitoring using the Prometheus Operator, which makes monitoring the Kubernetes cluster easier.
- Scaling Temporal: The Basics
For our load testing we’ve deployed Temporal on Kubernetes, and we’re using MySQL for the persistence backend. The MySQL instance has 4 CPU cores and 32GB RAM, and each Temporal service (Frontend, History, Matching, and Worker) has 2 pods, with requests for 1 CPU core and 1GB RAM as a starting point. We’re not setting CPU limits for our pods—see our upcoming Temporal on Kubernetes post for more details on why. For monitoring we’ll use Prometheus and Grafana, installed via the kube-prometheus stack, giving us some useful Kubernetes metrics.
- How do you set up Grafana alert for your cluster? Which mixins library?
The 2 most common approaches I have seen are kube-prometheus-stack and kube-prometheus.
- Issues with "victoria-metrics-k8s-stack", monitoring k8s targets
I'm missing a lot of the Grafana dashboards that are provisioned during the deployment. I'm not sure why, as it has worked before, and I wanted to add them after install... I believe they come from different ConfigMaps, like the ones in kube-prometheus, but I was wondering if there's a way to force provisioning them all again at once (multiple k8s, node_exporter, vm, etc.)?
- what metrics are most important for checking kubernetes cluster health?
Check out the kube-prometheus project (https://github.com/prometheus-operator/kube-prometheus). It's a bit heavy, but the included recording rules and dashboards give you a great start at understanding your cluster.
- Easy Prometheus/Grafana Setup With Dashboards Repo
The actual link to the Prometheus/Grafana bundle: https://github.com/prometheus-operator/kube-prometheus
- How To Configure Kube-Prometheus
Here’s a list of what’s installed: https://github.com/prometheus-operator/kube-prometheus/tree/main/manifests
- How to install a user managed Prometheus and Grafana instance on OpenShift 4?
sloth
- SLOscribe: embed SLO/SLI into GO source code
It's a CLI that lets developers embed SLO annotations into Go code as comments and generate Prometheus alert groups when paired with Sloth (https://github.com/slok/sloth).
- help setting SLIs/SLOs
SLOTH: https://github.com/slok/sloth
- Observability Mythbusters: Yes, Observability-Landscape-as-Code is a Thing
Note: Although it’s outside of the scope of this post to dig deep into this topic, in case you’re curious, you can check out what an OpenSLO YAML definition looks like here.
- Pyrra v0.3.0 released
- What you use for observability?
The actual hard part is standardizing all teams on SLI/SLO-based thinking. For that we're looking at tools like Sloth.
- How do you measure the reliability of a Kubernetes platform?
- Calculating Remaining Error Budget
Have a look at sloth (https://github.com/slok/sloth) which will help you generate SLOs and error budgets given a PromQL query. This might be easier than trying to calculate it yourself. Plus, it's "metrics as code" and OpenSLO spec compliant.
- openSLO
If you are in k8s and use Prometheus you could take a look at sloth: https://github.com/slok/sloth which can either generate the rules/alerts for you, or can run as an operator and allows you to write SLOs as k8s kinds.
- SLI/Error Budget Calculators and management
Check out https://github.com/slok/sloth
- SLO calculation
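Several of the posts above ask how to calculate the remaining error budget for an SLO. The arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `remaining_error_budget` and its inputs (event counts you have already pulled from your metrics backend) are hypothetical, and it is not how Sloth itself is implemented.

```python
def remaining_error_budget(objective_pct: float, total_events: int, error_events: int) -> float:
    """Return the unspent fraction of the error budget for one SLO window.

    Hypothetical helper for illustration only. 1.0 means the budget is
    untouched, 0.0 means it is fully spent, and a negative value means
    the SLO has been breached.
    """
    if not 0 < objective_pct < 100:
        raise ValueError("objective_pct must be between 0 and 100 (exclusive)")
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= error_events <= total_events:
        raise ValueError("error_events must be between 0 and total_events")

    # The error budget is the share of events that is allowed to fail.
    allowed_error_ratio = (100 - objective_pct) / 100
    # The observed error ratio is what actually failed in the window.
    observed_error_ratio = error_events / total_events
    # Budget consumed is observed / allowed; the remainder is what is left.
    return 1 - observed_error_ratio / allowed_error_ratio


if __name__ == "__main__":
    # Assumed example numbers: a 99.9% objective, 1,000,000 requests,
    # 250 of them failed. Roughly three quarters of the budget remains.
    print(remaining_error_budget(99.9, 1_000_000, 250))
```

In practice the two counts would come from PromQL queries over the SLO window, which is the part a generator such as Sloth handles for you.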
What are some alternatives?
metrics-server - Scalable and efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines.
pyrra - Making SLOs with Prometheus manageable, accessible, and easy to use for everyone!
helm-charts - Prometheus community Helm charts
slo-computer - SLOs, error windows and alerts are complicated. Here is an attempt to make them easy. SLO Computer makes setting and monitoring SLOs for all your services intuitively seamless and blazingly fast. Community Support on Discord - https://discord.com/invite/Q3p2EEucx9
prometheus-operator - Prometheus Operator creates/configures/manages Prometheus clusters atop Kubernetes
cloudprober - [Moved to cloudprober/cloudprober] An active monitoring software to detect failures before your customers do.
kube-thanos - Kubernetes specific configuration for deploying Thanos.
OpenSLO - Open specification for defining and expressing service level objectives (SLO)
ansible-prometheus - Deploy Prometheus monitoring system
kube-state-metrics - Add-on agent to generate and expose cluster-level metrics.
descheduler - Descheduler for Kubernetes
mtail - extract internal monitoring data from application logs for collection in a timeseries database