gl-infra
kube-prometheus
Metric | gl-infra | kube-prometheus
---|---|---
Mentions | 42 | 41
Stars | - | 6,270
Growth | - | 2.7%
Activity | - | 8.8
Last commit | - | 2 days ago
Language | - | Jsonnet
License | - | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
gl-infra
- Incident Review for Site-Wide Outage for Gitlab.com – Stale Terraform Pipeline
- Gitlab Friday (July 7) Outage Incident Review
- Gitlab Outage 5 Whys
- GitLab.COM down?
- Gitlab.com Is Completely Down
GitLab team member here. Thanks for asking.
Incidents come in different types. For example, when an application bug or performance regression is discovered, the response can involve reverting MRs and rolling back releases. The Platform, Delivery group has top-level responsibility for ensuring continuous delivery of the GitLab application software to GitLab SaaS, https://about.gitlab.com/handbook/engineering/infrastructure...
Other incidents may involve hardware or infrastructure failures, or a combination of both, such as an infrastructure failure that renders GitLab application services unavailable. These require cross-functional collaboration among infrastructure, product, engineering, and other teams during the incident.
To get a better understanding here, it is helpful to review the incident management handbook https://about.gitlab.com/handbook/engineering/infrastructure...
Additional helpful information:
- The GitLab.com SaaS production architecture is documented in https://about.gitlab.com/handbook/engineering/infrastructure...
- The Monitoring of GitLab.com handbook provides insights into monitoring workflows, incident management, SLAs, etc. https://about.gitlab.com/handbook/engineering/monitoring/
- Runbooks https://about.gitlab.com/handbook/engineering/infrastructure...
For the current incident discussed in this HN thread, the review issue can be followed in https://gitlab.com/gitlab-com/gl-infra/production/-/issues/1... to learn more.
- GitLab internal api unreachable
Lol. They let a certificate expire: https://gitlab.com/gitlab-com/gl-infra/production/-/issues/14422
- Is there a security incident ongoing?
- Does Gitlab.com have a security incident?
kube-prometheus
- Upgrading Hundreds of Kubernetes Clusters
The last one is mostly an observability stack with Prometheus, Metrics Server, and Prometheus Adapter, which gives excellent insight into what is happening on the cluster. You can reuse the same stack for autoscaling by repurposing the data collected for monitoring.
- Unfork with ArgoCD
kustomize Kube Prometheus
- Smart-Cash project - Adding monitoring to EKS using Prometheus operator
On the other hand, the kube-prometheus project provides documentation and scripts to operate end-to-end Kubernetes cluster monitoring using the Prometheus Operator, which makes monitoring a Kubernetes cluster easier.
- Scaling Temporal: The Basics
For our load testing we’ve deployed Temporal on Kubernetes, and we’re using MySQL for the persistence backend. The MySQL instance has 4 CPU cores and 32GB RAM, and each Temporal service (Frontend, History, Matching, and Worker) has 2 pods, with requests for 1 CPU core and 1GB RAM as a starting point. We’re not setting CPU limits for our pods—see our upcoming Temporal on Kubernetes post for more details on why. For monitoring we’ll use Prometheus and Grafana, installed via the kube-prometheus stack, giving us some useful Kubernetes metrics.
- How do you set up Grafana alert for your cluster? Which mixins library?
The two most common approaches I have seen are kube-prometheus-stack and kube-prometheus.
- Issues with "victoria-metrics-k8s-stack", monitoring k8s targets
- I'm missing a lot of the Grafana dashboards that are provisioned during the deployment, not sure why as it has worked before, and wanted to add them after install... I believe it's different ConfigMaps like the one in kube-prometheus but I was wondering if there's a way to force provisioning them all again at once (multiple k8s, node_exporter, vm, etc)?
- what metrics are most important for checking kubernetes cluster health?
Check out the kube Prometheus project -- https://github.com/prometheus-operator/kube-prometheus It's a bit heavy, but the included recording rules and dashboards give you a great start at understanding your cluster.
- Easy Prometheus/Grafana Setup With Dashboards Repo
The actual link to the prometheus/grafana bundle: https://github.com/prometheus-operator/kube-prometheus
- How To Configure Kube-Prometheus
Here’s a list of what’s installed: https://github.com/prometheus-operator/kube-prometheus/tree/main/manifests
- How to install a user managed Prometheus and Grafana instance on OpenShift 4?
What are some alternatives?
www-gitlab-com
metrics-server - Scalable and efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines.
thanos-operator - Kubernetes operator for deploying Thanos
helm-charts - Prometheus community Helm charts
gitlab
prometheus-operator - Prometheus Operator creates/configures/manages Prometheus clusters atop Kubernetes
git2git - Handy library for copying repositories from one git host to another
kube-thanos - Kubernetes specific configuration for deploying Thanos.
gitlab-foss
sloth - 🦥 Easy and simple Prometheus SLO (service level objectives) generator
govuk-infrastructure - Terraform turnup automation for the EKS Kubernetes clusters that host GOV.UK. See https://github.com/alphagov/govuk-helm-charts for application config.
ansible-prometheus - Deploy Prometheus monitoring system