rook
alertmanager
Metric | rook | alertmanager
---|---|---
Mentions | 51 | 13
Stars | 11,905 | 6,270
Growth | 1.2% | 1.5%
Activity | 9.9 | 9.2
Last commit | 4 days ago | 6 days ago
Language | Go | Go
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
rook
-
Ceph: A Journey to 1 TiB/s
I have some experience with Ceph, both for work, and with homelab-y stuff.
First, bear in mind that Ceph is a distributed storage system - so the idea is that you will have multiple nodes.
For learning, you can definitely virtualise it all on a single box - but you'll have a better time with discrete physical machines.
Also, Ceph does prefer physical access to disks (similar to ZFS).
And you do need decent network connectivity; I think that is the main thing people have in mind when they talk about Ceph's high hardware requirements. Ideally 10GbE at minimum, and more if you want higher performance, since there can be a lot of network traffic, particularly during things like backfill. (25Gbps if you can find that gear cheap for a homelab; 50Gbps is a technological dead end; 100Gbps works well.)
But honestly, for a homelab, a cheap mini PC or NUC with 10GbE will work fine; you should get acceptable performance, and it will be good for learning.
You can install Ceph directly on bare-metal, or if you want to do the homelab k8s route, you can use Rook (https://rook.io/).
Hope this helps, and good luck! Let me know if you have any other questions.
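To put the networking advice above in numbers, here is a rough back-of-the-envelope sketch of how long it takes to move a given amount of data over links of different speeds. The function name `transfer_hours` and the `efficiency` parameter are made up for illustration, and the model is deliberately simplistic: it ignores replication, disk throughput, and client traffic, all of which matter during a real Ceph backfill.

```python
def transfer_hours(data_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move `data_tb` decimal terabytes over a `link_gbps` link.

    `efficiency` is a hypothetical fudge factor (0..1) for protocol overhead
    and contention; 1.0 means the full line rate is usable.
    """
    bits = data_tb * 1e12 * 8                   # terabytes -> bits
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits / usable_bits_per_second / 3600


if __name__ == "__main__":
    # Compare common link speeds for moving 4 TB (e.g. refilling one disk).
    for gbps in (1, 10, 25, 100):
        print(f"{gbps:>3} Gb/s: {transfer_hours(4, gbps):.2f} h to move 4 TB")
```

Under these assumptions, 4 TB takes roughly 53 minutes at a full 10 Gb/s, and close to 9 hours at 1 Gb/s, which is why 1GbE tends to feel painful once recovery traffic kicks in.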
-
Running stateful workloads on Kubernetes with Rook Ceph
Another option is to leverage a Kubernetes-native distributed storage solution such as Rook Ceph as the storage backend for stateful components running on Kubernetes. This has the benefit of simplifying application configuration while addressing business requirements for data backup and recovery such as the ability to take volume snapshots at a regular interval and perform application-level data recovery in case of a disaster.
-
People who run Nextcloud in Docker: Where do you store your data/files? In a Docker volume, or on a remote server/NAS?
This is beyond your question but might help someone else: I switched from docker-compose to Kubernetes for my home lab a while ago. The storage solution I've settled on is Rook. It was a bit of up-front work learning how to get it running, but now that it's done my storage is automatically managed by Ceph. I can swap out drives and Ceph basically takes care of everything itself.
-
Rook/Ceph with VM nodes on research cluster?
The stumbling point I am at is that I want to use rook.io (Ceph) as my storage solution for the cluster. The Ceph prerequisites are one of the following:
-
Asking for recommendation on remote Kubernetes storage for a small cluster and databases
Have you looked at Rook?
-
Want advice on planned evolution: k3os/Longhorn --> Talos/Ceph, plus Consul and Vault
I've briefly run Ceph in external mode; you can actually use a Rook deployment to manage it (sort of). Here is the documentation for doing that. For me it didn't pass my testing phase because I need better networking equipment before I can try that.
-
ATARI is still alive: Atari Partition of Fear
This article explains the data corruption issue that happened in Rook in 2021. The root cause lies in an unexpected place and can also occur in any Ceph environment. It's interesting that Rook only started to encounter this problem recently, even though the problem has existed for a long time. It's due to a series of coincidences. I wrote this article because the word "Atari" was used in a non-historical context in 2021.
-
How to Deploy and Scale Strapi on a Kubernetes Cluster 2/2
Rook (this is a nice article for Rook NFS)
-
Running on-premise k8s with a small team: possible or potential nightmare?
Storage: To start with, favor any distributed storage you already know for Persistent Volumes: Ceph (maybe via rook.io), Longhorn if you go with Rancher, etc.
-
My completely automated Homelab featuring Kubernetes
I've dealt with a lot of issues that are very close to just unplugging a node. Unfortunately, on node loss, my stateful workloads using rook-ceph block storage won't migrate over to another node automatically due to an issue with Rook. Stateless apps (ingress nginx, etc.) that don't use rook-ceph block storage fail over to another node just fine. I've kind of accepted this for now, and I know Longhorn has a feature that makes this work, but I find rook-ceph to be more stable for my workloads.
alertmanager
-
My Raspberry Pi 4 Dashboard
- Alert Manager
-
Uptime monitoring (~1000 urls)
You could use Prometheus as the monitoring tool, blackbox_exporter to "export" the URLs to Prometheus, Alertmanager for notifications, and Grafana for nice GUI dashboards (and maybe also notifications).
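With around 1000 URLs, maintaining the target list by hand gets tedious. One possible approach, sketched below, is to generate a Prometheus file-based service discovery (`file_sd`) JSON file from a plain list of URLs. The function name `build_file_sd`, the `module` label, and the example hostnames are assumptions made for illustration; `http_2xx` is the probe module name used in blackbox_exporter's example configuration.

```python
import json


def build_file_sd(urls, module="http_2xx"):
    """Return a file_sd document: a list of target groups with string labels.

    All URLs go into one group; the hypothetical `module` label records which
    blackbox_exporter probe module the scrape job is expected to use.
    """
    return [{"targets": list(urls), "labels": {"module": module}}]


if __name__ == "__main__":
    # Placeholder URLs; in practice these would come from a file or inventory.
    urls = [f"https://site-{i}.example.com/health" for i in range(1000)]
    with open("blackbox_targets.json", "w", encoding="utf-8") as fh:
        json.dump(build_file_sd(urls), fh, indent=2)
```

The scrape job that reads this file still needs relabeling so that each listed URL is passed to blackbox_exporter as its `target` query parameter, while the scrape itself goes to the exporter's own address.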
-
Alertmanager with SNS Topic
I found this other example below from https://github.com/prometheus/alertmanager/issues/2559, but it is not working either.
-
Ultra Monitoring with Victoria Metrics
vmalert: executes a list of the given alerting or recording rules against configured data sources. For sending alerting notifications vmalert relies on configured Alertmanager. Recording rules results are persisted via remote write protocol. vmalert is heavily inspired by Prometheus implementation and aims to be compatible with its syntax
-
Can Prometheus act similar to OPC A&E server?
Yes, I believe you can do all of what you're looking for without a UI. The alertmanager api has the ability to register receivers as well as to poll for alerts, silence them, etc: https://github.com/prometheus/alertmanager/blob/main/api/v2/openapi.yaml
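As a rough illustration of driving Alertmanager without a UI, the sketch below builds a silence payload for the v2 API and shows how it could be posted, alongside polling for alerts. It uses only the Python standard library. The helper names (`build_silence`, `post_silence`, `get_alerts`) are made up for this example, and the field names follow the linked OpenAPI file as I understand it, so check them against the spec for your Alertmanager version.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def build_silence(labels, hours, created_by, comment, now=None):
    """Build a silence body that matches alerts carrying all given labels.

    `now` should be timezone-aware; it defaults to the current UTC time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamp in UTC
    return {
        "matchers": [
            {"name": name, "value": value, "isRegex": False, "isEqual": True}
            for name, value in sorted(labels.items())
        ],
        "startsAt": now.strftime(fmt),
        "endsAt": (now + timedelta(hours=hours)).strftime(fmt),
        "createdBy": created_by,
        "comment": comment,
    }


def post_silence(base_url, silence):
    """POST the silence to <base_url>/api/v2/silences and return the JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/api/v2/silences",
        data=json.dumps(silence).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def get_alerts(base_url):
    """Poll <base_url>/api/v2/alerts and return the decoded list of alerts."""
    with urllib.request.urlopen(f"{base_url}/api/v2/alerts", timeout=10) as response:
        return json.load(response)
```

For example, `post_silence("http://localhost:9093", build_silence({"alertname": "HighLoad"}, 2, "ops", "maintenance"))` would silence a hypothetical `HighLoad` alert for two hours on an Alertmanager listening on its default port.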
-
Customize Pushover alerts?
I found it, unfortunately it doesn't help.
-
Part I: EC2 with Prometheus
    #cloud-config
    # environment: ${environment}
    runcmd:
      # install AWS CLI, needed for downloading of configuration files
      - |
        apt-get update && apt-get install unzip -y
        curl -Lo awscli.zip https://awscli.amazonaws.com/awscli-exe-linux-aarch64.zip
        unzip awscli.zip
        ./aws/install
        rm awscli.zip
      # install prometheus binary
      - |
        curl -Lo prometheus.tar.gz https://github.com/prometheus/prometheus/releases/download/v2.33.1/prometheus-2.33.1.linux-arm64.tar.gz
        tar -xvf prometheus.tar.gz
        cp ./prometheus-2.33.1.linux-arm64/prometheus /usr/local/bin/prometheus
        rm -rf ./prometheus-2.33.1.linux-arm64
        rm -rf prometheus.tar.gz
      # install alertmanager binary
      - |
        curl -Lo alertmanager.tar.gz https://github.com/prometheus/alertmanager/releases/download/v0.23.0/alertmanager-0.23.0.linux-arm64.tar.gz
        tar -xvf alertmanager.tar.gz
        mv ./alertmanager-0.23.0.linux-arm64/alertmanager /usr/local/bin/alertmanager
        rm -rf alertmanager-0.23.0.linux-arm64
        rm alertmanager.tar.gz
      # wait for EBS volume
      - |
        while [ ! -b $(readlink -f /dev/nvme1n1) ]; do
          echo "waiting for device /dev/nvme1n1"
          sleep 5
        done
        # format volume
        blkid $(readlink -f /dev/nvme1n1) || mkfs -t ext4 $(readlink -f /dev/nvme1n1)
        # create a mount
        mkdir -p /data
        if ! grep "/dev/nvme1n1" /etc/fstab; then
          echo "/dev/nvme1n1 /data ext4 defaults,discard 0 0" >> /etc/fstab
        fi
        # mount volume
        mount /data
      # enable and start systemd services
      - |
        systemctl daemon-reload
        systemctl enable prepare-prometheus.service && systemctl start prepare-prometheus.service && sleep 10
        systemctl enable prometheus.service && systemctl start prometheus.service
        systemctl enable alertmanager.service && systemctl start alertmanager.service
    write_files:
      - path: /usr/local/bin/prepare-prometheus
        permissions: '0744'
        content: |
          #!/bin/sh
          mkdir -p /etc/prometheus
          aws s3 cp s3://${s3_bucket}/prometheus.yaml /etc/prometheus/prometheus.yaml
          aws s3 cp s3://${s3_bucket}/alertmanager.yaml /etc/prometheus/alertmanager.yaml
          aws s3 cp s3://${s3_bucket}/prometheus.rules.yaml /etc/prometheus/prometheus.rules.yaml
          curl -X POST http://localhost:9090/-/reload || true
      - path: /etc/systemd/system/prepare-prometheus.service
        content: |
          [Unit]
          Description=Prepare prometheus / alertmanager configuration
          Wants=network-online.target
          After=network-online.target
          [Service]
          Type=oneshot
          ExecStart=/usr/local/bin/prepare-prometheus
      # please note data.mount in dependencies
      - path: /etc/systemd/system/prometheus.service
        content: |
          [Unit]
          Description=Prometheus
          Wants=network-online.target
          After=network-online.target data.mount prepare-prometheus.service
          [Service]
          Type=simple
          ExecStart=/usr/local/bin/prometheus \
            --config.file /etc/prometheus/prometheus.yaml \
            --storage.tsdb.path /data/ \
            --web.enable-lifecycle \
            --web.console.templates=/etc/prometheus/consoles \
            --web.console.libraries=/etc/prometheus/console_libraries \
            --enable-feature=remote-write-receiver
          [Install]
          WantedBy=multi-user.target
      - path: /etc/systemd/system/alertmanager.service
        content: |
          [Unit]
          Description=Alert Manager
          Wants=network-online.target
          After=network-online.target data.mount prepare-prometheus.service
          [Service]
          Type=simple
          ExecStart=/usr/local/bin/alertmanager \
            --config.file /etc/prometheus/alertmanager.yaml \
            --storage.path=/data/
          [Install]
          WantedBy=multi-user.target
-
Prometheus trigger script on alert
-
Is this a terrible way of getting timezone awareness into my Prometheus alerts?
Alertmanager recently added native support for time ranges in the alerting config: https://github.com/prometheus/alertmanager/issues/876
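The check behind this kind of time-based muting can be sketched in a few lines: convert the alert's timestamp into the timezone you care about, then test the weekday and time of day. This is only an illustration of the idea, not Alertmanager's implementation, and the name `in_window` and its parameters are hypothetical.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo


def in_window(moment: datetime, tz: tzinfo, weekdays: set, start: time, end: time) -> bool:
    """Return True if `moment`, viewed in timezone `tz`, falls inside the window.

    `moment` must be timezone-aware. `weekdays` uses Python's numbering
    (Monday is 0), and the window is the half-open range [start, end).
    """
    local = moment.astimezone(tz)
    return local.weekday() in weekdays and start <= local.time() < end


if __name__ == "__main__":
    # A fixed UTC+1 offset keeps the example dependency-free; a real setup
    # would more likely use zoneinfo.ZoneInfo("Europe/Berlin") to follow DST.
    utc_plus_one = timezone(timedelta(hours=1))
    business_days = {0, 1, 2, 3, 4}
    alert_time = datetime(2024, 1, 8, 8, 30, tzinfo=timezone.utc)  # a Monday
    print(in_window(alert_time, utc_plus_one, business_days, time(9), time(17)))
```

Doing the conversion in one place like this avoids hard-coding UTC offsets into every alert expression.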
-
It took almost a full day, but I finally got a decent homelab diagram :D Feedback is most welcome!
Prometheus/Alertmanager: https://github.com/prometheus/alertmanager | https://prometheus.io/
What are some alternatives?
longhorn - Cloud-Native distributed storage built on and for Kubernetes
Grafana - The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
ceph-csi - CSI driver for Ceph
loki - Like Prometheus, but for logs.
velero - Backup and migrate Kubernetes applications and their persistent volumes
synology-notifications - Synology notifications service
Nginx Proxy Manager - Docker container for managing Nginx proxy hosts with a simple, powerful interface
node_exporter - Exporter for machine metrics
Ceph - Ceph is a distributed object, block, and file storage platform
NPushOver - Full fledged, async, .Net Pushover client
hub-feedback - Feedback and bug reports for the Docker Hub