external-snapshotter
Sidecar container that watches Kubernetes Snapshot CRD objects and triggers CreateSnapshot/DeleteSnapshot against a CSI endpoint.
Technically, I am running k8s_gateway, which is just CoreDNS with a plugin, since there's a FreeBSD binary on the releases page.
My Kubernetes cluster, deployments, and infrastructure provisioning are all available over here on GitHub.
Deployments: (GitOps with Flux)
Container and Helm chart updates: (Github PRs created by Renovate)
Volume Backups and Recovery: (VolSync backing up to S3)
Using Kubernetes with GitOps is still pretty niche, but it is growing in popularity. If you're hungry to learn k8s, bored with docker-compose/portainer/rancher, or just want to try it, I built a template on GitHub with a walkthrough on deploying Kubernetes to Ubuntu/Fedora and deploying/managing applications with Flux.
VolSync is a much better option than Velero, IMO. Velero was created before GitOps was a thing, and it really tries to do too much when all I need is a reliable way to back up and restore PVCs. If your CSI driver supports volume snapshots, VolSync can use the snapshot-controller to create a VolumeSnapshot, mount it as a PVC in a temporary pod, and then back that up to S3. This is really great for backing up PVCs because the backup reads from the snapshot, not from the volume of a running workload.
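To make the snapshot-then-mount flow above more concrete, here is a minimal Python sketch that builds the two Kubernetes objects involved: a VolumeSnapshot of an existing PVC, and a temporary PVC restored from that snapshot. This is not VolSync's own code; it only illustrates the objects a snapshot-based backup relies on. The function names, object names, storage class, and snapshot class are all hypothetical placeholders.

```python
import json

SNAPSHOT_API_GROUP = "snapshot.storage.k8s.io"


def volume_snapshot(name, namespace, source_pvc, snapshot_class):
    """Build a VolumeSnapshot manifest for an existing PVC."""
    return {
        "apiVersion": f"{SNAPSHOT_API_GROUP}/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": source_pvc},
        },
    }


def pvc_from_snapshot(name, namespace, snapshot_name, storage_class, size):
    """Build a PVC whose contents are populated from a VolumeSnapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            # dataSource tells the CSI driver to restore from the snapshot.
            "dataSource": {
                "apiGroup": SNAPSHOT_API_GROUP,
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
            },
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # All names below are placeholders, not values from a real cluster.
    snap = volume_snapshot("app-data-snap", "default", "app-data", "csi-snapclass")
    temp = pvc_from_snapshot("app-data-backup-src", "default", "app-data-snap",
                             "ceph-block", "10Gi")
    # kubectl accepts JSON as well as YAML, e.g. piped into `kubectl apply -f -`.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [snap, temp]}, indent=2))
```

A backup pod would then mount the temporary PVC read-only and copy its contents to S3, so the live workload's volume is never touched during the copy.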
You could write a k8s CronJob around prodrigestivill/postgres-backup to dump a database backup to an NFS mount, or check out Kanister.
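As a rough illustration of the CronJob idea, the Python sketch below builds a `batch/v1` CronJob manifest that runs `pg_dump` and writes the dump to an NFS-backed volume. It deliberately swaps in the stock `postgres` image with plain `pg_dump` instead of the image mentioned above, since that image's entrypoint and settings are not described here. The function name, the Secret name, the NFS server and path, and the image tag are all assumptions; the Secret is assumed to provide the standard `PGHOST`, `PGUSER`, `PGPASSWORD`, and `PGDATABASE` variables that `pg_dump` reads.

```python
import json


def pg_dump_cronjob(name, namespace, schedule, db_secret, nfs_server, nfs_path,
                    image="postgres:16"):
    """Build a CronJob manifest that dumps a database to an NFS mount."""
    container = {
        "name": "pg-dump",
        "image": image,
        # Custom-format dump, named by timestamp; connection settings come
        # from the PG* environment variables loaded from the Secret.
        "command": [
            "/bin/sh", "-c",
            'pg_dump -Fc -f "/backups/$(date +%Y%m%d-%H%M%S).dump"',
        ],
        "envFrom": [{"secretRef": {"name": db_secret}}],
        "volumeMounts": [{"name": "backups", "mountPath": "/backups"}],
    }
    pod_spec = {
        "restartPolicy": "OnFailure",
        "containers": [container],
        "volumes": [
            {"name": "backups", "nfs": {"server": nfs_server, "path": nfs_path}}
        ],
    }
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "schedule": schedule,
            # Skip a run if the previous dump is still in progress.
            "concurrencyPolicy": "Forbid",
            "jobTemplate": {"spec": {"template": {"spec": pod_spec}}},
        },
    }


if __name__ == "__main__":
    # Placeholder values; adjust to your own cluster and NFS export.
    manifest = pg_dump_cronjob("postgres-backup", "default", "0 3 * * *",
                               "postgres-credentials", "nas.example.internal",
                               "/export/backups/postgres")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be committed to a GitOps repository or applied directly; pruning old dumps is left out of this sketch.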
I've dealt with a lot of issues that come very close to just unplugging a node. Unfortunately, when a node is lost, my stateful workloads using rook-ceph block storage won't migrate to another node automatically due to an issue with Rook. Stateless apps (ingress-nginx, etc.) that don't use rook-ceph block storage fail over to another node just fine. I've kind of accepted this for now. I know Longhorn has a feature that makes this work, but I find rook-ceph to be more stable for my workloads.