Ask HN: What is your Kubernetes nightmare?

kompose

Mentions: 50 | Stars: 9,160 | Activity: 9.0 | Language: Go

Convert Compose to Kubernetes

Yes, it's a bit much. When I was starting out with Kubernetes, I wrote Docker Compose files first and then converted them to Kubernetes manifests using https://kompose.io/
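To give a rough idea of what such a conversion involves, here is a minimal Python sketch that maps a single Compose service onto a Kubernetes Deployment manifest. It is not kompose's actual algorithm or output: the function name `compose_service_to_deployment`, the `app` label, and the fixed replica count are assumptions made for illustration, and only `image`, `ports` (as `HOST:CONTAINER` or a bare port) and a dict-style `environment` are handled.

```python
import json


def compose_service_to_deployment(name: str, service: dict) -> dict:
    """Map one Compose service to a Deployment manifest (hypothetical, simplified)."""
    container = {"name": name, "image": service["image"]}

    # Compose publishes ports as "HOST:CONTAINER" or just "CONTAINER";
    # a Deployment only needs the container side.
    ports = [int(str(p).split(":")[-1]) for p in service.get("ports", [])]
    if ports:
        container["ports"] = [{"containerPort": p} for p in ports]

    # Only the mapping form of `environment` is supported in this sketch.
    env = service.get("environment", {})
    if env:
        container["env"] = [{"name": k, "value": str(v)} for k, v in env.items()]

    labels = {"app": name}  # assumed label key, chosen for illustration
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    web = {"image": "nginx:1.25", "ports": ["8080:80"], "environment": {"MODE": "dev"}}
    # kubectl accepts JSON as well as YAML, so the output can be applied directly.
    print(json.dumps(compose_service_to_deployment("web", web), indent=2))
```

The real tool covers far more of the Compose format (volumes, Services, restart policies and so on); with it, the conversion itself is a single `kompose convert` run against the Compose file.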

skaffold

Mentions: 83 | Stars: 14,659 | Activity: 9.2 | Language: Go

Easy and Repeatable Kubernetes Development

- Advanced features like profiles and modules for supporting multiple environments
https://skaffold.dev/

ingress-nginx

Mentions: 202 | Stars: 16,638 | Activity: 9.6 | Language: Go

Ingress-NGINX Controller for Kubernetes

TLDR: Complexity.
Two things: the deprecation lifecycle, and running ingress controllers in an auto scaling group.
The first isn't as much of an issue if you have a (partially) dedicated team for managing your clusters, but it can be prohibitively expensive (in effort and time) for smaller organisations.
The second highlights a bigger problem in K8s in general. I'll have to give a little background first:
If you run an Nginx ingress controller on a node that's part of an ASG (i.e. a group where nodes can disappear or increase in number), you will see service disruption for a small percentage of your requests every time a scaling event occurs. This is caused by a misalignment between the timeout values of your load balancer and Nginx, which cannot be fixed:
* https://github.com/kubernetes/ingress-nginx/issues/6281

k9s

Mentions: 126 | Stars: 24,857 | Activity: 9.4 | Language: Go

🐶 Kubernetes CLI To Manage Your Clusters In Style!

Try k9s[1]: the xray view (:xray [resource]) shows you nested resources as a tree. I find it very useful (and k9s in general is a fantastic administration tool).
[1]: https://k9scli.io

metallb

Mentions: 78 | Stars: 6,611 | Activity: 9.4 | Language: Go

A network load-balancer implementation for Kubernetes using standard routing protocols

> Kubernetes on bare metal is actually pretty easy.
I would not call it easy at all. The last time I tried, a year ago, you still needed a special load balancer to get it going (https://metallb.universe.tf). Has this changed?

Portainer

Mentions: 337 | Stars: 28,736 | Activity: 9.8 | Language: TypeScript

Making Docker and Kubernetes management easy.

Late to the party, but figured I'd share my own story (some details obviously changed, but hopefully the spirit of the experience remains).
Suppose that you work in an org that successfully ships software in a variety of ways - as regular packaged software that runs on an OS directly (e.g. a .jar that expects a certain JDK version in the VM), or maybe even uses containers sometimes, be it with Nomad, Swarm or something else.
And then a project comes along that needs Kubernetes, because someone else made that choice for you (in some orgs it might be a requirement from clients, others might want to be able to claim that their software runs on Kubernetes, and in other cases some dev might be padding their CV before leaving), and now you need to deal with the consequences.
But here's the thing - if the organization doesn't have enough buy-in into Kubernetes, it's as if you're starting everything from 0, especially if paying some cloud vendor to give you a managed cluster isn't in the cards, be it because of data storage requirements (even for dev environments), other compliance reasons or even just corporate policy.
So, I might be given a single VM on a server, with 8 GB of RAM for launching 4 or so Java/.NET services, as that is a decent amount of resources for doing things the old way. But now, I need to fit a whole Kubernetes cluster in there, which in most configurations eats resources like there's no tomorrow. Oh, and the colleagues also don't have too much experience working with Kubernetes, so some sort of a helpful UI might be nice to have, except that the org uses RPM distros and there are no resources for an install of OpenShift on that VM.
But how much can I even do with that amount of resources, then? Well, I did manage to get K3s (a certified K8s distro by Rancher) up and running, though my hopes of connecting it with the actual Rancher tool (https://rancher.com/) to act as a good web UI didn't succeed. Mostly because of some weirdness with the cgroups support and Rancher running as a Docker container in many cases, which just kind of broke. I did get Portainer (https://www.portainer.io/) up and running instead, but back then I think there were certain problems with the UI, as it's still very much in active development and gradually receives lots of updates. I might have just gone with Kubernetes dashboard, but admittedly the whole login thing isn't quite as intuitive as the alternatives.
That said, everything kind of broke down for a bit when I needed to set up the ingress. What if you have a wildcard certificate along the lines of *.something.else.org.com and want it to be used for all of your apps? Back in the day, you'd just set up Nginx or Apache as your reverse proxy and let it worry about SSL/TLS termination. That duty is now taken over by Kubernetes, except that by default K3s comes with Traefik as its ingress controller of choice and the documentation isn't exactly stellar.
So, to get this sort of configuration up and running, I needed to think about a HelmChartConfig for Traefik, a ConfigMap which references the secrets, a TLSStore to contain them, as well as creating the actual TLS secrets themselves from the appropriate files on the file system. That still feels a bit odd, and it would probably be an utter mess to get particular certificates up and running for some other paths, with Let's Encrypt for yet others. In short, what previously would have been those very same files living on the file system and a few (dozen?) lines inside of the reverse proxy configuration is now a distributed mess of abstractions and actions which certainly needs some getting used to.
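For illustration, the two core pieces of that setup (the TLS secret and the default TLSStore pointing at it) could be generated along these lines. This is a sketch under assumptions, not the exact configuration from the story: the names `wildcard-tls` and `kube-system`, the placeholder PEM bytes, and the helper functions are hypothetical, and the TLSStore API group depends on the Traefik version (`traefik.containo.us/v1alpha1` for Traefik v2, `traefik.io/v1alpha1` for newer releases), so it is left as a parameter.

```python
import base64
import json


def tls_secret(name: str, namespace: str, cert_pem: bytes, key_pem: bytes) -> dict:
    """Build a kubernetes.io/tls Secret from PEM-encoded certificate and key bytes."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "kubernetes.io/tls",
        "metadata": {"name": name, "namespace": namespace},
        "data": {
            # Secret data values must be base64-encoded strings.
            "tls.crt": base64.b64encode(cert_pem).decode("ascii"),
            "tls.key": base64.b64encode(key_pem).decode("ascii"),
        },
    }


def default_tls_store(secret_name: str, namespace: str,
                      api_version: str = "traefik.containo.us/v1alpha1") -> dict:
    """Build a Traefik TLSStore named "default" that serves the given secret
    as the fallback certificate. api_version is an assumption; match it to
    the Traefik release actually installed."""
    return {
        "apiVersion": api_version,
        "kind": "TLSStore",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"defaultCertificate": {"secretName": secret_name}},
    }


if __name__ == "__main__":
    # Placeholder bytes; in practice these would be read from the certificate
    # and key files on disk.
    cert, key = b"<certificate PEM>", b"<private key PEM>"
    manifests = [
        tls_secret("wildcard-tls", "kube-system", cert, key),
        default_tls_store("wildcard-tls", "kube-system"),
    ]
    for manifest in manifests:
        print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved and applied with kubectl, which accepts JSON as well as YAML. The secret and the TLSStore are placed in the same namespace here so that the store can resolve the secret by name.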
Oh, and Portainer sometimes just gets confused and fails to figure out how to properly set up the routes, though I do have to say that at least MetalLB does its job nicely.
And then? Well, we can't just ship manifests directly, we also need Helm charts! But of course, in addition to writing those and setting up the CI for packaging them, you also need something running to store them, as well as any Docker images that you want. Rather than going through all of the red tape to set that up on shared infrastructure (which would need cleanup policies, access controls and lots of planning so things don't break for other parties using it), I crammed an instance of Nexus/Artifactory/Harbor/... onto that very same server, with the very same resource limits, with deadlines still looming over my head.
But that's not it, for software isn't developed in a vacuum. Throw in all of the regular issues with developing software, like not being 100% clear on each of the configuration values that the apps need (because developers are fallible, of course), changes to what they want to use, problems with DB initialization (of course, still needing an instance of PostgreSQL/MariaDB running on the very same server, which for whatever reason might get used as a shared DB) and so on.
In short, you take a process that already has pain points in most orgs and make it needlessly more complex. There are tangible benefits to using Kubernetes: once you find a setup that works (personally, Ubuntu LTS or a similar distro, a full Rancher install, maybe K3s as the underlying cluster or RKE/K3s/k0s on separate nodes, with Nginx for ingress, or a 100% separately managed ingress), it's great and the standardization is almost like a superpower (as long as you don't go crazy with CRDs). Yet you need to pay a certain cost up front.
What could be done to alleviate some of the pain points?
In short, I think that:
  - expect to need a lot more resources than previously: always have a separate node for managing your cluster and put any sorts of tools on it as well (like Portainer/Rancher), but run your app workloads on other nodes (K3s or k0s can still be not too demanding with resources for the most part)

rancher

Mentions: 89 | Stars: 22,517 | Activity: 9.9 | Language: Go

Complete container management platform

karpenter-provider-aws

Mentions: 46 | Stars: 5,854 | Activity: 9.9 | Language: Go

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.

I work with the Karpenter team. Glad to hear you like it. We'd love it if you added your info to our public reference file, ADOPTERS.md: https://github.com/aws/karpenter/blob/main/ADOPTERS.md

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
