Hello, I have been scratching my head for 3 straight days without any clue about what is wrong with my cluster. I have a 3-master k3s cluster (all untainted, so every node can run pods). I use kube-vip as the control-plane load balancer and MetalLB as the cluster load balancer. I also use a combination of Longhorn and NFS mounts as the volume solutions for my cluster.

Now here's the problem: the scheduler keeps deploying pods to the same node and I don't know why. I have 3 nodes, and the other two are idling at 20% CPU and RAM usage while the node that keeps getting workloads is always at 80% load or more. I don't specify any affinity in the manifests at all, except in my jellyfin deployment because it needs a node with an Intel GPU.

Here's a screenshot: https://preview.redd.it/aoplkgs210j61.png?width=930&format=png&auto=webp&s=7cacc84ece83611eb1fa84f9681ff26bfb96e5fe You can see all the pods are on kube2 and only the jellyfin pod is on kube3, while kube1 is completely free (except for the system daemonsets running in the kube-system namespace). I also tried scaling to 4 replicas, and guess what: all of those replicas got deployed on kube2 too.

If I cordon kube2, then Kubernetes schedules everything to kube1 (and keeps deploying to that one node again). The problem is I just don't know what is happening. I don't have anything related to node affinity at all in the manifest files. Thanks in advance.
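For context, this is roughly the only affinity block I use anywhere, on the jellyfin deployment. I'm writing it from memory as a sketch, so the exact label key (I believe it comes from the Intel GPU device plugin / node-feature-discovery labels) may not match my actual manifest character for character:

```yaml
# jellyfin deployment (snippet) - the ONLY affinity in any of my manifests.
# Label key written from memory; it may differ slightly in my real manifest.
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: intel.feature.node.kubernetes.io/gpu
                    operator: In
                    values: ["true"]
```

Every other deployment has no affinity, nodeSelector, or tolerations at all, which is why the pile-up on kube2 confuses me.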