Failed scheduling when using Deployment with host port


#1

Hi Kubernetes community,

I’m running into an issue when rolling out a Deployment with a host port enabled. I couldn’t find an identical issue after searching for a while, so I’m asking here :) The environment can be simplified to the list below:

  • Kubernetes 1.11.6-gke.2
  • Google Kubernetes Engine
  • A regular cluster with 3 nodes
  • The Deployment has 3 replicas and host port 80 enabled
  • Each node gets 1 pod deployed
  • kubectl apply -f <file> is used to trigger the deployment
  • The manifest file looks like this

apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
spec:
  replicas: 3
  selector:
    matchLabels:
      name: helloworld
  template:
    metadata:
      labels:
        name: helloworld
    spec:
      containers:
      - name: helloworld
        image: helloworld:v1
        ports:
        - containerPort: 80
          hostPort: 80

The first deployment always works because there are no pods in the cluster yet. However, subsequent deployments (with a different image) never finish. There is always 1 new pod that shows up with “Pending” status, and it stays stuck there until I manually delete it.

$ kubectl get pods -n helloworld

helloworld-5879f57677-cvbnv 1/1 Running 0 2h
helloworld-5879f57677-cxrbl 1/1 Running 0 2h
helloworld-5879f57677-glsc9 1/1 Running 0 2h
helloworld-5c786bf9fc-29mcc 0/1 Pending 0 1h

Running kubectl describe on this new pod shows something like:

Type     Reason            Age                From               Message
----     ------            ----               ----               -------
Warning  FailedScheduling  2s (x108 over 5m)  default-scheduler  0/3 nodes are available: 3 node(s) didn’t have free ports for the requested pod ports.

It appears that default-scheduler doesn’t attempt to terminate any running pod before a new pod is created, which would explain why scheduling fails: there is no free port. What puzzles me even more is that the problem disappeared after I switched from a Deployment to a DaemonSet, which has the same effect of running 1 pod per node. Subsequent rollouts of the DaemonSet simply go through, for example:

$ kubectl get pods -n helloworld --watch
NAME READY STATUS RESTARTS AGE
helloworld-6h4b4 1/1 Running 0 37s
helloworld-9zvrr 1/1 Running 0 37s
helloworld-hx84p 1/1 Running 0 37s
helloworld-hx84p 1/1 Terminating 0 1m
helloworld-hx84p 0/1 Terminating 0 1m
helloworld-hx84p 0/1 Terminating 0 1m
helloworld-hx84p 0/1 Terminating 0 1m
helloworld-9hslf 0/1 Pending 0 0s
helloworld-9hslf 0/1 ContainerCreating 0 0s
helloworld-9hslf 1/1 Running 0 2s
helloworld-9zvrr 1/1 Terminating 0 1m
helloworld-9zvrr 0/1 Terminating 0 1m
helloworld-9zvrr 0/1 Terminating 0 1m
helloworld-9zvrr 0/1 Terminating 0 1m
helloworld-4f4cc 0/1 Pending 0 0s
helloworld-4f4cc 0/1 ContainerCreating 0 0s
helloworld-4f4cc 1/1 Running 0 2s
helloworld-6h4b4 1/1 Terminating 0 1m
helloworld-6h4b4 0/1 Terminating 0 1m
helloworld-6h4b4 0/1 Terminating 0 1m
helloworld-6h4b4 0/1 Terminating 0 1m
helloworld-6h4b4 0/1 Terminating 0 1m
helloworld-lbqbd 0/1 Pending 0 0s
helloworld-lbqbd 0/1 ContainerCreating 0 0s
helloworld-lbqbd 1/1 Running 0 2s

Note that it terminates “helloworld-hx84p” before creating “helloworld-9hslf”.

My questions are:

  1. Why doesn’t a Deployment work for this scenario (the number of pods with a host port equals the number of nodes, and each node gets 1 pod) when a DaemonSet does?
  2. Is the difference between the DaemonSet controller and the default scheduler related to this issue? (See “DaemonSet - Kubernetes”.)
  3. If the difference is caused by the DaemonSet controller vs. the default scheduler, how should I handle this scenario with the default scheduler?

Any pointers will be of great help to me and my team! Thanks!


#2

You can configure the Deployment’s update strategy with a different maxUnavailable and related settings.

Make sure you configure it so it first deletes a running pod and then creates a new one. If you do that, I guess it will work.

But please let us know :)


#3

A Deployment’s default update strategy (spec.strategy.type) is RollingUpdate, which assumes it can deploy extra instances and roll through them. This is controllable via maxSurge and maxUnavailable.
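For reference, here is a minimal sketch of where those two settings live in the manifest from the first post. The specific values (maxSurge: 0, maxUnavailable: 1) are my own illustrative assumption for the "one hostPort pod per node" case, not something confirmed in this thread; they are meant to make the rollout remove an old pod before adding a new one.

```yaml
# Fragment of the Deployment spec; merge into the existing helloworld manifest.
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0        # assumed value: never create a pod beyond the 3 replicas
      maxUnavailable: 1  # assumed value: allow one old pod to be removed first
```

With every node’s port 80 already taken, a surge pod has nowhere to go, so allowing unavailability instead of surge looks like the safer direction to try.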

However, from your use case it sounds like a DaemonSet would be the preferred resource. DaemonSets are tailored to cases where you want exactly 1 instance running per host.

With Deployments, unless you have something that restricts pods to a single instance per host (like your usage of hostPort), you must create an anti-affinity rule. Otherwise the scheduler thinks it could potentially place them on the same host.
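A rough sketch of such an anti-affinity rule, assuming the name: helloworld label from the manifest in the first post and the standard kubernetes.io/hostname node label as the topology key. It is not needed when hostPort already forces one pod per node; it only shows the shape of the rule.

```yaml
# Fragment of spec.template.spec in the Deployment (pod-level settings).
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          name: helloworld              # label taken from the original manifest
      topologyKey: kubernetes.io/hostname   # "at most one matching pod per node"
```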


#4

Hey thank you both for the pointers. Super helpful!

I have a follow-up question related to this: will DaemonSet end up with the same behavior as Deployment after Kubernetes 1.12, as per “DaemonSet - Kubernetes”? Please bear with me, as I’m quite fuzzy about how RollingUpdate and the default scheduler are connected. RollingUpdate seems to have been available for DaemonSet ever since 1.6, as per “Perform a Rolling Update on a DaemonSet - Kubernetes”, so I assume that RollingUpdate and the default scheduler are orthogonal to each other.


#5

Rolling updates with DaemonSets are like a limited version of a rolling update with a Deployment. There is no maxSurge, just maxUnavailable, as you cannot have more than one instance per node. To roll through 1 node at a time you would just set maxUnavailable to 1.
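As a sketch, a DaemonSet equivalent of the manifest from the first post might look like the following. The original poster’s actual DaemonSet was not shown, so everything here is an assumption carried over from the Deployment manifest; only updateStrategy is the part being illustrated.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: helloworld
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # replace the pod on one node at a time
  selector:
    matchLabels:
      name: helloworld
  template:
    metadata:
      labels:
        name: helloworld
    spec:
      containers:
      - name: helloworld
        image: helloworld:v1
        ports:
        - containerPort: 80
          hostPort: 80
```

This matches the watch output in the first post, where each old pod is terminated before its replacement appears.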


#6

I verified that maxUnavailable does the trick! Given that its default is 25% and the number of replicas is 3, the effective number of unavailable pods resolves to 0 after rounding down. Setting maxUnavailable to either 35% (so it rounds down to 1) or the absolute value 1 makes the rollout terminate one old pod first, which solves the problem. Thanks again!
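Spelling out the rounding for anyone who lands here later. This assumes the behavior described in the Kubernetes documentation, where a percentage maxUnavailable is rounded down and a percentage maxSurge is rounded up:

```latex
\[
\text{maxUnavailable}_{25\%} = \lfloor 3 \times 0.25 \rfloor = \lfloor 0.75 \rfloor = 0,
\qquad
\text{maxUnavailable}_{35\%} = \lfloor 3 \times 0.35 \rfloor = \lfloor 1.05 \rfloor = 1
\]
\[
\text{maxSurge}_{25\%} = \lceil 3 \times 0.25 \rceil = \lceil 0.75 \rceil = 1
\]
```

So with the defaults and 3 replicas, the rollout may add 1 extra pod but may not take any existing pod down, which is why the new pod sat in Pending with no free port on any node.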


#7

Great! :)