Cluster information:
Kubernetes version: 1.17.14
Cloud being used: Google
Host OS: Ubuntu 18.04
I have an app running in Kubernetes on a couple of pods. I’m trying to improve our deployment experience (we use rolling deployments), which is currently causing pain.
What I want to achieve:
- each pod first goes not ready, so it receives no more traffic
- it then finishes the requests it is currently processing
- only then is it removed (the pattern is sketched below)
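
As far as I can tell, the standard way to express this sequence is a readiness probe plus a preStop hook. A minimal sketch of that pattern, as a fragment of the container spec (the sleep duration here is an arbitrary placeholder; my actual hook runs a script, see the full spec below):

lifecycle:
  preStop:
    exec:
      # keep the container alive after endpoint removal starts,
      # so in-flight requests can finish before SIGTERM arrives
      command: ["sh", "-c", "sleep 15"]
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 1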
This should all be possible and should just work: you create a Deployment with readiness and liveness probes, the load balancer picks them up, and it routes traffic accordingly. However, when I test a deployment, I see pods receiving requests even after they have switched to not ready. Specifically, the load balancer does not seem to update while a lot of traffic is coming in. I can see pods going “not ready” when I signal them - and a pod that receives no traffic during the switch receives none afterwards either. But a pod that is receiving traffic while it switches keeps receiving it; the load balancer simply ignores the state change.
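
For what it’s worth, this is how I observe the state change during a test (the names match my configs below):

# watch the service's endpoints; a pod that fails its readiness
# probe should drop out of the ready addresses within seconds
kubectl get endpoints my-service-loadbalancer -n mine -w

# cross-check the pods' Ready condition in parallel
kubectl get pods -n mine -l app=my-service -w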
I’m starting to wonder how to handle this, because I can’t see what I’m missing - it must be possible to host a high-traffic app on Kubernetes with pods going “not ready” without losing tons of requests.
My configurations
Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: my-service
  name: my-service
  namespace: mine
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-service
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: my-service
        env: production
    spec:
      containers:
      - name: my-service
        image: IMAGE ID
        imagePullPolicy: Always
        volumeMounts:
        - name: credentials
          mountPath: "/run/credentials"
          readOnly: true
        securityContext:
          privileged: true
        ports:
        - containerPort: 8080
          protocol: TCP
        lifecycle:
          preStop:
            exec:
              command: ["/opt/app/bin/graceful-shutdown.sh"]
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          periodSeconds: 1
          initialDelaySeconds: 5
          failureThreshold: 1
        livenessProbe:
          httpGet:
            path: /alive
            port: 8080
          periodSeconds: 1
          failureThreshold: 2
          initialDelaySeconds: 60
        resources:
          requests:
            memory: "500M"
            cpu: "250m"
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      terminationGracePeriodSeconds: 60
      nodeSelector:
        cloud.google.com/gke-nodepool: stateful
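
For context, the preStop hook does roughly the following - this is a simplified sketch, not my exact script, and /drain is a hypothetical stand-in for the real app call:

#!/bin/sh
# 1. tell the app to start failing its /ready probe
#    (/drain is a hypothetical stand-in for the real call)
curl -s -X POST http://localhost:8080/drain
# 2. give the endpoint controller and kube-proxy time to remove the pod
sleep 10
# 3. when this script exits, kubelet sends SIGTERM; the app then has
#    the rest of terminationGracePeriodSeconds (60s) to finish in-flight work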
Service/loadbalancer
apiVersion: v1
kind: Service
metadata:
  name: my-service-loadbalancer
  namespace: mine
spec:
  selector:
    app: my-service
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
  type: LoadBalancer
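
To reproduce the problem, I run steady traffic against the service’s external IP while triggering a rollout (hey is just an example load generator; EXTERNAL_IP is a placeholder):

# generate steady load for 60s against the service's external IP
hey -z 60s http://EXTERNAL_IP/

# in parallel, trigger a rolling restart and watch the endpoints update
kubectl rollout restart deployment my-service -n mine
kubectl get endpoints my-service-loadbalancer -n mine -w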