- Kubernetes Version: v1.31.1
- AWS Load Balancer Controller Helm Chart: aws-load-balancer-controller-1.11.0 (App version v2.11.0)
- CNI Plugin: Calico (unsure of VXLAN mode)
- Container Runtime: containerd
- Instance Metadata Configuration (IMDSv2):
- HttpTokens: required
- HttpPutResponseHopLimit: 2
- Pod CIDR: 10.233.64.0/18
- Security Group Rules: All traffic opened to 0.0.0.0/0 for both ingress and egress.
Problem:
The AWS Load Balancer Controller Pod fails to start properly. Logs show the following error:
{"level":"error","ts":"2025-01-06T01:09:33Z","logger":"setup","msg":"unable to create controller","controller":"Ingress","error":"Get \"https://10.233.0.1:443/apis/networking.k8s.io/v1\": dial tcp 10.233.0.1:443: i/o timeout"}
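To narrow down the timeout, I can compare reachability of the API Server ClusterIP from the Pod network versus the host network. This is only a sketch (the debug pod names are placeholders); if the host-network test succeeds while the Pod-network test times out, the CNI overlay is the likely culprit:

```shell
# One-off debug pod on the Pod network: can it reach the API Server ClusterIP?
kubectl run api-debug --rm -it --restart=Never \
  --image=curlimages/curl -n kube-system -- \
  curl -sk -m 5 https://10.233.0.1:443/version

# Same check from the node's host network namespace for comparison
kubectl run api-debug-host --rm -it --restart=Never \
  --image=curlimages/curl -n kube-system \
  --overrides='{"spec":{"hostNetwork":true}}' -- \
  curl -sk -m 5 https://10.233.0.1:443/version
```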
Additionally, when I inspect the ServiceAccount associated with the controller, it has no tokens or mountable secrets:
kubectl describe serviceaccount aws-load-balancer-controller -n kube-system
Name: aws-load-balancer-controller
Namespace: kube-system
Labels: <none>
Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::002137xxxxxx:role/AWS_xxxxx_Role
Image pull secrets: <none>
Mountable secrets: <none>
Tokens: <none>
Events: <none>
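For what it's worth, since Kubernetes v1.24 long-lived token Secrets are no longer auto-generated for ServiceAccounts, so Tokens: <none> is not necessarily a problem on its own; the Pod should receive a short-lived projected token at runtime. A quick check of that mount (the deployment name assumes the Helm chart default):

```shell
# Confirm a projected ServiceAccount token is mounted inside the running Pod
kubectl exec -n kube-system deploy/aws-load-balancer-controller -- \
  ls /var/run/secrets/kubernetes.io/serviceaccount/
```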
Debugging Steps Taken:
- IMDSv2 Configuration:
- Verified that HttpPutResponseHopLimit is set to 2.
- Confirmed the metadata options using the AWS CLI.
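The verification steps above, sketched out (placeholders as in the rest of this post); the first two commands run on the node itself, the last from anywhere with AWS CLI access:

```shell
# Fetch an IMDSv2 session token, then query metadata with it
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/instance-id

# Confirm HttpTokens and HttpPutResponseHopLimit via the AWS CLI
aws ec2 describe-instances --instance-ids <instance-id> --region <region> \
  --query 'Reservations[].Instances[].MetadataOptions'
```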
- Firewall/Security Groups:
- Opened all traffic to 0.0.0.0/0 for both ingress and egress.
- Helm and Kubernetes Versions:
- Helm version: v3.16.3
- Kubernetes client and server versions: v1.31.1
- IAM Role:
- Confirmed that the correct IAM Role is associated with the ServiceAccount and verified its trust relationship.
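The checks I ran for this, roughly (the role name is a placeholder matching the annotation shown above):

```shell
# Inspect the role's trust policy (who is allowed to assume it)
aws iam get-role --role-name AWS_xxxxx_Role \
  --query 'Role.AssumeRolePolicyDocument'

# List the permission policies attached to the role
aws iam list-attached-role-policies --role-name AWS_xxxxx_Role
```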
Installation Steps:
The AWS Load Balancer Controller was installed by following the official installation guide. The steps included:
- Tagging Subnets: Ensured that the subnets were properly tagged for auto-discovery.
- Configuring VPC CNI: Installed amazon-vpc-cni-k8s to allow Pods to acquire IPs from the VPC subnets.
- Setting Up IMDSv2:
aws ec2 modify-instance-metadata-options --http-put-response-hop-limit 2 --http-tokens required --region <region> --instance-id <instance-id>
- Configuring IAM Role: Attached the required IAM policies to the Kubernetes nodes.
- Opening Ports: Verified that port 9443 was open in the security group rules.
- Installing the Controller via Helm:
helm repo add eks https://aws.github.io/eks-charts
helm repo update
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system \
--set clusterName=my-cluster \
--set region=ap-northeast-1 \
--set vpcId=vpc-xxxxxxxxxxxxxxx \
--set serviceAccount.create=false \
--set serviceAccount.name=aws-load-balancer-controller
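After the Helm install, I checked the rollout and logs with the following (this is where the timeout error above came from):

```shell
# Wait for the controller Deployment to become ready
kubectl rollout status deployment/aws-load-balancer-controller -n kube-system

# Inspect recent controller logs
kubectl logs -n kube-system deployment/aws-load-balancer-controller --tail=50
```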
Questions:
- What could be causing the controller to time out when connecting to the Kubernetes API Server?
- Are there additional configurations required for the AWS Load Balancer Controller in a self-managed Kubernetes cluster (not EKS)?
- Could Calico’s VXLAN or BGP configuration (if applicable) be causing the issue? How can I verify whether VXLAN is being used?
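For the last question, these are the checks I could run to inspect the pool's encapsulation (assuming the default IPPool name used by a standard Calico install):

```shell
# Inspect the Calico IPPool's encapsulation settings via the CRD
kubectl get ippools.crd.projectcalico.org default-ipv4-ippool -o yaml | \
  grep -E 'vxlanMode|ipipMode'

# Or, if calicoctl is installed, show encapsulation per pool
calicoctl get ippool -o wide
```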