Ambassador qotm route shows "no healthy upstream"

#1

I am having issues with a qotm service deployed behind the Ambassador gateway. Whatever I do, I get a response saying “no healthy upstream”.

#2

I can’t really offer any advice, but you might be able to get a better answer asking in the Ambassador community slack.

#3

Or please provide more logs and debug info.

Although I have never used Ambassador myself.

#4

The Ambassador log shows the following when I access the API:

ACCESS [2019-03-26T06:01:18.215Z] "GET /qotm/ HTTP/1.1" 503 UH 0 19 0 - "10.244.1.1" "PostmanRuntime/7.6.0" "708e70f6-5acd-4c8b-8f0c-d4e66e7b67db" "192.168.99.101:32686" "-"

Ambassador’s internal routes all work fine, and their ACCESS log entries show the target IP address correctly.
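
A note on reading that ACCESS line: in Envoy's default access-log format, the token right after the status code is the response flag, and UH means there were no healthy upstream hosts in the upstream cluster. The snippet below is a minimal Python sketch for pulling those fields out of such a line. The function name `parse_access_line`, the regular expression, and the small flag table are illustrative assumptions and not part of Ambassador; the sample line is the one from this post, written with straight quotes.

```python
import re

# A small subset of Envoy response flags (meanings per the Envoy access-log docs).
RESPONSE_FLAGS = {
    "UH": "no healthy upstream hosts in the upstream cluster",
    "UF": "upstream connection failure",
    "UO": "upstream overflow (circuit breaking)",
    "NR": "no route configured for the request",
    "-": "no flag set",
}

# Hypothetical pattern covering only the leading fields of the line:
# timestamp, request line, status code and response flags.
LOG_PATTERN = re.compile(
    r'ACCESS \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<flags>\S+)'
)

SAMPLE_LINE = (
    'ACCESS [2019-03-26T06:01:18.215Z] "GET /qotm/ HTTP/1.1" 503 UH 0 19 0 - '
    '"10.244.1.1" "PostmanRuntime/7.6.0" '
    '"708e70f6-5acd-4c8b-8f0c-d4e66e7b67db" "192.168.99.101:32686" "-"'
)


def parse_access_line(line):
    """Return the leading fields of an ACCESS line as a dict, or None if no match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # Envoy joins multiple flags with commas, so split before looking them up.
    fields["flag_meaning"] = [
        RESPONSE_FLAGS.get(flag, "unknown flag") for flag in fields["flags"].split(",")
    ]
    return fields
```

Calling `parse_access_line(SAMPLE_LINE)` yields status 503 with flag UH, which suggests the 503 was generated by the proxy itself because it had no usable backend, rather than by the qotm container.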

#5

And did that request reach the target IP? Who is returning the 503 you showed: the backend, or Ambassador because there was some problem with the backend?

#6

I have observed that this is not a problem with Ambassador. Instead, it looks like a problem reaching the cluster DNS (10.96.0.10) from any of the pods. I tried this from a busybox pod: nslookup of services like qotm, by name or by IP address, failed. I also tried to ping and netcat the DNS address, but neither worked.
So my understanding is that the basic problem is that DNS is not reachable.
Here are a few more details on how I did the setup.

Kubernetes version: v1.13.3
CNI: Flannel (quay.io/coreos/flannel:v0.11.0-amd64)
The cluster has one master and a single worker node.

#7

Thanks for replying. I am adding some more details to the discussion thread.

#8

Are you using Kubernetes service discovery? If you are not, I’d try setting the pod spec attribute dnsPolicy to Default.

If that works fine, then you are hitting a bug (I don’t have the link handy, I’m on my phone).

Please check whether this still happens with dnsPolicy: Default :slight_smile:

#9

I just tried it with dnsPolicy: Default, and it still fails.
I made this change in the Ambassador pod spec, since the routing, DNS lookup, and forwarding have to happen from those pods.
Also, the entire setup runs on two CentOS installations in Oracle VirtualBox with a Host-Only network.

#10

But you did see the problem by running a pod and doing nslookup inside it, right? Can you try that with dnsPolicy: Default and report whether it fails or not?

#11

I will do so, apologies for the weekend blues…

#12

Nothing to apologise for :slight_smile:

#13

@rata
I tried the following. I started qotm with this manifest:

---
apiVersion: v1
kind: Service
metadata:
  name: qotm
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v1
      kind: Mapping
      name: qotm_mapping
      prefix: /qotm/
      service: qotm
spec:
  selector:
    app: qotm
  ports:
  - port: 80
    name: http-qotm
    targetPort: http-api
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: qotm
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: qotm
    spec:
      dnsPolicy: Default
      containers:
      - name: qotm
        image: datawire/qotm:1.2
        ports:
        - name: http-api
          containerPort: 5000
        readinessProbe:
          httpGet:
            path: /health
            port: 5000
          initialDelaySeconds: 30
          periodSeconds: 3
        resources:
          limits:
            cpu: "0.1"
            memory: 100Mi

Then I ran the commands below:

  1. [root@k8s-master sony]# kubectl exec qotm-5f7f56569d-nkp7b -- cat /etc/resolv.conf
     nameserver 10.91.59.137
     nameserver 10.165.108.1
     nameserver 10.165.108.2
     search

  2. [root@k8s-master sony]# kubectl exec qotm-5f7f56569d-nkp7b -- nslookup 10.108.83.129    # cluster IP of the qotm service
     nslookup: can't resolve '(null)': Name does not resolve
     Name: 10.108.83.129
     Address 1: 10.108.83.129

  3. [root@k8s-master sony]# kubectl exec qotm-5f7f56569d-nkp7b -- nslookup 10.108.83.129 10.96.0.1
     Server: 10.96.0.1
     Address 1: 10.96.0.1

     Name: 10.108.83.129
     Address 1: 10.108.83.129

#14

@rata
Below is the result of running nc against the DNS service from a busybox pod:

/ # cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local ent.bhicorp.com
options ndots:5
/ # nc 10.96.0.10 53
/ # nc 10.96.0.10 53 -v
nc: 10.96.0.10 (10.96.0.10:53): No route to host
/ #

#15

I force-redeployed kube-dns. After that, an nslookup from the busybox pod gives the result below:

/ # nslookup kubernetes
Server: 10.96.0.10
Address: 10.96.0.10:53

Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1

*** Can't find kubernetes.svc.cluster.local: No answer
*** Can't find kubernetes.cluster.local: No answer
*** Can't find kubernetes.: No answer
*** Can't find kubernetes.default.svc.cluster.local: No answer
*** Can't find kubernetes.svc.cluster.local: No answer
*** Can't find kubernetes.cluster.local: No answer
*** Can't find kubernetes.: No answer

#16

Your CNI or something is broken, I think :-/

That smells like a network configuration problem to me.

But I’m not sure I can help, I don’t have experience with CNIs :frowning:

#17

I am trying my level best. From my understanding so far, the only thing needed is access to DNS from the pods. Is there a good document on how DNS lookup happens in Kubernetes? I would like to dig into more details. I am sure it is some petty issue.

#18

One more finding. I examined the iptables rules and ran a watch on them to see how the traffic flows. My first impression is that iptables works properly.
But a recent breakthrough is that nc 10.96.0.10 53 works from a pod that is deployed on the MASTER node:

kubectl exec etcd-k8s-master -n kube-system -- nc 10.96.0.10 53 -v
10.96.0.10 (10.96.0.10:53) open
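
For comparing nodes, the same reachability check can be scripted. Below is a rough Python equivalent of the `nc <host> <port> -v` test, as a sketch only: the helper name `check_tcp` and the 3-second default timeout are assumptions for illustration. Like nc without `-u`, it only exercises TCP, so it says nothing about UDP port 53.

```python
import socket


def check_tcp(host, port, timeout=3.0):
    """Try a TCP connect and return (reachable, message), similar to `nc -v`."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "open"
    except OSError as exc:
        # Covers timeouts, "No route to host", "Connection refused", etc.
        return False, str(exc)
```

Run from a pod image that ships Python, `check_tcp("10.96.0.10", 53)` on the worker node versus the master would be expected to mirror the two nc results seen in this thread (a failure with "No route to host" versus "open").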

#19

Well, this pod’s resolv.conf is different from that of the usual pods.

#20

You are probably using kube-dns or CoreDNS as the resolver (they run as pods in the kube-system namespace). They resolve Kubernetes service names and forward anything else to another DNS server, usually called the upstream (for example, a lookup of google.com is forwarded).

If you change the setting we discussed (dnsPolicy, IIRC), the pod doesn’t use CoreDNS/kube-dns; it just uses the resolvers listed in /etc/resolv.conf on the host where the pod is running.

So, the weird thing is that you also saw the problem when using IPs instead of DNS names. That points to a network problem. And if you are using a network overlay, that is the most likely culprit.

I think you should continue debugging the network problem at the network overlay level. But I’m not sure what advice to give, as I usually don’t use a network overlay. Sorry.
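
On the earlier question of how a lookup proceeds: before any query reaches kube-dns/CoreDNS, the resolver inside the pod expands short names using the search list and the ndots option from the pod's /etc/resolv.conf. The Python sketch below illustrates that expansion order. It assumes glibc-style behaviour (the busybox resolver may differ in details), and the helper names `parse_resolv_conf` and `candidate_names` are made up for illustration.

```python
def parse_resolv_conf(text):
    """Extract nameservers, search domains and ndots from resolv.conf text."""
    nameservers, search, ndots = [], [], 1  # ndots defaults to 1 when unset
    for raw in text.splitlines():
        parts = raw.split()
        if not parts:
            continue
        if parts[0] == "nameserver" and len(parts) > 1:
            nameservers.append(parts[1])
        elif parts[0] == "search":
            search = parts[1:]
        elif parts[0] == "options":
            for option in parts[1:]:
                if option.startswith("ndots:"):
                    ndots = int(option.split(":", 1)[1])
    return nameservers, search, ndots


def candidate_names(name, search, ndots):
    """Return the fully qualified names a glibc-style resolver would try, in order."""
    if name.endswith("."):
        return [name]  # already absolute, the search list is not applied
    expanded = [f"{name}.{domain}." for domain in search]
    absolute = name + "."
    if name.count(".") >= ndots:
        return [absolute] + expanded  # enough dots: try the name as given first
    return expanded + [absolute]      # too few dots: walk the search list first


# The busybox pod's resolv.conf as posted earlier in this thread.
RESOLV_CONF = """\
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local ent.bhicorp.com
options ndots:5
"""

nameservers, search, ndots = parse_resolv_conf(RESOLV_CONF)
for fqdn in candidate_names("kubernetes", search, ndots):
    print(fqdn)
```

With ndots:5, a short name such as kubernetes is tried against every search domain before being tried as-is, which fits the list of names in the nslookup output above. Every one of those queries still has to reach 10.96.0.10, so none of this helps while that address is unreachable from the pod.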
