Cluster information:
Kubernetes version: 1.16
Cloud being used: bare-metal
Installation method: kubeadm
Host OS: CentOS7
CNI and version: Weave
I have posted this question to StackOverflow as well:
I have written a Go-based K8s client application that connects to the K8s cluster. To handle real-time notifications (add, update, delete) for Pods, Namespaces, and Nodes, I have programmed an informer. The code snippet is below.
I want to bring specific attention to the “runtime.HandleCrash()” function, which (as I understand it) recovers panics and logs them to stderr; combined with the stderr redirection below, those panics/errors end up in the panic file.
// Redirect stderr to a panic file so that runtime panics/errors are captured.
panicFile, err := os.OpenFile("/var/log/panicfile", os.O_WRONLY|os.O_CREATE|os.O_SYNC, 0644)
if err == nil {
    syscall.Dup2(int(panicFile.Fd()), int(os.Stderr.Fd()))
}
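As far as I can tell, runtime.HandleCrash() (from k8s.io/apimachinery/pkg/util/runtime) only recovers a panic, passes it to the logging handlers in runtime.PanicHandlers, and then re-panics; it also accepts additional handlers. A minimal sketch of what I mean (the handler body is hypothetical):

import utilruntime "k8s.io/apimachinery/pkg/util/runtime"

func doWork() {
    // HandleCrash runs the default logging handlers first, then any
    // additional handlers passed here, and finally re-panics.
    defer utilruntime.HandleCrash(func(r interface{}) {
        // Hypothetical: notify the application about the panic value r.
    })
    // ... informer work ...
}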
Some of the errors that get reported/collected in the panic file are shown at the end of this post.
My question is: how can I program the informer so that it reports/notifies these specific errors to my application rather than writing them to a panic file? That way, my application would be able to handle this expected event more gracefully.
Is there any way I can register a callback function (similar to Informer.AddEventHandler())?
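For example, I believe client-go's reflector reports these list/watch failures through utilruntime.HandleError, which fans out to every function in utilruntime.ErrorHandlers. If that is right, appending my own handler should deliver the errors to my application; a minimal sketch for the client-go version matching my cluster (apiErrChan is a hypothetical channel of mine):

import utilruntime "k8s.io/apimachinery/pkg/util/runtime"

// Hypothetical buffered channel that the application drains for API errors.
var apiErrChan = make(chan error, 16)

func init() {
    // Each error passed to utilruntime.HandleError is handed to every
    // function in utilruntime.ErrorHandlers, so this handler should also
    // see the "Failed to list ..." errors from the reflector.
    utilruntime.ErrorHandlers = append(utilruntime.ErrorHandlers, func(err error) {
        select {
        case apiErrChan <- err:
        default: // Drop when full; never block client-go internals.
        }
    })
}

My current informer setup is below: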
func (kcv *K8sWorker) armK8sPodListeners() error {
    // Kubernetes serves a utility to handle API crashes.
    defer runtime.HandleCrash()

    var sharedInformer = informers.NewSharedInformerFactory(kcv.kubeClient.K8sClient, 0)

    // Add watcher for the Pod.
    kcv.podInformer = sharedInformer.Core().V1().Pods().Informer()
    kcv.podInformerChan = make(chan struct{})

    // Pod informer state change handler.
    kcv.podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
        // When a new pod gets created.
        AddFunc: func(obj interface{}) {
            kcv.handleAddPod(obj)
        },
        // When a pod gets updated.
        UpdateFunc: func(oldObj interface{}, newObj interface{}) {
            kcv.handleUpdatePod(oldObj, newObj)
        },
        // When a pod gets deleted.
        DeleteFunc: func(obj interface{}) {
            kcv.handleDeletePod(obj)
        },
    })

    // Add watcher for the Namespace.
    kcv.nsInformer = sharedInformer.Core().V1().Namespaces().Informer()
    kcv.nsInformerChan = make(chan struct{})

    // Namespace informer state change handler.
    kcv.nsInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
        // When a new namespace gets created.
        AddFunc: func(obj interface{}) {
            kcv.handleAddNamespace(obj)
        },
        // When a namespace gets updated.
        //UpdateFunc: func(oldObj interface{}, newObj interface{}) {
        //    kcv.handleUpdateNamespace(oldObj, newObj)
        //},
        // When a namespace gets deleted.
        DeleteFunc: func(obj interface{}) {
            kcv.handleDeleteNamespace(obj)
        },
    })

    // Add watcher for the Node.
    kcv.nodeInformer = sharedInformer.Core().V1().Nodes().Informer()
    kcv.nodeInformerChan = make(chan struct{})

    // Node informer state change handler.
    kcv.nodeInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
        // When a new node gets created.
        AddFunc: func(obj interface{}) {
            kcv.handleAddNode(obj)
        },
        // When a node gets updated.
        UpdateFunc: func(oldObj interface{}, newObj interface{}) {
            kcv.handleUpdateNode(oldObj, newObj)
        },
        // When a node gets deleted.
        DeleteFunc: func(obj interface{}) {
            kcv.handleDeleteNode(obj)
        },
    })

    // Start the shared informer.
    kcv.sharedInformerChan = make(chan struct{})
    sharedInformer.Start(kcv.sharedInformerChan)
    log.Debug("Shared informer started")

    return nil
}
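For context, after starting the factory I could also wait for the initial caches to sync before relying on the event handlers; a sketch using cache.WaitForCacheSync (the error wording is mine):

// Sketch: block until the initial list for each informer has completed.
sharedInformer.Start(kcv.sharedInformerChan)
if !cache.WaitForCacheSync(kcv.sharedInformerChan,
    kcv.podInformer.HasSynced,
    kcv.nsInformer.HasSynced,
    kcv.nodeInformer.HasSynced) {
    return fmt.Errorf("timed out waiting for informer caches to sync")
}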
In a specific use case, I shut down the K8s cluster nodes, which results in the informer writing the error messages below to the panic file.
The moment I boot the K8s cluster nodes back up, these errors stop.
===== output from "/var/log/panicfile" =====
E0611 16:13:03.558214 10 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Pod: Get https://10.30.8.75:6443/api/v1/pods?limit=500&resourceVersion=0: dial tcp 10.30.8.75:6443: connect: no route to host
E0611 16:13:03.558224 10 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Namespace: Get https://10.30.8.75:6443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.30.8.75:6443: connect: no route to host
E0611 16:13:03.558246 10 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Node: Get https://10.30.8.75:6443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 10.30.8.75:6443: connect: no route to host