What is the UnbalancedIRQs Kubernetes event?

We are seeing a new Kubernetes event - UnbalancedIRQs - with the message "5 IRQs with affinity to 1 CPUs".
Could you please share some insights on this event: its root cause, its impact on the cluster, and any solutions to resolve this warning?

Thanks for your time.

IRQs (interrupt requests) are how the kernel is notified of hardware events. This has nothing in particular to do with Kubernetes.
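For context: on Linux, each IRQ's CPU affinity is exposed under /proc/irq/. Below is a minimal Python sketch of the kind of check a monitor could perform to count IRQs pinned to a single CPU; it is an illustration of the concept, not node-problem-detector's actual implementation, and the real monitor may define "unbalanced" differently.

```python
import glob

def irqs_pinned_to_one_cpu():
    """Count IRQs whose affinity list names exactly one CPU.

    Reads /proc/irq/<n>/smp_affinity_list, where the kernel exposes
    each IRQ's allowed CPUs as a list like "0-3", "0,4", or "2".
    """
    pinned = 0
    total = 0
    for path in glob.glob("/proc/irq/*/smp_affinity_list"):
        try:
            with open(path) as f:
                affinity = f.read().strip()
        except OSError:
            continue  # some IRQs refuse reads; skip them
        total += 1
        # A single-CPU affinity is a bare number (no ranges or commas).
        if affinity.isdigit():
            pinned += 1
    return pinned, total

if __name__ == "__main__":
    pinned, total = irqs_pinned_to_one_cpu()
    print(f"{pinned} of {total} IRQs have affinity to exactly 1 CPU")
```

Normally the irqbalance daemon spreads IRQs across CPUs; when it is broken or misconfigured, many IRQs end up pinned to one CPU, which is what this event is flagging.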

We started getting this event a few days ago. I tried searching the keywords online (Google, the GitHub org, the node-problem-detector repo, etc.) and got no results other than this thread. Any idea what this means? We are running v1.26.3 on AKS.

Same here, starting 1 or 2 weeks ago:
Reason: UnbalancedIRQs
Message: 9 IRQs with affinity to 1 CPUs
Source: irqbalance-problem-monitor NODENAME
Object: Node/aks-NODENAME
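To check how widespread these events are across your nodes, you can filter events by reason. A sketch using the official `kubernetes` Python client, assuming a working local kubeconfig:

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig (same as kubectl uses).
config.load_kube_config()
v1 = client.CoreV1Api()

# Node-level events are typically recorded in the default namespace,
# so scan all namespaces and filter on the event reason.
events = v1.list_event_for_all_namespaces(
    field_selector="reason=UnbalancedIRQs"
)
for ev in events.items:
    print(ev.involved_object.name, ev.count, ev.message)
```

The kubectl equivalent is `kubectl get events --all-namespaces --field-selector reason=UnbalancedIRQs`.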

Possibly NPD started reporting this recently? Or maybe your OS was updated and the kernel is routing all IRQs to a single CPU?

We also started receiving these events in the node logs after upgrading AKS from 1.25.6 to 1.26.6.


Events:

| Type    | Reason         | Age                     | From                       | Message                        |
|---------|----------------|-------------------------|----------------------------|--------------------------------|
| Warning | UnbalancedIRQs | 100s (x1930 over 6d16h) | irqbalance-problem-monitor | 9 IRQs with affinity to 1 CPUs |

We faced the same issue with AKS.
In the cluster resource, under "Resource Health" and then "Diagnose and solve problems", an advisory for this problem was listed.

So it seems to be an AKS issue that can be fixed by updating the node images.

The solution is to upgrade the node image to the latest available version.

The fix for irqbalance (#275) has been incorporated; a node image upgrade to 202310.19.2 or later will resolve the unbalanced IRQs.
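To verify which node image version your nodes are currently running, you can read the node labels. A sketch that assumes the `kubernetes.azure.com/node-image-version` label AKS applies to its nodes and the official `kubernetes` Python client:

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# AKS stamps each node with its image version as a label,
# e.g. "AKSUbuntu-2204gen2containerd-202310.19.2".
LABEL = "kubernetes.azure.com/node-image-version"

for node in v1.list_node().items:
    labels = node.metadata.labels or {}
    version = labels.get(LABEL, "<label not present>")
    print(f"{node.metadata.name}: {version}")
```

The upgrade itself can be done with `az aks nodepool upgrade --node-image-only` (plus the usual resource group, cluster, and nodepool arguments), which refreshes the node image without changing the Kubernetes version.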

The warnings are displayed because node-problem-detector inspects kernel state on each node to determine the availability of resources that can or cannot be assigned to pods.

“node-problem-detector aims to make various node problems visible to the upstream layers in the cluster management stack. It is a daemon that runs on each node, detects node problems and reports them to apiserver. node-problem-detector can either run as a DaemonSet or run standalone. Now it is running as a Kubernetes Addon enabled by default in the GKE cluster. It is also enabled by default in AKS as part of the AKS Linux Extension.”