Hi,
Why is kubernetes communicating with the netfilter modules in the kernel by explicitly executing “iptables” and not by sending/receiving commands on AF_NETLINK sockets?
Thank you,
Cristian
Hi,
no answer so far… so let me add some more details to my question.
There are two ways of interacting with the netfilter modules in the kernel: executing the iptables binary, or exchanging messages with the kernel over AF_NETLINK sockets. The binary is installed here on my machine:
cco@DEU1145:~$ which iptables
/usr/sbin/iptables
cco@DEU1145:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.4 LTS"
It looks like kubernetes is interacting with the netfilter kernel modules by explicitly executing "/usr/sbin/iptables" on the nodes where the respective iptables rules are needed. This involves one of the exec*() syscalls from the exec() family; see also exec(3). It is also well known that spawning an external process via fork()/exec() consumes far more system resources than a plain syscall on an already-open socket.
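Just to make the exec-based path concrete, here is a minimal Go sketch; the chain, port and rule are made up for illustration, and this is not the actual kube-proxy code (kube-proxy batches whole rule sets through iptables-restore rather than running iptables once per rule):

package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Each call like this fork()s and execve()s the external iptables binary.
	// The rule itself (INPUT/tcp/8080/ACCEPT) is hypothetical, for illustration only.
	cmd := exec.Command("iptables", "-w", "-A", "INPUT",
		"-p", "tcp", "--dport", "8080", "-j", "ACCEPT")
	if out, err := cmd.CombinedOutput(); err != nil {
		fmt.Printf("iptables failed: %v: %s\n", err, out)
		return
	}
	fmt.Println("rule appended by executing the iptables binary")
}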
So, my question is:
Why is kubernetes using exec*() syscalls for interacting with the netfilter modules instead of send(2)/recv(2) syscalls on AF_NETLINK sockets?
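For comparison, the socket-based path would start roughly like the sketch below (Go, using golang.org/x/sys/unix). This only opens and binds the socket; actually programming rules this way means building and parsing nfnetlink messages, as nftables does, which is where most of the complexity lives:

package main

import (
	"fmt"
	"golang.org/x/sys/unix"
)

func main() {
	// Open a raw netlink socket to the netfilter subsystem; no external
	// binary and no exec*() involved.
	fd, err := unix.Socket(unix.AF_NETLINK, unix.SOCK_RAW, unix.NETLINK_NETFILTER)
	if err != nil {
		panic(err)
	}
	defer unix.Close(fd)

	// Bind with Pid 0 so the kernel assigns the netlink port id.
	if err := unix.Bind(fd, &unix.SockaddrNetlink{Family: unix.AF_NETLINK}); err != nil {
		panic(err)
	}
	fmt.Println("netlink socket open; rules would be programmed by exchanging")
	fmt.Println("nfnetlink messages via unix.Sendto()/unix.Recvfrom()")
}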
Thanks a lot,
Cristian
The only real answer is “that’s just how it was written”.
In truth, it’s FAR easier to comprehend this way. It’s trivial to try things by hand, and then bring them into the code. The man pages are very complete, unlike the alternative, and lots of examples and help can be found all over the internet. In short, this is the better approach, but I am biased.
I dispute your assertion that exec consumes a lot of resources. It’s never, not once, come up as a problem in 10 years. There are OTHER things that are tricky about using iptables, but not that. The interplay between kube-proxy and OTHER users of iptables has been a long-standing pain point, since the actual kernel API is not as stable as you might hope for.