Fast IPC between multiple pods on same node

Hello,

I have a single-node cluster that has 2 pods for now, but it may grow to 5 pods. As a hard business requirement, I have to run each of my apps in its own dedicated pod (i.e. not allowed to stuff multiple containers into the same pod, or multiple apps into the same container/image). I need to put in place a high-speed, low-latency means of inter-process communication between 2 pods initially, but I might need to support up to 5 pods each talking directly to each other. The idea is to have one of the pods act as a “re-assembler” that takes data from all the other pods, plus its own data, and re-assembles the data a specific way into a stream.

What methods could I use to provide something faster than IP-based connectivity between these pods, given that they’re all guaranteed to live on the same node? I’ve tried packet-based connectivity, and it’s just too slow for the amount of data I have to process (given the hard requirements I have).

Thoughts so far:

  • Create a pair of named pipes (i.e. mkfifo) on the bare-metal OS and expose them as a volume mount to the two pods, so they can talk to each other via the pipes. Should be fast, and not too hard to synchronize. It becomes ugly with 5 pods though, as the number of pipes grows quadratically (i.e. n(n-1)/2 == (5)(4)/2 == 10 pipes), and I have to figure out a sane way for pods to know which pipes to use for reading versus writing.
  • Shared memory?
  • Deploy Redis or Memcached, but I don’t know how the performance/throughput would compare to pipes or shared memory.
  • Some other mechanism I haven’t considered?
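To make the named-pipe idea concrete, here’s a minimal single-process sketch (the directory and fifo name are invented; in a real deployment the fifo would live on a hostPath volume mounted into both pods, with one thread here standing in for each pod):

```python
import os
import tempfile
import threading

# Stand-in for the hostPath volume shared by both pods, e.g. /mnt/ipc (hypothetical path).
shared_dir = tempfile.mkdtemp()
fifo_path = os.path.join(shared_dir, "podA_to_podB.fifo")

os.mkfifo(fifo_path)  # create the named pipe on the shared mount

def producer():
    # "Pod A": open() blocks until a reader appears, then the bytes stream through.
    with open(fifo_path, "wb") as pipe:
        pipe.write(b"chunk-0001")

t = threading.Thread(target=producer)
t.start()

# "Pod B": reads the stream from the shared fifo until the writer closes its end.
with open(fifo_path, "rb") as pipe:
    data = pipe.read()
t.join()
```

Note that fifo opens block until both ends are present, which is also what makes the “who reads which pipe” bookkeeping grow awkward as the pod count rises.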

Thank you!

Cluster information:

Kubernetes version: 1.17.2
Cloud being used: bare-metal (kubeadm)
Installation method: apt-get
Host OS: Ubuntu Server 18.04 LTS x86_64
CNI and version: Flannel 0.3.1

Disclaimer: I have no experience doing what you are trying to do :slight_smile:

  • Named pipes - This seems like awkward complexity outside of K8s to get a worker node into a state where it can run the pods.
  • Shared memory - containers within a pod can share memory based on what I’ve read, but not between different pods.
  • Redis/memcache - I assume you would need IP connectivity to these (even if running in another pod), which you have already stated does not meet your performance requirements.

Is the business requirement some kind of company wide policy or specific to your project?
Because it seems to me like you have a valid technical requirement for multiple containers per pod which is a common K8s pattern.

Kind regards,
Stephen

@stephendotcarter It’s a legal + customer hard requirement (i.e. already raised the topic of multiple containers in the same pod, or moving all the apps into a single container: answer was no).

I don’t believe shared memory will work between pods.

As for redis/memcache, I’m not overly familiar with them, but a colleague suggested it as an alternative to the trivial/simplistic buffering client/server code I’ve written (i.e. much more efficient data-over-IP implementation).

Understood about the hard requirements :+1:

As far as Redis and Memcached are concerned, I would have thought they are more suited to storing and retrieving data, whereas your requirement sounds like it needs a direct stream between processes for performance.

Are these streams something like video, where it’s one continuous stream of binary data? Or are they more like a stream of individual events?

Some requirements are not easily fulfillable. :frowning:

In this case you might be able to use the host IPC namespace to SHM across pods, but that’s a fairly privileged operation which puts the stability of the node in jeopardy. Maybe hostPath mounting a tmpfs? I have not tried that.
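For reference, a pod that shares the host IPC namespace and mounts the node’s /dev/shm might look roughly like this (untested sketch; the pod name and image are placeholders, and both hostIPC and hostPath typically require a permissive security policy, which is exactly the node-stability risk mentioned above):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: decrypt-worker          # placeholder name
spec:
  hostIPC: true                 # pod sees the node's System V / POSIX IPC objects
  containers:
    - name: worker
      image: example/worker:latest   # placeholder image
      volumeMounts:
        - name: shm
          mountPath: /dev/shm
  volumes:
    - name: shm
      hostPath:
        path: /dev/shm          # the node's tmpfs, visible to every pod that mounts it
        type: Directory
```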

It’s a set of independent streams of encrypted data being decrypted by different pods (each one offloads to a different GPU/ASIC/FPGA to do a different type of decryption). The data needs to be recombined (basically a glorified interleaving algorithm) in an odd/proprietary manner. It can be treated as a giant binary stream with sequence numbers.
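As a toy model of that kind of reassembly, assuming each pod emits (sequence_number, chunk) pairs in order, the interleaving is essentially a k-way merge (the pod streams and payloads below are invented for illustration):

```python
import heapq

# Each upstream pod yields (sequence_number, chunk) pairs, already ordered
# within its own stream. heapq.merge interleaves the sorted streams into one
# globally ordered sequence by comparing the leading sequence numbers.
pod_a = [(0, b"a0"), (3, b"a3"), (5, b"a5")]
pod_b = [(1, b"b1"), (2, b"b2"), (4, b"b4")]

def reassemble(*streams):
    # Merge by sequence number, then concatenate the payloads into one stream.
    return b"".join(chunk for _, chunk in heapq.merge(*streams))

combined = reassemble(pod_a, pod_b)
```

The hard part in practice is not the merge itself but getting the per-pod streams to the re-assembler quickly, which is what the transport question above is about.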

I considered using a tmpfs via a hostPath mount, but I couldn’t figure out how to properly guarantee atomic reads/writes between pods. Maybe something like Boost::NamedMutex or similar (i.e. a filesystem-backed mutex), but I was hoping for something more generic (some of my apps are written in C, C++, Python, Go, and node.js, and I need a solution that works for all of them), so Boost isn’t a magic wand in my case.
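One language-agnostic option for the atomicity problem is flock(2) advisory locking on a lock file in the shared tmpfs: C, C++, Python, Go, and node.js can all issue the same syscall, so every app can cooperate on one lock file without Boost. A rough Python sketch, with made-up paths standing in for the shared mount:

```python
import fcntl
import os
import tempfile

# Hypothetical paths on the shared tmpfs hostPath mount, e.g. /mnt/shm.
shared_dir = tempfile.mkdtemp()
data_path = os.path.join(shared_dir, "segment.bin")
lock_path = os.path.join(shared_dir, "segment.lock")

def write_segment(payload: bytes) -> None:
    # Exclusive flock(2) lock: no reader or other writer may hold the lock
    # while the data file is being rewritten.
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)
        with open(data_path, "wb") as f:
            f.write(payload)
        fcntl.flock(lock, fcntl.LOCK_UN)

def read_segment() -> bytes:
    # Shared lock: many readers may proceed at once, but never alongside a writer.
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_SH)
        with open(data_path, "rb") as f:
            data = f.read()
        fcntl.flock(lock, fcntl.LOCK_UN)
        return data

write_segment(b"\x00\x01\x02")
```

Caveat: flock locks are advisory, so this only works if every app agrees to take the lock before touching the data file.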

How would you do this without Kubernetes?

Random reply here (came across this thread when Googling something similar-ish), but as an answer to the “how to use filesystems” approach:

File renames on Linux are atomic within the same mount point. This means that if you write to a new file and then rename the new file onto the old one, a given process will only ever see the old contents or the new contents, never a mixed state between the two. You then only need to choose when the readers (re-)open the file.
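A minimal sketch of that write-then-rename pattern in Python (paths invented for the example; the temp file must be created on the same mount as the target, or the rename stops being atomic):

```python
import os
import tempfile

# Stand-in for the shared mount, e.g. a hostPath tmpfs (hypothetical path).
shared_dir = tempfile.mkdtemp()
final_path = os.path.join(shared_dir, "current.bin")

def publish(payload: bytes) -> None:
    # Write to a temp file on the SAME mount, then atomically swap it into place.
    # rename(2) guarantees readers see either the old file or the new one, whole.
    fd, tmp_path = tempfile.mkstemp(dir=shared_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())        # ensure the bytes are durable before the swap
    os.rename(tmp_path, final_path)

publish(b"old")
publish(b"new")

# A reader that (re-)opens the path always gets a complete version.
with open(final_path, "rb") as f:
    latest = f.read()
```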