AI Infrastructure

Hello,

We are designing an AI architecture for one of our clients on five physical HPE ProLiant Gen11 servers. Two of these servers are equipped with NVIDIA H100 GPUs.

In your opinion, what kind of Kubernetes design should we use for this setup? Could you recommend a Kubernetes version, a CNI plugin, and a storage solution?