I’m running an Elasticsearch cluster (as a StatefulSet) in GKE, and I’d like to write to it from Dataproc, Google’s managed Spark service. So far I’ve been exposing the ES cluster through a LoadBalancer, but this appears to hurt performance: the Spark ES client performs better when it can choose the actual ES node to write to, and hiding ES behind the LoadBalancer makes that impossible.
The only option I can think of is creating one NodePort service per ES node, and pointing the ES client at the node IPs. This seems wasteful, however, and if the IPs change for any reason it will break.
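To make the per-node idea concrete, here is a rough sketch of the Service manifests it would need, generated with Python. The StatefulSet name `elasticsearch`, the replica count of 3, port 9200, and the base node port 30920 are placeholder assumptions rather than my real values, and the helper name is made up for illustration. It relies on the `statefulset.kubernetes.io/pod-name` label that the StatefulSet controller puts on each pod, so each Service selects exactly one pod.

```python
import json


def per_pod_nodeport_services(statefulset_name, replicas, port=9200, base_node_port=30920):
    """Build one NodePort Service manifest per StatefulSet pod (hypothetical helper)."""
    services = []
    for ordinal in range(replicas):
        # StatefulSet pods are named <statefulset-name>-<ordinal>.
        pod_name = f"{statefulset_name}-{ordinal}"
        services.append({
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {"name": f"{pod_name}-external"},
            "spec": {
                "type": "NodePort",
                # This label is set on each pod by the StatefulSet controller,
                # so the selector matches a single pod.
                "selector": {"statefulset.kubernetes.io/pod-name": pod_name},
                "ports": [{
                    "port": port,
                    "targetPort": port,
                    # Assumed values; they must fall inside the cluster's
                    # NodePort range (30000-32767 by default).
                    "nodePort": base_node_port + ordinal,
                }],
            },
        })
    return services


if __name__ == "__main__":
    # Print each manifest as JSON, which kubectl accepts as well as YAML.
    for service in per_pod_nodeport_services("elasticsearch", 3):
        print(json.dumps(service, indent=2))
```

Even with this in place, the Spark side would still have to be pointed at the node IPs and these ports, which is the part that breaks if the IPs change.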
Is there a better way to do what I want? Basically, for each pod in the StatefulSet, I want a stable IP that is visible outside the cluster and consistently routes to that pod.