Custom Operator supporting Node Drain


I’m part of the team helping to build a Kubernetes Operator for OpenSearch.

One of the flows we thought about supporting is a Kubernetes worker node drain (kubectl drain). If the drain is done without any support from the operator, there might be data loss or some loss of data availability. We therefore want to intercept the drain, respond to it by moving all data off the node, wait for that to finish (this might take hours in extreme cases), and only then proceed with the node drain.

I couldn’t find any mechanism that supports this kind of interception, particularly one that might take that long.

I thought about adding a field to the controller's CRD (the OpenSearchCluster kind) that contains the list of nodes you wish to drain. When we see a node added to that list, we'll drain it from the OpenSearch perspective (move its data out). Only then will the user see, via the resource status, that this has succeeded and that they can proceed with calling kubectl drain.
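
To make the idea more concrete, here is a rough sketch in Python of what one reconcile step might look like. It is not an actual implementation: the names `reconcile_drain`, `drain_nodes`, `pods_by_worker`, `shards_by_pod` and the phase strings are all hypothetical, and the real operator would read these values from the Kubernetes and OpenSearch APIs rather than take them as arguments. The sketch only shows the decision logic: which OpenSearch nodes to exclude from shard allocation, and what to report in the resource status for each worker node.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical phases written to the resource status, one per worker node.
DRAINING = "Draining"            # data is still being moved off the node
READY = "ReadyForDrain"          # no shards left; kubectl drain is safe


@dataclass
class ReconcileResult:
    # OpenSearch node (pod) names that should be excluded from allocation,
    # e.g. via a setting such as cluster.routing.allocation.exclude._name
    # (assumption: the operator uses allocation filtering to move data).
    exclude_names: List[str] = field(default_factory=list)
    # Phase per worker node, to be surfaced in the resource status.
    node_phase: Dict[str, str] = field(default_factory=dict)


def reconcile_drain(
    drain_nodes: List[str],
    pods_by_worker: Dict[str, List[str]],
    shards_by_pod: Dict[str, int],
) -> ReconcileResult:
    """Compute the desired exclusions and status for the requested drains."""
    result = ReconcileResult()
    for worker in drain_nodes:
        # OpenSearch pods currently scheduled on this worker node.
        pods = pods_by_worker.get(worker, [])
        result.exclude_names.extend(pods)
        # The worker is ready once none of its pods hold any shards.
        remaining = sum(shards_by_pod.get(pod, 0) for pod in pods)
        result.node_phase[worker] = DRAINING if remaining > 0 else READY
    result.exclude_names.sort()
    return result


if __name__ == "__main__":
    result = reconcile_drain(
        drain_nodes=["worker-2"],
        pods_by_worker={
            "worker-1": ["os-data-0"],
            "worker-2": ["os-data-1", "os-data-2"],
        },
        shards_by_pod={"os-data-0": 12, "os-data-1": 3, "os-data-2": 0},
    )
    print(result.exclude_names)  # ['os-data-1', 'os-data-2']
    print(result.node_phase)     # {'worker-2': 'Draining'}
```

The controller would call something like this on every reconcile until the phase for the node becomes `ReadyForDrain`, at which point the user knows it is safe to run kubectl drain.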

I was wondering whether you have any input on this.