Storing information about external resources in a Kubernetes controller

Hello there.

I’ve got a question about good practices for writing Kubernetes controllers. I’m working on a controller that reconciles external resources (namely GitHub entities like webhooks and deploy keys). Once the controller creates an external resource, it needs to store that resource’s identifier for future actions, such as updating or deleting it when the related custom resource changes. In such cases, is there a common practice for storing external information so that the controller can retrieve it consistently?

Initially, I considered making this controller “stateless”, so that it would call the GitHub API on every reconcile iteration to list resources and compare them with the desired state. However, in some cases I’d need to deal with pagination and could run into rate limits. So I decided to store the identifiers of those resources in the status subresource instead. That led to race conditions: apparently, two consecutive reconciliations ran before the status was updated, which caused some of those GitHub entities to be created twice.
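To make the race concrete, here is a minimal, self-contained Go sketch of the flow described above. It is not my actual controller: `fakeGitHub`, `Status`, and `reconcile` are hypothetical stand-ins, and nothing here calls the real GitHub or Kubernetes APIs. It only shows how two reconciles that both observe an empty status each end up creating a webhook.

```go
package main

import "fmt"

// fakeGitHub is a hypothetical in-memory stand-in for the GitHub API.
type fakeGitHub struct {
	nextID int64
	hooks  map[int64]string // webhook ID -> repository
}

// CreateHook registers a new webhook and returns its ID.
func (g *fakeGitHub) CreateHook(repo string) int64 {
	g.nextID++
	g.hooks[g.nextID] = repo
	return g.nextID
}

// Status is a hypothetical mirror of what is kept in the status subresource.
type Status struct {
	WebhookID int64 // 0 means "not created yet"
}

// reconcile creates the webhook only when the observed status has no ID,
// and returns the status that would be written back.
func reconcile(gh *fakeGitHub, repo string, observed Status) Status {
	if observed.WebhookID != 0 {
		return observed // already created, nothing to do
	}
	return Status{WebhookID: gh.CreateHook(repo)}
}

// simulateStaleRead runs two reconciles that both see the same empty status,
// as when the second one starts before the first status update is visible.
func simulateStaleRead() int {
	gh := &fakeGitHub{hooks: map[int64]string{}}
	stale := Status{}
	reconcile(gh, "org/repo", stale)
	reconcile(gh, "org/repo", stale)
	return len(gh.hooks)
}

// simulateFreshRead runs the second reconcile with the status produced by
// the first one, which is the behaviour I expected.
func simulateFreshRead() int {
	gh := &fakeGitHub{hooks: map[int64]string{}}
	st := reconcile(gh, "org/repo", Status{})
	reconcile(gh, "org/repo", st)
	return len(gh.hooks)
}

func main() {
	fmt.Println("webhooks created with stale status:", simulateStaleRead())
	fmt.Println("webhooks created with fresh status:", simulateFreshRead())
}
```

With the stale read, the sketch ends up with two webhooks for the same repository; with the fresh read, only one.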

I’d like to know whether there are common patterns for dealing with these cases. I saw that the Kubernetes Cluster Autoscaler creates a ConfigMap to store information about the last observed status of nodes, whereas the AWS ALB ingress controller stores the ID of the ALB in the resource status. Would using an external data store for this information be too exotic?

Regards and thanks in advance.