Enterprise Backup and Recovery (DR) Options

What are the most current methods for backing up / restoring k8s in the enterprise? I’d like to discuss this at multiple levels of DR (deployments, infrastructure, persistent data, etc.).

I’ve come across a few GitHub gists of shell scripts that dump YAML configurations, but a lot of them seem to be outdated. I also really like Heptio Ark, but it seems to be more focused on cloud-provided k8s and S3 than on bare metal.

Has anyone in the community found anything cool for backups, or is someone working on a project worth sharing? Please share your current strategies as well as the pros / cons of your approach.


Hi, I’m the lead for Heptio Ark. We are in the process of updating our documentation to make it clearer that Ark can work anywhere, including bare metal/on-premises.

Ark downloads copies of the Kubernetes objects as JSON and stores all of this in a .tar.gz file. We upload the file to an object store, such as S3, Google Cloud Storage, or Azure Blob Storage. If you have something on-premises that has an API compatible with one of these three, such as Minio or Ceph for S3, you can have Ark upload the backup files there. It’s also possible to write additional plugins to support uploading into other object stores.
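
To make the format concrete, here is a minimal Python sketch of the general idea: serialize each object as JSON and pack the files into a single .tar.gz. This is not Ark's code, and the `<kind>/<namespace>/<name>.json` layout, the `_cluster` placeholder, and the function names are assumptions made up for illustration. Fetching objects from the API server and uploading the archive to an object store are left out.

```python
import io
import json
import tarfile


def pack_objects(objects):
    """Pack Kubernetes-style objects (plain dicts) into an in-memory .tar.gz."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for obj in objects:
            meta = obj["metadata"]
            # Hypothetical layout: <kind>/<namespace>/<name>.json
            # Cluster-scoped objects have no namespace, so use a placeholder.
            path = "{}/{}/{}.json".format(
                obj["kind"].lower(), meta.get("namespace", "_cluster"), meta["name"]
            )
            data = json.dumps(obj, indent=2, sort_keys=True).encode("utf-8")
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_archive(blob):
    """Return the sorted member paths stored in the archive."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return sorted(tar.getnames())


def read_object(blob, path):
    """Read one object back out of the archive as a dict."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return json.loads(tar.extractfile(path).read().decode("utf-8"))
```

The resulting bytes could then be handed to whatever S3-compatible client you use on-premises; that step depends on your storage and is not shown here.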

For persistent data, Ark currently supports snapshotting and restoring PersistentVolumes for AWS, GCP, and Azure. This is also plugin-based, so it’s possible to write new plugins to support additional PersistentVolume types.

We are also actively working on integrating Restic into Ark. This will allow us to back up any Kubernetes volume, not just cloud provider ones. You’ll be able to indicate which volumes for which pods you’d like backed up, and we’ll use Restic to upload snapshots of these volumes into object storage.
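
As a rough sketch of what opting volumes in per pod could look like, the Python snippet below reads a comma-separated list of volume names from a pod annotation and keeps only the names the pod actually defines. The annotation key `example.com/backup-volumes` is hypothetical and chosen only for illustration; the mechanism that ships with the Restic integration may look different.

```python
# Hypothetical annotation key, used only for this illustration.
BACKUP_ANNOTATION = "example.com/backup-volumes"


def volumes_to_back_up(pod):
    """Return the volume names a pod (plain dict) has opted in for backup.

    Names listed in the annotation that do not match a volume in the
    pod spec are ignored.
    """
    annotations = pod.get("metadata", {}).get("annotations") or {}
    requested = [
        name.strip()
        for name in annotations.get(BACKUP_ANNOTATION, "").split(",")
        if name.strip()
    ]
    defined = {v["name"] for v in pod.get("spec", {}).get("volumes", [])}
    return [name for name in requested if name in defined]
```

A pod without the annotation yields an empty list, so nothing is backed up unless it is requested explicitly.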



Thanks Andy! All great info about Ark! My post definitely isn’t a knock on Ark - just looking at what else people are using out there. It’d also be nice to hear of some success stories with users of Ark in production.

P.S. How are you everywhere all at once? lol, keep up the good work!


Hi Jim, no worries at all! I didn’t think you were knocking on Ark. I wanted to make sure it was clear (since our documentation is a bit misleading at the moment) that you can use it anywhere, assuming you have appropriate storage for your backups.

I’ll try to reach out to some of our users and see if they can post some stories here.

P.S. How are you everywhere all at once?

I have @castrojo to thank for telling me about this topic!

Chiming in here because of my devotion to, and love for, Heptio’s Ark. I’m running it (or have implemented it) in production for customers on both GKE and AWS. My bare-metal attempts with Ark were far more successful on non-air-gapped deployments, but I don’t believe that to be a limitation of Ark, any more so than other cloud-native solutions. Being able to ship mirrors of clusters with only a brief install and restore-only definition in the target cluster is a game-changer, and, IMHO, should be the de facto approach for on-demand single-purpose cluster creation. Kudos to Andy, Steve, Nolan, and the entire team at Heptio for a superlative DR and mirroring offering.


Really looking forward to the Restic integration in Heptio’s Ark! Any idea when it will be available for testing?

Hi @ValentinNC, we plan to release an alpha of v0.9.0 with Restic integration in the next week or two. We would love as much real-world testing as possible!

@jimangel thanks for bringing this up. I lead product at Kasten (https://kasten.io) and we are specifically focused on data management (including backup and recovery) for the enterprise.

At a high level, the K10 platform is our enterprise offering focused on enabling operations at scale through capabilities like application discovery, policy-based management, compliance monitoring, automated workflows, and cross-cluster/environment migrations. In addition to deployments on major cloud providers, we work with a number of clients running K8s on-prem and deploying stateful applications using a variety of common block and object store backends that you find in such environments.

In addition to broad support for volume-level operations, we are focused on enabling application-specific operations (think of backup and recovery in the context of specific database backends or complex custom applications). The work behind supporting these custom recipes is backed by an open-source project called Kanister (https://kanister.io), which provides a framework for authoring blueprints as well as a repository of frequently used reference blueprints. You can check it out on GitHub.

Happy to chat more on either of these.

