ETCD backup issues

Hi,

I am using the commands below for etcd backup and restore. Of course, they worked successfully on a cluster created with kubeadm, but they are not working on the cluster created the hard way. Please provide your suggestions.

Backup:

ETCDCTL_API=3 etcdctl snapshot save mysnapshot.db --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key

Restore:

ETCDCTL_API=3 etcdctl snapshot restore mysnapshot.db --name ip-172-31-27-180 --initial-cluster ip-172-31-27-180=https://172.31.27.180:2380 --initial-advertise-peer-urls https://172.31.27.180:2380

List members:
ETCDCTL_API=3 etcdctl member list --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key


This doesn’t answer your question directly, but I wouldn’t recommend etcd backup as a way of protecting a Kubernetes cluster. Look at https://stateful.kubernetes.sh/ for other options.

I got it working on mine by creating a new data directory:

sudo ETCDCTL_API=3 etcdctl snapshot restore latest-snapshot-test.db \
     --name master \
     --initial-cluster master=http://127.0.0.1:2380 \
     --initial-cluster-token etcd-cluster-1 \
     --initial-advertise-peer-urls http://127.0.0.1:2380 \
     --data-dir=/var/lib/etcd-new

Then I updated etcd.service to use this new directory, restarted it, and it worked. This was just me testing while running through the hard-way steps; a rough sketch of that service change is below.
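
For what it's worth, here is a minimal sketch of that service change, assuming etcd runs as a systemd unit at /etc/systemd/system/etcd.service and the restore used --data-dir=/var/lib/etcd-new (both paths are assumptions, adjust them to your setup):

    # Point the unit's --data-dir flag at the restored directory (hypothetical original path)
    sudo sed -i 's|--data-dir=/var/lib/etcd|--data-dir=/var/lib/etcd-new|' /etc/systemd/system/etcd.service

    # Reload systemd and restart etcd so it serves the restored data
    sudo systemctl daemon-reload
    sudo systemctl restart etcd

    # Confirm the service came back up
    sudo systemctl status etcd --no-pager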

ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt \
     --name=master \
     --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key \
     --data-dir /var/lib/etcd-from-backup \
     --initial-cluster=master=https://127.0.0.1:2380 \
     --initial-cluster-token etcd-cluster-1 \
     --initial-advertise-peer-urls=https://127.0.0.1:2380 \
     snapshot restore /tmp/snapshot-pre-boot.db

After running the above command, depending on whether etcd runs as a static pod or as a service, make sure to update its configuration with the new initial-cluster-token and data directory path, then restart it.

FYI, for authentication and authorization of new users -

  1. Create a CSR with openssl -

     openssl req -new -newkey rsa:4096 -nodes -keyout ops-k8s.key -out ops-k8s.csr -subj "/CN=ops/O=devops"
    

    The above command creates a CSR for the user ops.

  2. Copy the output of -

     cat ops-k8s.csr | base64 | tr -d '\n'
    
  3. Paste it into the csr.yaml file (the template below indicates where to paste it)

  4. Create the CSR Kubernetes object -

     kubectl apply -f csr.yaml
    
  5. Approve the CSR -

     kubectl certificate approve ops-k8s-access
    
  6. Now get the approved certificate -

     kubectl get csr ops-k8s-access -o jsonpath='{.status.certificate}' | base64 --decode > ops-k8s-access.crt
    
  7. Now, to create the kubeconfig, we need the CA cert and the ops cert (the one we got above) -

     kubectl config view -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' --raw | base64 --decode - > k8s-ca.crt
    
  8. Running the command below will create a kubeconfig template with the current CA's certificate -

     kubectl config set-cluster $(kubectl config view -o jsonpath='{.clusters[0].name}') --server=$(kubectl config view -o jsonpath='{.clusters[0].cluster.server}') --certificate-authority=k8s-ca.crt --kubeconfig=ops-k8s-config --embed-certs
    
  9. This will add ops cert into the kubeconfig -

     kubectl config set-credentials ops --client-certificate=ops-k8s-access.crt --client-key=ops-k8s.key --embed-certs --kubeconfig=ops-k8s-config
    
  10. This will add context mapping for the ops user

    kubectl config set-context ops --cluster=$(kubectl config view -o jsonpath='{.clusters[0].name}') --namespace=default --user=ops --kubeconfig=ops-k8s-config
    
  11. kubectl config use-context ops --kubeconfig=ops-k8s-config

#Now pass on "ops-k8s-config" as the kubeconfig file for the ops team!

  1. Apply the Role and RoleBinding -
    kubectl apply -f role.yaml
    kubectl apply -f rolebinding.yaml

Role:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: create-pod
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["create", "delete"]

RoleBinding:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: role-grantor-binding
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: create-pod
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: ops
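
To sanity-check the setup once the Role and RoleBinding are applied, you can ask the API server whether the ops user is allowed to create pods, for example:

    # From the admin kubeconfig, impersonating the new user
    kubectl auth can-i create pods --namespace=default --as=ops

    # Or directly with the new kubeconfig
    kubectl auth can-i create pods --namespace=default --kubeconfig=ops-k8s-config

Both should print "yes" if everything is wired up correctly.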

CSR object template

apiVersion: certificates.k8s.io/v1beta1
kind: CertificateSigningRequest
metadata:
  name: ops-k8s-access
spec:
  groups:
  - system:authenticated
  request: #replace with output from shell command: cat ops-k8s.csr | base64 | tr -d '\n'
  usages:
  - client auth
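
Note: the template above uses the v1beta1 CSR API, which newer clusters have removed. If kubectl rejects it, the certificates.k8s.io/v1 form should work instead; it requires an explicit signerName (a sketch, using the same name and request as above):

apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: ops-k8s-access
spec:
  signerName: kubernetes.io/kube-apiserver-client  # built-in signer for client-auth certs
  request: #replace with output from shell command: cat ops-k8s.csr | base64 | tr -d '\n'
  usages:
  - client auth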

ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --name=master --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --data-dir /var/lib/etcd-from-backup --initial-cluster=master=https://127.0.0.1:2380 --initial-cluster-token etcd-cluster-1 --initial-advertise-peer-urls=https://127.0.0.1:2380 snapshot restore /tmp/snapshot-pre-boot.db

Hi,

Please check out the following project that simplifies etcd backup. It obviates the need to explicitly create etcd snapshots and also provides the benefit of automatically backing up the snapshot file to any S3 bucket.

https://github.com/catalogicsoftware/kubedr

Just as a full disclosure, the project is released by Catalogic Software where I work.

Thanks,

Raghu

BackUp

ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt \
     --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key \
     snapshot save /tmp/snapshot-pre-boot.db

Restore

ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt \
     --name=master \
     --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key \
     --data-dir /var/lib/etcd-from-backup \
     --initial-cluster=master=https://127.0.0.1:2380 \
     --initial-cluster-token etcd-cluster-1 \
     --initial-advertise-peer-urls=https://127.0.0.1:2380 \
     snapshot restore /tmp/snapshot-pre-boot.db

Modify /etc/kubernetes/manifests/etcd.yaml

Update the etcd pod to use the new data directory and cluster token by modifying the pod definition file at /etc/kubernetes/manifests/etcd.yaml. When this file is updated, the etcd pod is automatically re-created, as this is a static pod placed under the /etc/kubernetes/manifests directory.

Update --data-dir to use new target location

--data-dir=/var/lib/etcd-from-backup

Update new initial-cluster-token to specify new cluster

--initial-cluster-token=etcd-cluster-1

Update volumes and volume mounts to point to new path

    volumeMounts:
    - mountPath: /var/lib/etcd-from-backup
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /var/lib/etcd-from-backup
      type: DirectoryOrCreate
    name: etcd-data
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
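
For orientation, those two flags end up in the container command section of the same etcd.yaml, roughly like this (the remaining flags, IPs and certificate paths are cluster-specific and left out of this sketch):

spec:
  containers:
  - name: etcd
    command:
    - etcd
    - --data-dir=/var/lib/etcd-from-backup      # new data directory from the restore
    - --initial-cluster-token=etcd-cluster-1    # new cluster token used in the restore
    # ...keep the rest of the existing etcd flags (cert paths, client/peer URLs) unchanged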

This one worked for me.

ETCDCTL_API=3 etcdctl \
  --endpoints=https://[127.0.0.1]:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --name=master \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  --data-dir /var/lib/etcd-from-backup \
  --initial-cluster=master=https://127.0.0.1:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls=https://127.0.0.1:2380 \
  snapshot restore /tmp/snapshot-pre-boot.db

Thank you, the solution worked for me. I was also able to verify it:
ETCDCTL_API=3 etcdctl --write-out=table snapshot status /tmp/snapshot-pre-boot.db

Thank you, I like to have everything on one line:

ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot save /tmp/etcd-backup.db

Then to verify:

ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot status -w table /tmp/etcd-backup.db

…To complete your procedure

See if the container process is back up:

docker ps -a | grep etcd


See if the cluster members have been recreated:

ETCDCTL_API=3 etcdctl member list --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --endpoints=127.0.0.1:2379


See if the pods, deployments and services have been recreated:

kubectl get pods,svc,deployments

Copy paste the complete script below… Then run the command to verify the same.

cat << 'EOF' > etcd_snapshot_backup.sh

#How can I save the etcd-backup Snapshot in a single command
#Author :Lindos_tech_geeks

echo -n "Please enter the location to save the backup : ";read loc;ETCDCTL_API=3 etcdctl snapshot save $loc --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key
echo "You are verifying the output of the saved snapshot $loc"
ETCDCTL_API=3 etcdctl --write-out=table snapshot status $loc

EOF

#Then run the command

sh etcd_snapshot_backup.sh

If you are using a browser-based shell, the cat-based creation of the file sometimes adds junk characters; in that case, copy only the script contents (the lines between the EOF markers) into a vim editor instead.

Sample O/P

master $ vim etcd_snapshot_backup.sh

master $ sh etcd_snapshot_backup.sh

Please enter the location to save the backup : /root/etcdbackup.db
Snapshot saved at /root/etcdbackup.db
You are verifying the output of the saved snapshot /root/etcdbackup.db
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| 4743dec6 |     3245 |       1466 |     3.4 MB |
+----------+----------+------------+------------+
master $

Here, this works across an etcd cluster of 3 too:

I created these steps and tested them myself when I couldn't find anything on the internet.

Abhinav
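
For context, the general shape of an etcd restore on a 3-member cluster is to run the snapshot restore once on each member, each with its own --name and peer URL but the same --initial-cluster list and token. The member names and IPs below are placeholders, not taken from the post above:

    # Run on the first member (repeat on etcd-1 and etcd-2 with their own --name and peer URL)
    ETCDCTL_API=3 etcdctl snapshot restore /tmp/snapshot-pre-boot.db \
      --name etcd-0 \
      --initial-cluster etcd-0=https://10.0.0.10:2380,etcd-1=https://10.0.0.11:2380,etcd-2=https://10.0.0.12:2380 \
      --initial-cluster-token etcd-cluster-1 \
      --initial-advertise-peer-urls https://10.0.0.10:2380 \
      --data-dir /var/lib/etcd-from-backup

    # Then point each member's etcd config at its new data directory and restart etcd on all three nodes.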

Hi Suman,

What is it that I am doing incorrectly here?
--data-dir: is this the value of the location where the backup snapshot was saved?
thanks

Hello, @swaroopcs88
Use server.key and server.crt instead of apiserver-etcd-client.crt and apiserver-etcd-client.key.
The server.key and server.crt files are located at /etc/kubernetes/pki/etcd/.

Thank you, Tej. What is the --data-dir value? Will it be the location where the snapshot was saved?
I am not seeing --initial-cluster-token in the etcd.yaml file.
Please advise.
Update new initial-cluster-token to specify new cluster

--initial-cluster-token=etcd-cluster-1


Hi all,
Can you please check and advise here?
Thanks,
Swaroop

Never mind, I figured it out.
Thank you!

Simple backup and restore that works!

ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/opt/ca.crt --cert=/opt/etcd-client.crt --key=/opt/etcd-client.key snapshot save /srv/data/etcd-snapshot.db

ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/opt/ca.crt --cert=/opt/KUIN00601/etcd-client.crt --key=/opt/etcd-client.key snapshot restore /srv/data/etcd-snapshot-previous.db