Kubernetes 1.8: Hidden Gems - Volume Snapshotting

In this Hidden Gems blog post, Luke looks at the new volume snapshotting functionality in Kubernetes and how cluster administrators can use this feature to take and restore snapshots of their data.

Introduction

In Kubernetes 1.8, volume snapshotting has been released as a prototype. It is external to core Kubernetes whilst it is in the prototype phase, but you can find the project under the snapshot subdirectory of the kubernetes-incubator/external-storage repository. For a detailed explanation of the implementation of volume snapshotting, read the design proposal here. The prototype currently supports GCE PD, AWS EBS, OpenStack Cinder and Kubernetes hostPath volumes. Note that aside from hostPath volumes, the logic for snapshotting a volume is implemented by cloud providers; the purpose of volume snapshotting in Kubernetes is to provide a common API for negotiating with different cloud providers in order to take and restore snapshots.

The best way to get an overview of volume snapshotting in Kubernetes is by going through an example. In this post, we are going to spin up a Kubernetes 1.8 cluster on GKE, deploy snapshot-controller and snapshot-provisioner and take and restore a snapshot of a GCE PD.

For reproducibility, I am using Git commit hash b1d5472a7b47777bf851cfb74bfaf860ad49ed7c of the kubernetes-incubator/external-storage repository.

Package

The first thing we need to do is compile and package both snapshot-controller and snapshot-provisioner into Docker containers. Make sure you have installed Go and configured your GOPATH correctly.

$ go get -d github.com/kubernetes-incubator/external-storage
$ cd $GOPATH/src/github.com/kubernetes-incubator/external-storage/snapshot
$ # Check out the fixed revision mentioned above
$ git checkout b1d5472a7b47777bf851cfb74bfaf860ad49ed7c
$ GOOS=linux GOARCH=amd64 go build -o _output/bin/snapshot-controller-linux-amd64 cmd/snapshot-controller/snapshot-controller.go
$ GOOS=linux GOARCH=amd64 go build -o _output/bin/snapshot-provisioner-linux-amd64 cmd/snapshot-pv-provisioner/snapshot-pv-provisioner.go

You can then use the following Dockerfiles to package both snapshot-controller and snapshot-provisioner. We run apk add --no-cache ca-certificates to add root certificates to the container images, which are needed to verify TLS connections to the cloud provider's API. To avoid the baked-in certificates going stale, we could alternatively pass them into the containers by mounting the hostPath /etc/ssl/certs to the same location in the containers.
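
As a sketch of that alternative (the field names follow a standard Pod spec; the container name and image match the Deployment used later, and the path assumes the node keeps its CA bundle under /etc/ssl/certs):

```yaml
# Sketch: mount the node's CA bundle rather than baking certificates
# into the image. Assumes the node stores its CA bundle at /etc/ssl/certs.
    spec:
      containers:
      - name: snapshot-controller
        image: dippynark/snapshot-controller
        volumeMounts:
        - name: ca-certs
          mountPath: /etc/ssl/certs
          readOnly: true
      volumes:
      - name: ca-certs
        hostPath:
          path: /etc/ssl/certs
```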

# Dockerfile.controller
FROM alpine:3.6

RUN apk add --no-cache ca-certificates

COPY _output/bin/snapshot-controller-linux-amd64 /usr/bin/snapshot-controller

ENTRYPOINT ["/usr/bin/snapshot-controller"]

# Dockerfile.provisioner
FROM alpine:3.6

RUN apk add --no-cache ca-certificates

COPY _output/bin/snapshot-provisioner-linux-amd64 /usr/bin/snapshot-provisioner

ENTRYPOINT ["/usr/bin/snapshot-provisioner"]

$ docker build -t dippynark/snapshot-controller:latest . -f Dockerfile.controller
$ docker build -t dippynark/snapshot-provisioner:latest . -f Dockerfile.provisioner
$ docker push dippynark/snapshot-controller:latest
$ docker push dippynark/snapshot-provisioner:latest

Deploy

We will now create a cluster on GKE using gcloud.

$ gcloud container clusters create snapshot-demo --cluster-version 1.8.3-gke.0
Creating cluster snapshot-demo...done.
Created [https://container.googleapis.com/v1/projects/jetstack-sandbox/zones/europe-west1-b/clusters/snapshot-demo].
kubeconfig entry generated for snapshot-demo.
NAME           ZONE            MASTER_VERSION  MASTER_IP      MACHINE_TYPE   NODE_VERSION  NUM_NODES  STATUS
snapshot-demo  europe-west1-b  1.8.3-gke.0     35.205.77.138  n1-standard-1  1.8.3-gke.0   3          RUNNING

Snapshotting requires two extra resources, VolumeSnapshot and VolumeSnapshotData. For an overview of the lifecycle of these two resources, take a look at the user guide in the project itself. We will look at the functionality of each of these resources further down the page, but the first step is to register them with the API server. This is done using CustomResourceDefinitions. snapshot-controller creates a CustomResourceDefinition for each of VolumeSnapshot and VolumeSnapshotData when it starts up, so some of the work is taken care of for us. snapshot-controller also watches for VolumeSnapshot resources and takes snapshots of the volumes they reference. To allow us to restore our snapshots, we will deploy snapshot-provisioner as well.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: snapshot-controller-runner
  namespace: kube-system
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: snapshot-controller-role
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["list", "watch", "create", "update", "patch"]
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["create", "list", "watch", "delete"]
  - apiGroups: ["volumesnapshot.external-storage.k8s.io"]
    resources: ["volumesnapshots"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["volumesnapshot.external-storage.k8s.io"]
    resources: ["volumesnapshotdatas"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: snapshot-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: snapshot-controller-role
subjects:
- kind: ServiceAccount
  name: snapshot-controller-runner
  namespace: kube-system
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: snapshot-controller
  namespace: kube-system
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: snapshot-controller
    spec:
      serviceAccountName: snapshot-controller-runner
      containers:
      - name: snapshot-controller
        image: dippynark/snapshot-controller
        imagePullPolicy: Always
        args:
        - -cloudprovider=gce
      - name: snapshot-provisioner
        image: dippynark/snapshot-provisioner
        imagePullPolicy: Always
        args:
        - -cloudprovider=gce

In this case we have specified -cloudprovider=gce, but you can also use aws or openstack depending on your environment. For these other cloud providers there may be additional parameters you need to set to configure the necessary authorisation; examples of how to do this can be found here. hostPath support is enabled by default, but it requires snapshot-controller and snapshot-provisioner to run on the same node as the hostPath volume you want to snapshot and restore, so it should only be used for testing on single-node development clusters. For an example of how to deploy snapshot-controller and snapshot-provisioner to take and restore hostPath volume snapshots for a particular directory, see here. For a walkthrough of taking and restoring a hostPath volume snapshot, see here.

We have also defined a new ServiceAccount to which we have bound a custom ClusterRole. This is only needed for RBAC-enabled clusters. If you have not enabled RBAC in your cluster, you can omit the ServiceAccount, ClusterRole and ClusterRoleBinding and remove the serviceAccountName field from the snapshot-controller Deployment. If you have enabled RBAC, notice that we have authorised the ServiceAccount to create, list, watch and delete CustomResourceDefinitions. This is so that snapshot-controller can set them up for our two new resources. Since snapshot-controller only needs these CustomResourceDefinition permissions temporarily on startup, it would be better to remove them and have administrators create the two CustomResourceDefinitions manually. Once snapshot-controller is running, you will be able to see the created CustomResourceDefinitions.
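
If you do strip those permissions, the two CustomResourceDefinitions have to be created by hand. A rough sketch of what those manifests might look like, based on the names reported by kubectl get crd below (the exact spec that snapshot-controller registers may differ, so check the controller source before relying on this):

```yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: volumesnapshots.volumesnapshot.external-storage.k8s.io
spec:
  group: volumesnapshot.external-storage.k8s.io
  version: v1
  scope: Namespaced        # VolumeSnapshots live in a namespace
  names:
    kind: VolumeSnapshot
    plural: volumesnapshots
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: volumesnapshotdatas.volumesnapshot.external-storage.k8s.io
spec:
  group: volumesnapshot.external-storage.k8s.io
  version: v1
  scope: Cluster           # VolumeSnapshotData is not namespaced
  names:
    kind: VolumeSnapshotData
    plural: volumesnapshotdatas
```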

$ kubectl get crd
NAME                                                         AGE
volumesnapshotdatas.volumesnapshot.external-storage.k8s.io   1m
volumesnapshots.volumesnapshot.external-storage.k8s.io       1m

To see the full definitions of these resources you can run kubectl get crd -o yaml. Note that VolumeSnapshot has a scope of Namespaced whereas VolumeSnapshotData is cluster-scoped. We can now interact with our new resource types.

$ kubectl get volumesnapshot,volumesnapshotdata
No resources found.

Looking at the logs for both snapshot containers we can see that they are working correctly.

$ kubectl get pods -n kube-system
NAME                                                      READY     STATUS    RESTARTS   AGE
...
snapshot-controller-66f7c56c4-h7cpf                       2/2       Running   0          1m
$ kubectl logs snapshot-controller-66f7c56c4-h7cpf -n kube-system -c snapshot-controller
I1104 11:38:53.551581       1 gce.go:348] Using existing Token Source &oauth2.reuseTokenSource{new:google.computeSource{account:""}, mu:sync.Mutex{state:0, sema:0x0}, t:(*oauth2.Token)(nil)}
I1104 11:38:53.553988       1 snapshot-controller.go:127] Register cloudprovider %sgce-pd
I1104 11:38:53.553998       1 snapshot-controller.go:93] starting snapshot controller
I1104 11:38:53.554050       1 snapshot-controller.go:168] Starting snapshot controller
$ kubectl logs snapshot-controller-66f7c56c4-h7cpf -n kube-system -c snapshot-provisioner
I1104 11:38:57.565797       1 gce.go:348] Using existing Token Source &oauth2.reuseTokenSource{new:google.computeSource{account:""}, mu:sync.Mutex{state:0, sema:0x0}, t:(*oauth2.Token)(nil)}
I1104 11:38:57.569374       1 snapshot-pv-provisioner.go:284] Register cloudprovider %sgce-pd
I1104 11:38:57.585940       1 snapshot-pv-provisioner.go:267] starting PV provisioner volumesnapshot.external-storage.k8s.io/snapshot-promoter
I1104 11:38:57.586017       1 controller.go:407] Starting provisioner controller be8211fa-c154-11e7-a1ac-0a580a200004!

Snapshot

Let’s now create the PersistentVolumeClaim we are going to snapshot.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gce-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi

Note that this uses the default StorageClass on GKE, which will dynamically provision a GCE PD PersistentVolume. Let’s now create a Pod that writes some data to the volume. We will take a snapshot of the data and restore it later.

apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  restartPolicy: Never
  containers:
  - name: busybox
    image: busybox
    command:
    - "/bin/sh"
    - "-c"
    - "while true; do date >> /tmp/pod-out.txt; sleep 1; done"
    volumeMounts:
    - name: volume
      mountPath: /tmp
  volumes:
  - name: volume
    persistentVolumeClaim:
      claimName: gce-pvc

The Pod appends the current date and time to a file stored on our GCE PD every second. We can use cat to inspect the file.

$ kubectl exec -it busybox cat /tmp/pod-out.txt
Sat Nov  4 11:41:30 UTC 2017
Sat Nov  4 11:41:31 UTC 2017
Sat Nov  4 11:41:32 UTC 2017
Sat Nov  4 11:41:33 UTC 2017
Sat Nov  4 11:41:34 UTC 2017
Sat Nov  4 11:41:35 UTC 2017
$

We are now ready to take a snapshot. Once we create the VolumeSnapshot resource below, snapshot-controller will attempt to create the actual snapshot by interacting with the configured cloud provider (GCE in our case). If successful, the VolumeSnapshot resource is bound to a corresponding VolumeSnapshotData resource. The VolumeSnapshot needs to reference the PersistentVolumeClaim that is bound to the data we want to snapshot.

apiVersion: volumesnapshot.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: snapshot-demo
spec:
  persistentVolumeClaimName: gce-pvc
$ kubectl create -f snapshot.yaml 
volumesnapshot "snapshot-demo" created
$ kubectl get volumesnapshot
NAME            AGE
snapshot-demo   18s
$ kubectl describe volumesnapshot snapshot-demo
Name:         snapshot-demo
Namespace:    default
Labels:       SnapshotMetadata-PVName=pvc-048bd424-c155-11e7-8910-42010a840164
              SnapshotMetadata-Timestamp=1509796696232920051
Annotations:  <none>
API Version:  volumesnapshot.external-storage.k8s.io/v1
Kind:         VolumeSnapshot
Metadata:
  Cluster Name:
  Creation Timestamp:  2017-11-04T11:58:16Z
  Generation:          0
  Resource Version:    2348
  Self Link:           /apis/volumesnapshot.external-storage.k8s.io/v1/namespaces/default/volumesnapshots/snapshot-demo
  UID:                 71256cf8-c157-11e7-8910-42010a840164
Spec:
  Persistent Volume Claim Name:  gce-pvc
  Snapshot Data Name:            k8s-volume-snapshot-7193cceb-c157-11e7-8e59-0a580a200004
Status:
  Conditions:
    Last Transition Time:  2017-11-04T11:58:22Z
    Message:               Snapshot is uploading
    Reason:
    Status:                True
    Type:                  Pending
    Last Transition Time:  2017-11-04T11:58:34Z
    Message:               Snapshot created successfully and it is ready
    Reason:
    Status:                True
    Type:                  Ready
  Creation Timestamp:      <nil>
Events:                    <none>

Notice the Snapshot Data Name field. This is a reference to the VolumeSnapshotData resource that was created by snapshot-controller when we created our VolumeSnapshot. The conditions towards the bottom of the output above show that our snapshot was created successfully. We can check snapshot-controller’s logs to verify this.

$ kubectl logs snapshot-controller-66f7c56c4-ptjmb -n kube-system -c snapshot-controller
...
I1104 11:58:34.245845       1 snapshotter.go:239] waitForSnapshot: Snapshot default/snapshot-demo created successfully. Adding it to Actual State of World.
I1104 11:58:34.245853       1 actual_state_of_world.go:74] Adding new snapshot to actual state of world: default/snapshot-demo
I1104 11:58:34.245860       1 snapshotter.go:516] createSnapshot: Snapshot default/snapshot-demo created successfully.
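
If you are scripting, you could also wait on the Ready condition directly instead of reading logs. A sketch using kubectl's jsonpath support (the condition type follows the describe output above; this assumes a live cluster with the CRDs registered):

```shell
# Poll until the VolumeSnapshot reports a Ready condition with Status=True
# (sketch only; requires a running cluster and the snapshot-demo resource)
until kubectl get volumesnapshot snapshot-demo \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' | grep -q True; do
  sleep 2
done
```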

We can also view the snapshot in GCE.

[Image: the snapshot listed in the GCE console]

We can now look at the corresponding VolumeSnapshotData resource that was created.

$ kubectl get volumesnapshotdata
NAME                                                       AGE
k8s-volume-snapshot-7193cceb-c157-11e7-8e59-0a580a200004   3m
$ kubectl describe volumesnapshotdata k8s-volume-snapshot-7193cceb-c157-11e7-8e59-0a580a200004
Name:         k8s-volume-snapshot-7193cceb-c157-11e7-8e59-0a580a200004
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  volumesnapshot.external-storage.k8s.io/v1
Kind:         VolumeSnapshotData
Metadata:
  Cluster Name:
  Creation Timestamp:             2017-11-04T11:58:17Z
  Deletion Grace Period Seconds:  <nil>
  Deletion Timestamp:             <nil>
  Resource Version:               2320
  Self Link:                      /apis/volumesnapshot.external-storage.k8s.io/v1/k8s-volume-snapshot-7193cceb-c157-11e7-8e59-0a580a200004
  UID:                            71a28267-c157-11e7-8910-42010a840164
Spec:
  Gce Persistent Disk:
    Snapshot Id:  pvc-048bd424-c155-11e7-8910-42010a8401641509796696237472729
  Persistent Volume Ref:
    Kind:  PersistentVolume
    Name:  pvc-048bd424-c155-11e7-8910-42010a840164
  Volume Snapshot Ref:
    Kind:  VolumeSnapshot
    Name:  default/snapshot-demo
Status:
  Conditions:
    Last Transition Time:  <nil>
    Message:               Snapshot creation is triggered
    Reason:
    Status:                Unknown
    Type:                  Pending
  Creation Timestamp:      <nil>
Events:                    <none>

Notice the reference to the GCE PD snapshot. It also references the VolumeSnapshot resource we created above and the PersistentVolume that the snapshot was taken from; this is the PersistentVolume that was dynamically provisioned when we created our gce-pvc PersistentVolumeClaim earlier. One thing to point out here is that snapshot-controller does not pause applications that are writing to the volume before the snapshot is taken, so the data may be inconsistent unless you handle this manually. This will be less of a problem for some applications than for others.
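
If your application cannot easily be paused, one workaround on Linux is to freeze the filesystem for the duration of the snapshot. This is illustrative only: the stock busybox image used above does not ship fsfreeze (it comes from util-linux), and the container would need sufficient privileges to run it.

```shell
# Illustrative sketch: quiesce writes while the snapshot is taken.
# Assumes an image that ships util-linux's fsfreeze and a privileged container.
kubectl exec busybox -- fsfreeze --freeze /tmp
kubectl create -f snapshot.yaml
# ...wait for the VolumeSnapshot to become Ready...
kubectl exec busybox -- fsfreeze --unfreeze /tmp
```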

The following diagram shows how the various resources discussed above reference each other. We can see how a VolumeSnapshot binds to a VolumeSnapshotData resource; this is analogous to the relationship between PersistentVolumeClaims and PersistentVolumes. We can also see that a VolumeSnapshotData resource references the actual snapshot taken by the volume provider, just as a PersistentVolume references the physical volume backing it.

[Diagram: relationships between VolumeSnapshot, VolumeSnapshotData, PersistentVolumeClaim, PersistentVolume and the underlying snapshot]

Restore

Now that we have created a snapshot, we can restore it. To do this we need to create a special StorageClass implemented by snapshot-provisioner. We will then create a PersistentVolumeClaim referencing this StorageClass, with an annotation that tells snapshot-provisioner where to find the information it needs to negotiate with the cloud provider to restore the snapshot. The StorageClass can be defined as follows.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: snapshot-promoter
provisioner: volumesnapshot.external-storage.k8s.io/snapshot-promoter
parameters:
  type: pd-standard

Note the provisioner field, which declares that snapshot-provisioner is responsible for provisioning volumes for this StorageClass. We can now create a PersistentVolumeClaim that uses the StorageClass to dynamically provision a PersistentVolume containing the contents of our snapshot.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: busybox-snapshot
  annotations:
    snapshot.alpha.kubernetes.io/snapshot: snapshot-demo
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  storageClassName: snapshot-promoter

Note the snapshot.alpha.kubernetes.io/snapshot annotation, which refers to the VolumeSnapshot we want to use; snapshot-provisioner can read this resource to get all the information it needs to perform the restore. We have also specified snapshot-promoter as the storageClassName, which is what triggers snapshot-provisioner to act: it will provision a PersistentVolume containing the contents of the snapshot-demo snapshot. We can see from the STORAGECLASS column below that the snapshot-promoter StorageClass has been used.

$ kubectl get pvc
NAME               STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
...
busybox-snapshot   Bound     pvc-8eed96e4-c157-11e7-8910-42010a840164   3Gi        RWO            snapshot-promoter   11s
...
$ kubectl get pv pvc-8eed96e4-c157-11e7-8910-42010a840164
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                      STORAGECLASS        REASON    AGE
pvc-8eed96e4-c157-11e7-8910-42010a840164   3Gi        RWO            Delete           Bound     default/busybox-snapshot   snapshot-promoter             21s

Checking the snapshot-provisioner logs we can see that the snapshot was restored successfully.

$ kubectl logs snapshot-controller-66f7c56c4-ptjmb -n kube-system -c snapshot-provisioner
...
Provisioning disk pvc-8eed96e4-c157-11e7-8910-42010a840164 from snapshot pvc-048bd424-c155-11e7-8910-42010a8401641509796696237472729, zone europe-west1-b requestGB 3 tags map[source:Created from snapshot pvc-048bd424-c155-11e7-8910-42010a8401641509796696237472729 -dynamic-pvc-8eed96e4-c157-11e7-8910-42010a840164]
...
I1104 11:59:10.563990       1 controller.go:813] volume "pvc-8eed96e4-c157-11e7-8910-42010a840164" for claim "default/busybox-snapshot" created
I1104 11:59:10.987620       1 controller.go:830] volume "pvc-8eed96e4-c157-11e7-8910-42010a840164" for claim "default/busybox-snapshot" saved
I1104 11:59:10.987740       1 controller.go:866] volume "pvc-8eed96e4-c157-11e7-8910-42010a840164" provisioned for claim "default/busybox-snapshot"

Let’s finally mount the busybox-snapshot PersistentVolumeClaim into a Pod to see that the snapshot was restored properly.

apiVersion: v1
kind: Pod
metadata:
  name: busybox-snapshot
spec:
  restartPolicy: Never
  containers:
  - name: busybox
    image: busybox
    command:
    - "/bin/sh"
    - "-c"
    - "while true; do sleep 1; done"
    volumeMounts:
    - name: volume
      mountPath: /tmp
  volumes:
  - name: volume
    persistentVolumeClaim:
      claimName: busybox-snapshot

We can use cat to see the data that was written to the volume by the busybox Pod.

$ kubectl exec -it busybox-snapshot cat /tmp/pod-out.txt
Sat Nov  4 11:41:30 UTC 2017
Sat Nov  4 11:41:31 UTC 2017
Sat Nov  4 11:41:32 UTC 2017
Sat Nov  4 11:41:33 UTC 2017
Sat Nov  4 11:41:34 UTC 2017
Sat Nov  4 11:41:35 UTC 2017
...
Sat Nov  4 11:58:13 UTC 2017
Sat Nov  4 11:58:14 UTC 2017
Sat Nov  4 11:58:15 UTC 2017
$

Notice that since the data is coming from a snapshot, the final date does not change if we run cat repeatedly.

$ kubectl exec -it busybox-snapshot cat /tmp/pod-out.txt
...
Sat Nov  4 11:58:15 UTC 2017
$

Comparing the final date in the file with the creation time of the snapshot in GCE, we can see that taking the snapshot took about 2 seconds.

Clean Up

We can delete the VolumeSnapshot resource, which will also delete the corresponding VolumeSnapshotData resource and the snapshot in GCE. This will not affect any PersistentVolumeClaims or PersistentVolumes we have already provisioned using the snapshot. Conversely, deleting any PersistentVolumeClaims or PersistentVolumes that were used to take a snapshot, or were provisioned from one, will not delete the snapshot itself from GCE. However, deleting the PersistentVolumeClaim or PersistentVolume that was used to take a snapshot will prevent you from restoring any further snapshots from it using snapshot-provisioner.

$ kubectl delete volumesnapshot snapshot-demo
volumesnapshot "snapshot-demo" deleted

We should also delete the busybox Pods so they do not keep running forever.

$ kubectl delete pods busybox busybox-snapshot
pod "busybox" deleted
pod "busybox-snapshot" deleted

For good measure we will also clean up the PersistentVolumeClaims and the cluster itself.

$ kubectl delete pvc busybox-snapshot gce-pvc
persistentvolumeclaim "busybox-snapshot" deleted
persistentvolumeclaim "gce-pvc" deleted
$ yes | gcloud container clusters delete snapshot-demo --async      
The following clusters will be deleted.
 - [snapshot-demo] in [europe-west1-b]

Do you want to continue (Y/n)?  
$

As usual, any GCE PDs you provisioned will not be deleted by deleting the cluster, so make sure to clear those up too if you do not want to be charged.

Conclusion

Although this project is in its early stages, you can already see its potential from this simple example, and we will hopefully see support for other volume providers as it matures. Together with CronJobs, we now have the primitives we need within Kubernetes to perform automated backups of our data. To submit issues or contribute, the best place to start is the external-storage issues tab.
