As soon as you have set up your Kubernetes cluster and deployed a few containers, the question arises how to create backups. A very useful and convenient tool for this is Velero. It can easily be installed via Helm, and backups and restores are each just a single command. It is important to note that Velero creates snapshots of the PVs (persistent volumes) by default, and not all storage classes support this. For example, the nfs-subdir-external-provisioner or the K3s local-path storage class do not support snapshots. I will be using K3s and the local-path storage class in this tutorial. For these cases Velero offers file system backup of the PVs via Restic or Kopia. For more information read here:

Velero Docs - File System Backup

You just have to activate it during setup. But more about that later. You can schedule backups, set retention periods and much more. All backups are stored in an S3 bucket. So you can either rent cloud storage or run a small MinIO server locally.
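By the way, if you are not sure which storage class your cluster uses, and therefore whether volume snapshots are even an option, a quick check is:

kubectl get storageclass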

Please note that I'm running this on my local network, but you can of course run this on your VPS or cloud instance. In that case make sure to use HTTPS with a trusted certificate, very strong passwords and additional mechanisms to ensure the safety and integrity of your backups! Let's start with the S3 storage setup.

Setup and installation of S3 Backup storage

As long as your storage is S3 compatible, you can link it to Velero. We have various options here. One of the smallest and simplest would be to run MinIO on some VM. I would not recommend this for production, but depending on the size of your containers it might be just fine. In my case I have a small TrueNAS Scale server running. You can check out how I installed it in this blog post. On TrueNAS Scale we can easily activate S3 storage, which under the hood simply starts a MinIO server. To set this up I just followed the official TrueNAS Scale S3 guide. We will of course need a dataset first, so we log in to TrueNAS Scale and create a clean new dataset. For this we go to Storage, select the three dots next to our zpool and click on "Add Dataset":

I call mine "minio" and accept all the defaults. Once we have the dataset, we need to activate the S3 service. For this we go to System Settings -> Services. Here we select the pen icon on the right side next to S3:

Here we need to select the dataset that we just created, confirm the warning message and enter an access key and a secret key. Note both down, as we will need them later to log in to the MinIO console and also to configure Velero. To enable HTTPS, which I highly recommend, select a certificate from the dropdown (the default one or a valid certificate that you imported earlier) and enter the IP of your TrueNAS server in the URI field. For this tutorial we will not use HTTPS. Finally we click on Save and activate the S3 service. It is also a good idea to tick the "Start Automatically" box.

Now we create a bucket called "velero". For this we browse to the MinIO Console on our TrueNAS server. MinIO runs on port 9000 / 9001. In my case that is:

http://192.168.2.2:9000

We log in using the access and secret key that we just configured in the TrueNAS WebUI.

In the MinIO Console we click on Buckets and Create Bucket

Let's see if we can connect to the S3 bucket. For this we will use awscli

sudo yum -y install awscli

To configure awscli we run

aws configure --profile=truenas

A wizard will now ask for the access and secret key. We accept the defaults for region etc. by just pressing ENTER:

[vagrant@rocky8-k3s ~]$ aws configure --profile=truenas
AWS Access Key ID [None]: B9A6AC13381DAC41BAD
AWS Secret Access Key [None]: 42D75A75B3
Default region name [None]:
Default output format [None]:

Now we can list the contents of the S3 bucket by running

aws --profile=truenas --endpoint=http://192.168.2.2:9000 s3 ls s3://velero

Don't be surprised. It is empty. :)
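By the way, the bucket could also have been created from the command line instead of the MinIO console; awscli's s3 mb command does the job:

aws --profile=truenas --endpoint=http://192.168.2.2:9000 s3 mb s3://velero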

Prerequisites for Velero

Next we need to create an S3 credentials file .credentials-velero with the following content:

cat > .credentials-velero << EOF
[default]
aws_access_key_id = B9A6AC13381DAC41BAD
aws_secret_access_key = <your secret key>
EOF

Velero will use this to authenticate against the S3 bucket. In order to interact with Velero we need to download the Velero binary. It is available for multiple platforms in the form of a tar file.

Latest release can be found here: https://github.com/vmware-tanzu/velero/releases/latest

At the time of writing this is v1.9.5.

wget https://github.com/vmware-tanzu/velero/releases/download/v1.9.5/velero-v1.9.5-linux-amd64.tar.gz

Extract it and move the "velero" binary to the /usr/local/bin folder

tar xvzf velero-v1.9.5-linux-amd64.tar.gz
sudo mv velero-v1.9.5-linux-amd64/velero /usr/local/bin/
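A quick check that the binary is in the path and working:

velero version --client-only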

Velero expects a kubeconfig file with cluster-admin privileges, so it is best to do this on a machine where kubectl is already installed and configured. If not, please install and configure kubectl first. A very simple and convenient way is to use Arkade. In the case of K3s we need to set the path to the kubeconfig file:

export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
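If you don't want to set this in every new shell, you can optionally append it to your shell profile:

echo 'export KUBECONFIG=/etc/rancher/k3s/k3s.yaml' >> ~/.bashrc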

Install Velero

Now we are ready to install Velero. For this we add the Velero Helm repo and install the chart:

helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
helm repo update
helm install velero vmware-tanzu/velero \
--namespace velero \
--create-namespace \
--set-file credentials.secretContents.cloud=./.credentials-velero \
--set configuration.provider=aws \
--set configuration.backupStorageLocation.name=default \
--set configuration.backupStorageLocation.bucket=velero \
--set configuration.backupStorageLocation.config.region=None \
--set configuration.backupStorageLocation.config.s3ForcePathStyle=true \
--set configuration.backupStorageLocation.config.s3Url=http://192.168.2.2:9000 \
--set configuration.defaultVolumesToFsBackup=true \
--set snapshotsEnabled=false \
--set deployNodeAgent=true \
--set initContainers[0].name=velero-plugin-for-aws \
--set initContainers[0].image=velero/velero-plugin-for-aws:latest \
--set initContainers[0].imagePullPolicy=IfNotPresent \
--set initContainers[0].volumeMounts[0].mountPath=/target \
--set initContainers[0].volumeMounts[0].name=plugins
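If you find the long list of --set flags hard to maintain, the same configuration could also be expressed as a values file. The following is just a sketch that mirrors the flags above; the exact structure depends on the chart version, so compare it with the chart's own values.yaml before using it:

cat > velero-values.yaml << EOF
configuration:
  provider: aws
  backupStorageLocation:
    name: default
    bucket: velero
    config:
      region: None
      s3ForcePathStyle: "true"
      s3Url: http://192.168.2.2:9000
  defaultVolumesToFsBackup: true
snapshotsEnabled: false
deployNodeAgent: true
initContainers:
  - name: velero-plugin-for-aws
    image: velero/velero-plugin-for-aws:latest
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - mountPath: /target
        name: plugins
EOF

helm install velero vmware-tanzu/velero \
--namespace velero \
--create-namespace \
--set-file credentials.secretContents.cloud=./.credentials-velero \
-f velero-values.yaml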

Now monitor the pod deployment

[vagrant@rocky8-k3s ~]$ kubectl get pods -n velero
NAME                      READY   STATUS    RESTARTS   AGE
node-agent-tvssj          1/1     Running   0          50s
velero-576fb78ffb-747p9   1/1     Running   0          50s

If for whatever reason the pods are not starting or working as expected, check the logs

kubectl logs deployment/velero -n velero
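Another quick sanity check is to confirm that Velero can actually reach the S3 bucket; the default backup storage location should be reported as Available:

velero backup-location get

If it shows up as Unavailable, re-check the s3Url, the credentials and the bucket name.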

If everything is up and running, verify that your client and server versions match

[vagrant@rocky8-k3s ~]$ velero version
Client:
        Version: v1.9.5
        Git commit: 2b5281f38aad2527f95b55644b20fb169a6702a7
Server:
        Version: v1.10.0
# WARNING: the client version does not match the server version. Please update client

In my case the client version didn't match. For some reason the GitHub "latest" link was not pointing to the latest release, so I quickly installed v1.10.0

wget https://github.com/vmware-tanzu/velero/releases/download/v1.10.0/velero-v1.10.0-linux-amd64.tar.gz
tar xvzf velero-v1.10.0-linux-amd64.tar.gz
sudo mv velero-v1.10.0-linux-amd64/velero /usr/local/bin/

And now it is all fine

[vagrant@rocky8-k3s ~]$ velero version
Client:
        Version: v1.10.0
        Git commit: 367f563072659f0bcd809bc33507fd75cd722344
Server:
        Version: v1.10.0

To see the Velero configuration we can simply use Helm

helm get values -n velero velero

To see what was actually deployed we can run

kubectl get all -n velero

Example

[vagrant@rocky8-k3s ~]$ kubectl get all -n velero
NAME                          READY   STATUS    RESTARTS   AGE
pod/node-agent-tvssj          1/1     Running   0          12m
pod/velero-576fb78ffb-747p9   1/1     Running   0          12m

NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/velero   ClusterIP   10.43.179.220   <none>        8085/TCP   12m

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/node-agent   1         1         1       1            1           <none>          12m

NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/velero   1/1     1            1           12m

NAME                                DESIRED   CURRENT   READY   AGE
replicaset.apps/velero-576fb78ffb   1         1         1       12m

Now, with Velero completely installed, we can run our first backup job.

Run a backup job

We can either run a backup job to back up all namespaces

velero backup create all --wait

or back up an individual namespace

velero backup create ghost --include-namespaces ghost --wait

By pressing CTRL-C or by omitting the "--wait" flag altogether, the job will run in the background.

[vagrant@rocky8-k3s ~]$ velero backup create ghost --include-namespaces ghost --wait
Backup request "ghost" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
..
Backup completed with status: Completed. You may check for more information using the commands `velero backup describe ghost` and `velero backup logs ghost`.
[vagrant@rocky8-k3s ~]$ velero backup create all --wait
Backup request "all" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
........
Backup completed with status: Completed. You may check for more information using the commands `velero backup describe all` and `velero backup logs all`.

Once the jobs are completed, we can run additional commands to get more details.

velero backup describe all

To get even more details

velero backup describe all --details

This can be very helpful in case a job fails or shows a warning. In my case both jobs were successful.

Make sure to check the output for lines like:

v1/PersistentVolume:
    - pvc-7d4d2a56-e6b2-4fb4-a373-8116de67cae3
    - pvc-95eb084e-6ddf-43d5-a22b-4ade78e0fd88
    - pvc-c9e5facc-d414-4264-9d9c-7834a9446323
  v1/PersistentVolumeClaim:
    - ghost/data-roksblog-mysql-0
    - ghost/roksblog-ghost
    - plausible/data-plausible-analytics-postgresql-0

This tells us that the persistent volumes and claims have been successfully backed up.
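Because we are using file system backup instead of snapshots, Velero also creates a PodVolumeBackup resource for every volume it backs up via the node agent. Listing them is another way to verify that the volume data actually made it into the bucket:

kubectl get podvolumebackups -n velero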

To list all backups run

velero backup get

Example

[vagrant@rocky8-k3s ~]$ velero backup get
NAME    STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
all     Completed   0        0          2022-12-27 22:17:56 +0000 UTC   29d       default            <none>
ghost   Completed   0        0          2022-12-27 22:17:44 +0000 UTC   29d       default            <none>

By default a retention policy of 30 days is applied to each backup. This means that after 30 days the backup will be automatically deleted! You can adjust this by adding the parameter "--ttl" to the backup command. The following would store the backup for one week:

velero backup create ghost-ttl7d --include-namespaces ghost --wait --ttl 168h0m0s

If we now list the contents of our S3 bucket, we will see this:

[vagrant@rocky8-k3s ~]$ aws --profile=truenas --endpoint=http://192.168.2.2:9000 s3 ls s3://velero
                           PRE backups/

And in the MinIO console

Now that we have successfully taken a backup, let's destroy our Ghost deployment and restore it.

Restore a backup

First we uninstall our Ghost Helm chart

[vagrant@rocky8-k3s ~]$ helm list -n ghost
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /etc/rancher/k3s/k3s.yaml
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /etc/rancher/k3s/k3s.yaml
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART           APP VERSION
roksblog        ghost           8               2022-12-27 09:57:45.120724689 +0000 UTC deployed        ghost-19.1.52   5.26.3
[vagrant@rocky8-k3s ~]$ helm uninstall roksblog -n ghost
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /etc/rancher/k3s/k3s.yaml
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /etc/rancher/k3s/k3s.yaml
release "roksblog" uninstalled

We will also delete the namespace

[vagrant@rocky8-k3s ~]$ kubectl delete ns ghost
namespace "ghost" deleted

Now we start the restore

velero restore create --from-backup ghost

To check the status of the restore we run

velero restore describe ghost-20221227223404

Example

[vagrant@rocky8-k3s ~]$ velero restore describe ghost-20221227223404
Name:         ghost-20221227223404
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:                       Completed
Total items to be restored:  43
Items restored:              43

Started:    2022-12-27 22:34:04 +0000 UTC
Completed:  2022-12-27 22:34:05 +0000 UTC

Backup:  ghost

Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io, csinodes.storage.k8s.io, volumeattachments.storage.k8s.io, backuprepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:  <none>

Label selector:  <none>

Restore PVs:  auto

Existing Resource Policy:   <none>

Preserve Service NodePorts:  auto

As we can see it is restoring the pods

[vagrant@rocky8-k3s ~]$ kubectl get pods -n ghost
NAME                              READY   STATUS    RESTARTS   AGE
roksblog-ghost-686bc9d555-9zb29   0/1     Running   0          11s
roksblog-mysql-0                  0/1     Running   0          11s

Helm is also listing our chart as installed again

[vagrant@rocky8-k3s ~]$ helm list -n ghost
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /etc/rancher/k3s/k3s.yaml
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /etc/rancher/k3s/k3s.yaml
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART           APP VERSION
roksblog        ghost           1               2022-12-27 22:55:02.709848568 +0000 UTC deployed        ghost-19.1.52   5.26.3

And we can browse it just fine
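To double-check that the persistent volumes came back as well, we can list the PVCs in the namespace:

kubectl get pvc -n ghost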

Schedule Backup jobs

We can easily create multiple cron style backup schedules by running:

# Daily Backups. Run the daily backup every day at 09:30. Please do note that the system runs on UTC time!!!
velero schedule create daily --schedule="30 09 * * *"
 
# Weekly Backups. Run the following backup job every Sunday at 10:30. Please do note that the system runs on UTC time!!!
velero schedule create weekly --schedule="30 10 * * 0" --include-cluster-resources=true

# Monthly Backups. Run the following backup job every 1st of the month at 22:30. Please do note that the system runs on UTC time!!! Keep the backups for 336 days (8064h).
velero schedule create monthly --schedule="30 22 1 * *" --include-cluster-resources=true --ttl 8064h0m0s

I recommend taking daily, weekly and monthly backups with individual retention policies. It is also advisable to back up all cluster resources at least once a week for DR purposes. If you need some help with the cron schedules, check out crontab.guru.
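To give each schedule its own retention, just add the --ttl parameter shown earlier. For example, instead of the plain daily schedule above, you could keep the daily backups for only three days (the 72h value is just an example, adjust it to your needs):

velero schedule create daily --schedule="30 09 * * *" --ttl 72h0m0s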

And of course we can also look up our schedules:

velero get schedule

Example

[vagrant@rocky8-k3s ~]$ velero get schedule
NAME      STATUS    CREATED                         SCHEDULE      BACKUP TTL   LAST BACKUP   SELECTOR   PAUSED
daily     Enabled   2022-12-28 09:05:40 +0000 UTC   30 09 * * *   0s           n/a           <none>     false
weekly    Enabled   2022-12-28 09:05:40 +0000 UTC   30 10 * * 0   0s           n/a           <none>     false
monthly   Enabled   2022-12-28 09:22:02 +0000 UTC   30 22 1 * *   8064h0m0s    n/a           <none>     false

This shows that Velero is very simple to install, configure and use. And of course you can also use it for disaster recovery, which I will test in one of my next blog posts, so stay tuned. Your K8s cluster is broken? No problem. Spin up a new cluster, follow the very same steps to install Velero, point it to your S3 storage and restore your backups.
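As a rough sketch, the DR flow on a fresh cluster boils down to the very same commands we already used: install Velero with the exact Helm command from above, pointing at the same bucket, and then

velero backup get
velero restore create --from-backup all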