Backup and Restore TKG Clusters using Velero Standalone

In my last post, I discussed how Tanzu Mission Control Data Protection can be used to back up and restore stateful Kubernetes applications. In this tutorial, I’ll show you how to back up and restore applications using Velero standalone.

Why Use Velero Standalone?

You might wonder why you need Velero standalone when TMC Data Protection exists to make backing up and restoring Kubernetes applications easier. The answer is pretty simple: TMC currently does not have the ability to restore data across clusters, so backup and restore are limited to a single cluster. This is not an ideal solution from a business continuity point of view.

Velero standalone lets you work around this limitation. In this approach, Velero is installed separately in both the source and target clusters, and both clusters have access to the S3 bucket where backups are kept. If your source cluster is completely lost in a disaster, you can redeploy the applications by downloading the backup from S3 and restoring it in the target cluster.

When Can You Use Velero?

Velero is more than just a backup solution. It can be used in the following scenarios:

  1. Back up your cluster and restore it in case of loss.
  2. Recover from disaster.
  3. Copy cluster resources to other clusters.
  4. Replicate your production environment to create development and testing environments.
  5. Take a snapshot of your application’s state before upgrading a cluster.

In this tutorial, I will be demonstrating the first use case.

For the purpose of demonstration, I am using the same Acme Fitness app, which I used in my last demo. Please see my previous post for instructions on installing and configuring the app. This application is running in a Tanzu Kubernetes cluster provisioned via Tanzu Mission Control. 

For storing backups, I am using an S3 bucket (acme-backup) provisioned in MinIO. Instructions for configuring MinIO are documented here.

Install Velero CLI

The Velero CLI can be installed on a Linux jumpbox from which you can access your Tanzu Kubernetes clusters. Before installing the Velero CLI, please check the supported Velero and Kubernetes versions in the interop matrix published here.

The instructions for installing the Velero CLI are listed below.
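The exact download source depends on your environment (for TKG, VMware distributes a signed Velero binary), but a minimal sketch using the upstream GitHub release looks like this; the version shown is an example and should match your interop matrix:

  # Download the Velero CLI tarball (version is an example)
  wget https://github.com/vmware-tanzu/velero/releases/download/v1.6.2/velero-v1.6.2-linux-amd64.tar.gz

  # Extract the archive and place the binary on your PATH
  tar -xvf velero-v1.6.2-linux-amd64.tar.gz
  sudo mv velero-v1.6.2-linux-amd64/velero /usr/local/bin/velero

  # Confirm the CLI works
  velero version --client-only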

Create MinIO Credentials Store

To integrate MinIO with Velero, you need to provide MinIO credentials that Velero can use to interact with the S3 bucket. Create a file to store your MinIO credentials.
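A minimal example, assuming the file is named credentials-minio (the file name is arbitrary; the values are placeholders for your own MinIO access and secret keys):

  [default]
  aws_access_key_id = <your-minio-access-key>
  aws_secret_access_key = <your-minio-secret-key>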

Install Velero in Source Cluster

Log in to the source Tanzu Kubernetes cluster and switch the context to the cluster where you want to enable Velero protection. An example is shown below.
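A minimal sketch, assuming a kubeconfig context named tkg-source-cluster (substitute your own context name):

  # List the available contexts
  kubectl config get-contexts

  # Switch to the source cluster's context
  kubectl config use-context tkg-source-cluster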

Run the following command to enable Velero. In the example below, replace the IP 172.19.10.3 and port 9000 with the values configured in your environment.
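A sketch of the install command, assuming the acme-backup bucket and the credentials-minio file created earlier; the AWS plugin image tag is an example and should match your Velero release:

  velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.2.0 \
    --bucket acme-backup \
    --secret-file ./credentials-minio \
    --use-restic \
    --use-volume-snapshots=false \
    --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://172.19.10.3:9000

Here --use-restic deploys the Restic daemonset used for pod volume backups, and --use-volume-snapshots=false disables volume snapshotting since MinIO provides object storage only.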

Note: For a full list of configurable values for Velero, run the command: velero install --help

The following components will be installed in the velero namespace once you run the command above.

Verify that all components in the velero namespace are in a running/ready state.
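For example:

  kubectl get pods -n velero

With Restic enabled, expect the velero deployment pod plus one restic pod per worker node.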

Note: If the pods are stuck in the ImagePullBackOff state, then follow the steps from the troubleshooting section to fix the problem.

You should be able to see the configured backup location now.
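For example:

  velero backup-location get

The output should list the acme-backup bucket configured during the install.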

Install Velero in the Target Cluster

Installing Velero in the target cluster follows the same process as in the source cluster. Connect to the target cluster and change to the proper context before running the velero install command.

Verify that Velero is installed in the target cluster and the pods are in a running state.

Verify that the target cluster is also able to see the configured backup location.
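The same checks apply here:

  kubectl get pods -n velero
  velero backup-location get

Because both clusters point at the same S3 bucket, Velero on the target cluster periodically syncs backup metadata from that bucket.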

Test Backup and Restore

Now that you’ve installed Velero in both the source and target clusters, you can test whether a backup from the source cluster can be restored in the target cluster.

Step 1: Connect to the source cluster and perform the backup. For this demonstration, I am performing the backup of a namespace called acme.
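A sketch of the backup command, using the names from this demo:

  velero backup create acme-backup --include-namespaces acme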

Running the command velero backup describe acme-backup fetches additional details about the backup.

Step 2: Connect to the target cluster and verify that you are able to see the backup.
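For example:

  velero backup get

The backup taken on the source cluster shows up here because Velero syncs backup metadata from the shared S3 bucket.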

Step 3: Perform the restore.

Run the following command to initiate the restore.
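A sketch, assuming a restore named acme-restore (the name is arbitrary):

  velero restore create acme-restore --from-backup acme-backup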

Running the command velero restore describe <restore-name> fetches additional details about the restore operation.

After triggering the restore command, verify that the backed-up namespace appears in the target cluster.

Also verify that all items in the namespace are in a running state.
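For example:

  kubectl get ns acme
  kubectl get all -n acme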

And that’s it for the backup and restore demo using Velero standalone.

Troubleshooting Tips

I had a problem where the Velero and Restic pods would not initialize and would remain in the ImagePullBackOff state.

On checking the events of the stuck pods, I found I was hitting the Docker Hub rate limit.
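The pod events can be inspected with a command like the following (the pod name is a placeholder):

  kubectl describe pod <stuck-pod-name> -n velero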

To fix the problem, edit the velero deployment and the restic daemonset as shown below.
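For example:

  kubectl edit deployment/velero -n velero
  kubectl edit daemonset/restic -n velero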

Search for the image field and replace velero/velero:v1.6.2 with projects.registry.vmware.com/velero/velero:v1.6.2.

Velero Cleanup

To clean up the Velero installation, use the following commands.
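A commonly documented cleanup sequence, which removes the velero namespace, its cluster role binding, and the Velero CRDs:

  kubectl delete namespace/velero clusterrolebinding/velero
  kubectl delete crds -l component=velero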

And that’s it for this post. I hope you enjoyed reading it. Feel free to share it on social media if you found it useful.
