Backup, restore and migrate Kubernetes resources including state to another AKS cluster with Velero

Velero is a backup and restore solution for Kubernetes that can be used to take backups and restore them in case of loss, and also to migrate to other clusters.

Velero (formerly known as Heptio Ark) is a backup and restore solution for Kubernetes that backs up not only cluster resources but also persistent volumes. It can be used to take backups and restore them in case of loss, and also to migrate to other clusters. Velero can run on cloud-provider-managed or on-premises clusters. Here is how to use Velero with the Azure Kubernetes Service (AKS).

Prepare Backup Resources

When backing up to Azure, Velero needs a Blob Container within an Azure Storage Account to store the backups. Assuming you want to create a new Storage Account for your backups in an existing Resource Group, you can do that with the following commands.

# Prepare variables
TENANT_ID=...
SUBSCRIPTION_ID=...
SOURCE_AKS_RESOURCE_GROUP=MC_...
TARGET_AKS_RESOURCE_GROUP=MC_... # (optional, only needed if you want to migrate)
BACKUP_RESOURCE_GROUP=backups
BACKUP_STORAGE_ACCOUNT_NAME=velero$(uuidgen | cut -d '-' -f5 | tr '[A-Z]' '[a-z]')

# Create Azure Storage Account
az storage account create \
  --name $BACKUP_STORAGE_ACCOUNT_NAME \
  --resource-group $BACKUP_RESOURCE_GROUP \
  --sku Standard_GRS \
  --encryption-services blob \
  --https-only true \
  --kind BlobStorage \
  --access-tier Hot
  
# Create Blob Container
az storage container create \
  --name velero \
  --public-access off \
  --account-name $BACKUP_STORAGE_ACCOUNT_NAME
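
Azure Storage Account names have to be 3 to 24 characters long and may only contain lowercase letters and digits, which is why the generated name above is lowercased. As a minimal sketch, a check like the following could be run before calling az. The function name is_valid_storage_account_name is made up for this example.

```shell
# Hypothetical helper: succeeds only if the name satisfies the Azure
# Storage Account naming rules (3-24 chars, lowercase letters and digits).
is_valid_storage_account_name() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{3,24}$'
}

# Example usage (not executed here):
#   is_valid_storage_account_name "$BACKUP_STORAGE_ACCOUNT_NAME" \
#     || echo "invalid storage account name" >&2
```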

We also need to give Velero access to the Resource Groups that contain our backups and the AKS resources we want to back up. AKS resources are located in a generated Resource Group, whose name usually starts with MC_. For this, we create a Service Principal and give it the proper access rights.

# Create a Service Principal for RBAC
AZURE_CLIENT_SECRET=`az ad sp create-for-rbac \
  --name "velero" \
  --role "Contributor" \
  --query 'password' \
  -o tsv \
  --scopes /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$BACKUP_RESOURCE_GROUP /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$SOURCE_AKS_RESOURCE_GROUP /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$TARGET_AKS_RESOURCE_GROUP`
  
AZURE_CLIENT_ID=`az ad sp list --display-name "velero" --query '[0].appId' -o tsv`

Install Velero

Velero consists of a server that runs on your cluster and a command-line client that runs locally. To get started with Velero, we need to install the command-line client locally first.
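
Since the following steps rely on several local tools, it can help to confirm they are on your PATH before continuing. This is only a sketch. The helper name require_cmd is an assumption and not part of Velero.

```shell
# Hypothetical helper: fails with a message if a command is not on PATH.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "required command not found: $1" >&2
    return 1
  fi
}

# Example usage (not executed here):
#   for cmd in az kubectl velero; do require_cmd "$cmd" || exit 1; done
```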

Once Velero is installed locally, we can start preparing the cluster server installation. For configuring Velero in the cluster, we need to create a credentials file with the Service Principal Client ID and Secret.

cat << EOF > ./credentials-velero
AZURE_SUBSCRIPTION_ID=${SUBSCRIPTION_ID}
AZURE_TENANT_ID=${TENANT_ID}
AZURE_CLIENT_ID=${AZURE_CLIENT_ID}
AZURE_CLIENT_SECRET=${AZURE_CLIENT_SECRET}
AZURE_RESOURCE_GROUP=${SOURCE_AKS_RESOURCE_GROUP}
AZURE_CLOUD_NAME=AzurePublicCloud
EOF
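
If one of the earlier az commands failed, the corresponding variable is empty and the file above ends up with a blank value. A small check, sketched here under the assumption that every key listed in the file must be non-empty, can catch that before the installation. The function name check_credentials_file is hypothetical.

```shell
# Hypothetical helper: verifies that every expected key is present in the
# credentials file and has a non-empty value.
check_credentials_file() {
  file=$1
  for key in AZURE_SUBSCRIPTION_ID AZURE_TENANT_ID AZURE_CLIENT_ID \
             AZURE_CLIENT_SECRET AZURE_RESOURCE_GROUP AZURE_CLOUD_NAME; do
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "missing or empty value: ${key}" >&2
      return 1
    fi
  done
}

# Example usage (not executed here):
#   check_credentials_file ./credentials-velero || exit 1
```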

Now we can kick off the installation of Velero in our cluster.

velero install \
  --provider azure \
  --plugins velero/velero-plugin-for-microsoft-azure:v1.2.0 \
  --bucket velero \
  --secret-file ./credentials-velero \
  --backup-location-config resourceGroup=$BACKUP_RESOURCE_GROUP,storageAccount=$BACKUP_STORAGE_ACCOUNT_NAME \
  --snapshot-location-config apiTimeout=5m,resourceGroup=$BACKUP_RESOURCE_GROUP,incremental=true \
  --wait

Create a backup

Now we are ready to create a backup of our cluster.

velero backup create firstbackup
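
The command returns right after the backup request has been submitted, so the backup may still be running. As a rough sketch, you could poll the backup phase until it finishes. The function names, the retry count and the delay are assumptions chosen for this example, and the phase is extracted from the JSON output with sed to avoid extra dependencies.

```shell
# Hypothetical helper: prints the phase of a backup, e.g. InProgress or Completed.
backup_phase() {
  velero backup get "$1" -o json |
    sed -n 's/.*"phase": *"\([A-Za-z]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: polls until the backup completes, fails or times out.
# Arguments: backup name, attempts (default 30), delay in seconds (default 10).
wait_for_backup() {
  name=$1
  attempts=${2:-30}
  delay=${3:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    phase=$(backup_phase "$name")
    case $phase in
      Completed)
        echo "backup $name completed"
        return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "backup $name ended in phase $phase" >&2
        return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for backup $name" >&2
  return 1
}
```

Alternatively, passing --wait to velero backup create blocks until the backup has finished, and velero backup describe firstbackup shows the details afterwards.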

Restore a backup

To restore the backup, we can point Velero to a backup and ask it to restore its state.

velero restore create firstrestore --from-backup firstbackup

Migrate to another cluster

Velero is often used to migrate Kubernetes resources between clusters. For that, Velero needs to be installed in both clusters and both Velero installations need to point to the same Azure Storage Account. When setting up Velero in a different cluster, make sure to edit the ./credentials-velero file first.

# Same file as before, only AZURE_RESOURCE_GROUP now points to the target cluster
cat << EOF > ./credentials-velero
AZURE_SUBSCRIPTION_ID=${SUBSCRIPTION_ID}
AZURE_TENANT_ID=${TENANT_ID}
AZURE_CLIENT_ID=${AZURE_CLIENT_ID}
AZURE_CLIENT_SECRET=${AZURE_CLIENT_SECRET}
AZURE_RESOURCE_GROUP=${TARGET_AKS_RESOURCE_GROUP}
AZURE_CLOUD_NAME=AzurePublicCloud
EOF

Now you can run the same velero install command from above against the second cluster. Once this is configured, backups are shown in both clusters and can be restored across them.
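
Velero synchronizes backups from the storage location periodically, so a backup taken in the source cluster may need a moment to show up in the target cluster. The following sketch switches to the target cluster, checks that the backup is visible and then restores it. The function name restore_into_cluster and the context name in the usage line are placeholders for this example.

```shell
# Hypothetical helper: restores a backup into the cluster behind a kubectl context.
# Arguments: kubectl context name, backup name.
restore_into_cluster() {
  context=$1
  backup=$2
  kubectl config use-context "$context" || return 1
  # The backup only appears once Velero has synced it from the storage account.
  if ! velero backup get "$backup" >/dev/null 2>&1; then
    echo "backup '$backup' is not visible in context '$context' yet" >&2
    return 1
  fi
  velero restore create --from-backup "$backup" --wait
}

# Example usage (not executed here):
#   restore_into_cluster my-target-aks firstbackup
```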


What's next?

Once you have set up Velero to back up and restore your clusters, you should consider running scheduled backups regularly. For example, the following command creates a daily backup automatically.

velero create schedule NAME --schedule="@every 24h"

You should also look into the filter options like --include-namespaces XYZ if you only want to create backups of specific namespaces or resource types.
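
As an illustration of the namespace filter, here is a small wrapper that backs up only the given namespaces. The function name backup_namespaces as well as the backup and namespace names in the usage line are made up for this sketch.

```shell
# Hypothetical helper: creates a backup limited to the listed namespaces.
# Arguments: backup name, followed by one or more namespaces.
backup_namespaces() {
  name=$1
  shift
  # Join the remaining arguments with commas, as --include-namespaces expects.
  namespaces=$(IFS=,; echo "$*")
  velero backup create "$name" --include-namespaces "$namespaces" --wait
}

# Example usage (not executed here):
#   backup_namespaces shop-backup frontend checkout
```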


Troubleshooting

If you get an error during the installation, you unfortunately cannot simply restart the installation. You have to delete the Backup Storage Location, the Volume Snapshot Location and the credentials secret first.

kubectl delete BackupStorageLocation default -n velero
kubectl delete VolumeSnapshotLocation default -n velero
kubectl delete secret cloud-credentials -n velero
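
Before deleting anything, it can be worth looking at the Velero server logs to find out what went wrong. This sketch assumes the default deployment name velero and the default text log format, in which error entries contain level=error. The function name velero_errors is hypothetical.

```shell
# Hypothetical helper: prints only the error entries of the Velero server log.
# Argument: namespace of the Velero installation (default: velero).
velero_errors() {
  kubectl logs deployment/velero -n "${1:-velero}" | grep 'level=error'
}

# Example usage (not executed here):
#   velero_errors
```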
