Backup and restore OVHcloud Managed Kubernetes cluster, namespace, and applications using TrilioVault for Kubernetes

Learn how to deploy TrilioVault for Kubernetes (or TVK) to your OVHcloud Managed Kubernetes Cluster, create backups, and recover from a backup if something goes wrong.

Introduction

You can back up your entire cluster by including multiple namespaces, or optionally choose a single namespace, label-based backups, Helm Release-based backups, or Operator-based backups.

Advantages of using Trilio:

Take full (or incremental) backups of all your namespaces or selective applications, and restore them in case of data loss.
Migrate from one cluster to another.
Helm Release backups are supported.
Backups of Operator-based application deployment are also supported.
Run pre- and post-hooks for backup and restore operations.
Web management console that allows you to inspect your backup/restore operations' states in detail (and many other features).
Define retention policies for your backups.
Application lifecycle (meaning TVK itself) can be managed via a dedicated TrilioVault Operator.
Velero integration (Trilio supports monitoring Velero backups, restores, and backup/snapshot locations via its web management console).

How TrilioVault for Kubernetes works

TVK follows a cloud-native architecture, meaning that it has several components that, put together, form the Control Plane and Data Plane layers. Everything is managed via CRDs, making it fully Kubernetes-native. What is nice about Trilio is the clear separation of concerns and how effectively it handles backup and restore operations.

Each TrilioVault application consists of a set of “Controllers” and the associated CRDs. Every time a CRD is created or updated, the responsible controller is notified and performs cluster reconciliation. Then, the controller in charge spawns Kubernetes jobs that perform the real operation (like backup, restore, etc.) in parallel.

Control Plane consists of:

Target Controller: defines the storage backend (S3™, NFS, etc.) via specific CRDs
BackupPlan Controller: defines the components to back up, automated backups' schedule, retention strategy, etc., via specific CRDs
Restore Controller: defines restore operations via specific CRDs

Data Plane consists of:

Datamover Pods: responsible for transferring data between persistent volumes and backup media (or Target). TrilioVault works with Persistent Volumes (PVs) using the CSI interface. For each PV that needs to be backed up, an ephemeral Datamover Pod is created. After each operation finishes, the associated pod is destroyed.
Metamover Pods: responsible for transferring Kubernetes API objects' data to backup media (or Target). Metamover pods are ephemeral, just like the Datamover ones.

Understanding TrilioVault application scope

TrilioVault for Kubernetes works based on scope, meaning you can have a Namespaced or a Cluster type of installation.

A Namespaced installation allows you to back up and restore at the namespace level only. In other words, the backup is meant to protect a set of applications that are bound to a namespace that you own. This is how a “BackupPlan” and the corresponding Backup CRD work. You cannot mutate those CRDs in other namespaces - they must be created in the same namespace where the application to be backed up is located.

On the other hand, a Cluster type installation is not scoped or bound to any namespace or a set of applications. You define cluster type backups via the Cluster prefixed CRDs, like ClusterBackupPlan, ClusterBackup, etc. Cluster type backups are a little bit more flexible in the sense that you are not tied to a specific namespace or set of applications to back up and restore. You can perform backup/restore operations for multiple namespaces and applications at once, including PVs (you can also back up etcd database content).

To make sure that the TVK application scope and rules are followed correctly, TrilioVault uses an Admission Controller. It intercepts and validates each CRD that you want to push for TVK, before it is actually created. In case the TVK application scope is not followed, the admission controller will reject CRD creation in the cluster.

Another important thing to consider is that a TVK License is application scope-specific. In other words, you need to generate one type of license for either a Namespaced or a Cluster type installation.

Namespaced vs Cluster TVK application scope - when to use one or the other?

Which one you should choose all depends on the use case. For example, a Namespaced scope is a more appropriate option when you don’t have access to the whole Kubernetes cluster - only to specific namespaces and applications.

In most cases, you want to protect only the applications tied to a specific namespace that you own.

On the other hand, a cluster scoped installation type works at the global level, meaning it can trigger backup/restore operations for any namespace or resource from a Kubernetes cluster (including PVs and the etcd database).

To summarize:

If you are a cluster administrator, then you will most likely want to perform cluster level operations via corresponding CRDs, like ClusterBackupPlan, ClusterBackup, ClusterRestore, etc.
If you are a regular user, then you will usually perform namespaced-only operations (application centric) via corresponding CRDs, like: BackupPlan, Backup, Restore, etc.

The application interface is very similar or uniform when comparing the two types: Cluster vs non-Cluster prefixed CRDs. So, if you’re familiar with one type, it’s pretty straightforward to use the counterpart.

For more information, please refer to the TVK CRDs official documentation.

Back up and restore workflow

Whenever you want to back up an application, you start by creating a BackupPlan (or ClusterBackupPlan) CRD, followed by a Backup (or ClusterBackup) object. Trilio Backup Controller is notified of the change and performs backup object inspection and validation (i.e., whether it is cluster backup, namespace backup, etc.). Then, it spawns worker pods (Metamover, Datamover) responsible for moving the actual data (Kubernetes metadata, PVs data) to the backend storage (or Target), such as OVHcloud Object Storage.

Similarly, whenever you create a Restore object, the “Restore Controller” is notified to restore from a Backup object. Then, Trilio Restore Controller spawns worker pods (Metamover, Datamover), responsible for moving backup data out of the OVHcloud Object Storage (Kubernetes metadata, PVs data). Finally, the restore process is initiated from the backup object.

Trilio is ideal for the disaster recovery use case, as well as for “snapshotting” your application state before performing system operations on your cluster, such as upgrades. For more details on this topic, please visit the Trilio Features and Trilio Use Case official pages.

After finishing this tutorial, you should be able to:

Configure OVHcloud Object Storage backend for Trilio to use.
Back up and restore your applications.
Back up and restore your entire OVHcloud Managed Kubernetes Cluster.
Create scheduled backups for your applications.
Create retention policies for your backups.

Requirements

To complete this tutorial, you need the following:

An OVHcloud Object Storage Container/Bucket and an Object Storage User with permission to access the Object Storage Container.
A Git client to clone the OVHcloud Docs repository.
Helm for managing TrilioVault Operator releases and upgrades.
Kubectl for Kubernetes interaction.
krew for installation of the preflight checks plugin.

Important Information: In order for TrilioVault to work correctly and to back up your PVCs, the OVHcloud Managed Kubernetes Cluster needs to be configured to support the Container Storage Interface (or CSI for short), and volumesnapshot Custom Resource Definitions should be deployed.

kubectl get crd | grep volumesnapshot

The output should look similar to this:

volumesnapshotclasses.snapshot.storage.k8s.io    2022-01-20T07:58:05Z
volumesnapshotcontents.snapshot.storage.k8s.io   2022-01-20T07:58:05Z
volumesnapshots.snapshot.storage.k8s.io          2022-01-20T07:58:06Z
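
If you prefer a scripted check over reading the list by eye, the sketch below reads the `kubectl get crd` output and reports any of the three VolumeSnapshot CRDs that is missing. The function name `check_snapshot_crds` is hypothetical and only used for illustration.

```shell
# Reads `kubectl get crd` output on stdin and reports any missing VolumeSnapshot CRD.
# Returns 0 only when all three CRDs are listed.
check_snapshot_crds() {
  crd_listing=$(cat)
  crd_missing=0
  for crd in volumesnapshotclasses volumesnapshotcontents volumesnapshots; do
    if ! printf '%s\n' "$crd_listing" | grep -q "^${crd}\.snapshot\.storage\.k8s\.io"; then
      echo "missing CRD: ${crd}.snapshot.storage.k8s.io"
      crd_missing=1
    fi
  done
  return "$crd_missing"
}

# Example (requires cluster access):
#   kubectl get crd | check_snapshot_crds && echo "snapshot CRDs present"
```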

Also, make sure that the CRD supports both the v1beta1 and the v1 API versions. You can run the command below to check the API version:

kubectl get crd volumesnapshots.snapshot.storage.k8s.io -o yaml

At the end of the CRD yaml, you should obtain an output similar to the one below, showing storedVersions as v1beta1 and v1:

...
  - lastTransitionTime: "2022-01-20T07:58:06Z"
    message: approved in https://github.com/kubernetes-csi/external-snapshotter/pull/419
    reason: ApprovedAnnotation
    status: "True"
    type: KubernetesAPIApprovalPolicyConformant
  storedVersions:
  - v1beta1
  - v1
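
Instead of scrolling to the end of the YAML, you can extract only the stored versions with a JSONPath query. The following is a minimal sketch; the helper name `has_required_snapshot_versions` is an assumption made for this example, and it expects the versions as a space-separated string.

```shell
# Succeeds when the space-separated version list passed as $1
# contains both v1beta1 and v1.
has_required_snapshot_versions() {
  case " $1 " in
    *" v1beta1 "*) ;;
    *) return 1 ;;
  esac
  case " $1 " in
    *" v1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (requires cluster access):
#   versions=$(kubectl get crd volumesnapshots.snapshot.storage.k8s.io \
#     -o jsonpath='{.status.storedVersions[*]}')
#   has_required_snapshot_versions "$versions" && echo "v1beta1 and v1 are stored"
```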

The user can then install the Hostpath CSI driver and create a storage class and a volume snapshot class. You can check the existing storage class using the following command:

kubectl get storageclass

The output should look similar to the output below (notice the provisioner is hostpath.csi.k8s.io if you have installed hostpath CSI driver):

NAME                        PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
csi-cinder-classic          cinder.csi.openstack.org   Delete          Immediate           true                   3d
csi-cinder-high-speed       cinder.csi.openstack.org   Delete          Immediate           true                   3d
csi-cinder-high-speed-gen2  cinder.csi.openstack.org   Delete          Immediate           true                   3d
csi-hostpath-sc (default)   hostpath.csi.k8s.io        Retain          Immediate           false                  2d
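
To confirm that a storage class exists for a given provisioner, you can filter the table above. This is only a sketch: the helper name `classes_for_provisioner` is hypothetical, and it assumes the default column layout of `kubectl get storageclass`, where the default class carries a `(default)` marker after its name.

```shell
# Reads `kubectl get storageclass` output on stdin and prints the names of
# the storage classes that use the provisioner given as $1.
classes_for_provisioner() {
  awk -v want="$1" 'NR > 1 {
    # The "(default)" marker shifts the provisioner to the third field.
    prov = ($2 == "(default)") ? $3 : $2
    if (prov == want) print $1
  }'
}

# Example (requires cluster access):
#   kubectl get storageclass | classes_for_provisioner hostpath.csi.k8s.io
```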

Run a preflight check to make sure all the prerequisites for TVK are fulfilled before you proceed with the installation. Follow the TVK Preflight Checks page to install and run preflight through the krew plugin.

Instructions

Step 1 - Installing TrilioVault for Kubernetes

In this step, you will learn how to deploy TrilioVault for Kubernetes for OVHcloud Managed Kubernetes Cluster, and manage TVK installations via Helm. Backup data will be stored in the OVHcloud Object Storage bucket created earlier in the Requirements section.

TrilioVault for Kubernetes consists of the TVK Operator and the TrilioVault Manager (TVM) application.

The TrilioVault Operator is installable via Helm. It also installs the TrilioVault Manager CRD and creates a tvm custom resource. The TVK Operator handles the installation, post-configuration steps, and future upgrades of the Trilio application components.

Installing TrilioVault operator and manager using Helm

NOTE: This tutorial is using the Cluster installation type for the TVK application (applicationScope Helm value is set to “Cluster”). All examples from this tutorial rely on this type of installation to function properly.

Please follow the steps below to install TrilioVault via Helm:

First, clone the OVHcloud Docs Git repository and change the directory to your local copy:

git clone https://github.com/ovh/docs.git
cd docs/pages/platform/kubernetes-k8s/backup-and-restore-cluster-namespace-and-applications-with-trilio/

Next, add the TrilioVault Helm repository, and list the available charts:

helm repo add triliovault-operator http://charts.k8strilio.net/trilio-stable/k8s-triliovault-operator
helm repo update
helm search repo triliovault-operator

The output should look similar to the following:

NAME                                            CHART VERSION   APP VERSION     DESCRIPTION
triliovault-operator/k8s-triliovault-operator   2.9.3           2.9.3           K8s-TrilioVault-Operator is an operator designe...

The chart of interest is triliovault-operator/k8s-triliovault-operator, which installs the TrilioVault for Kubernetes Operator on the cluster, along with the TrilioVault Manager CRD.

You can override the values used by the TVK Operator installation with the --set option. Check the detailed instructions on the One-click Installation page. Install the TrilioVault for Kubernetes Operator using Helm:

helm install triliovault-operator triliovault-operator/k8s-triliovault-operator --namespace tvk --create-namespace

Now, please check your TVK deployment:

helm ls -n tvk

The output should look similar to the following (STATUS column should display “deployed”):

NAME                    NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                           APP VERSION
triliovault-manager-tvk tvk             1               2022-06-21 07:15:03.681891176 +0000 UTC deployed        k8s-triliovault-2.9.3           2.9.3
triliovault-operator    tvk             1               2022-06-21 07:13:18.731129339 +0000 UTC deployed        k8s-triliovault-operator-2.9.3  2.9.3

Next, verify that the TrilioVault-Operator and TrilioVault-Manager applications are up and running:

kubectl get deployments -n tvk

The output should look similar to the following (deployment pods must be in the Ready state):

NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
k8s-triliovault-admission-webhook               1/1     1            1           45d
k8s-triliovault-control-plane                   1/1     1            1           45d
k8s-triliovault-exporter                        1/1     1            1           45d
k8s-triliovault-ingress-nginx-controller        1/1     1            1           13d
k8s-triliovault-web                             1/1     1            1           45d
k8s-triliovault-web-backend                     1/1     1            1           45d
triliovault-operator-k8s-triliovault-operator   1/1     1            1           45d
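
Rather than re-running the command until every deployment is ready, you can let kubectl block until they all report the Available condition. A minimal sketch, assuming the tvk namespace used in this tutorial; the wrapper name `wait_for_tvk_deployments` and the 300s default timeout are choices made for this example.

```shell
# Blocks until every Deployment in the given namespace (default: tvk) reports
# the Available condition, or fails once the timeout (default: 300s) expires.
wait_for_tvk_deployments() {
  kubectl wait --for=condition=Available deployment --all \
    --namespace "${1:-tvk}" --timeout="${2:-300s}"
}

# Example (requires cluster access):
#   wait_for_tvk_deployments tvk 600s
```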

Now, please check your triliovaultmanagers CRDs and the tvm CR as well:

kubectl get crd | grep trilio

The output should look similar to the following:

backupplans.triliovault.trilio.io                     2022-06-21T07:39:38Z
backups.triliovault.trilio.io                         2022-06-21T07:39:38Z
clusterbackupplans.triliovault.trilio.io              2022-06-21T07:39:39Z
clusterbackups.triliovault.trilio.io                  2022-06-21T07:39:39Z
clusterrestores.triliovault.trilio.io                 2022-06-21T07:39:39Z
hooks.triliovault.trilio.io                           2022-06-21T07:39:39Z
licenses.triliovault.trilio.io                        2022-06-21T07:39:39Z
policies.triliovault.trilio.io                        2022-06-21T07:39:40Z
restores.triliovault.trilio.io                        2022-06-21T07:39:40Z
targets.triliovault.trilio.io                         2022-06-21T07:39:40Z
triliovaultmanagers.triliovault.trilio.io             2022-06-21T07:38:30Z

You can also check if the TVM Custom Resource is created.

kubectl get triliovaultmanagers -n tvk

The output looks similar to the following:

NAME                  TRILIOVAULT-VERSION   SCOPE     STATUS     RESTORE-NAMESPACES
triliovault-manager   2.9.3                 Cluster   Deployed

If the output looks like the one above, you have installed TVK successfully. Next, you will learn how to check the license type and validity, as well as how to renew a license.

TrilioVault application licensing

By default, when installing TVK via Helm, there is no Free Trial license generated. This tutorial will help you install the ‘Cluster’ scoped license, which is of the type ‘Basic’ for a cluster capacity of 500 CPUs and an expiration time of five years.

You can always go to the Trilio website and generate a new license for your cluster that suits your needs.

Installing TVK application licensing

Please run the commands below to download the license manifest and apply it to your cluster (the license is managed via the License CRD):

curl -LO https://raw.githubusercontent.com/ovh/docs/develop/pages/platform/kubernetes-k8s/backup-and-restore-cluster-namespace-and-applications-with-trilio/manifests/tvk_install_license.yaml
kubectl apply -f tvk_install_license.yaml -n tvk

Run the below command to verify if the license is successfully created for OVHcloud users:

kubectl get license -n tvk

The output looks similar to the following (notice the STATUS, which should be “Active”, as well as the license type in the EDITION column and EXPIRATION TIME):

NAMESPACE   NAME             STATUS   MESSAGE                                   CURRENT NODE COUNT   GRACE PERIOD END TIME   EDITION   CAPACITY   EXPIRATION TIME        MAX NODES
tvk         trilio-license   Active   Cluster License Activated successfully.   3                                            Basic     500        2027-06-21T00:00:00Z   3
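
If you want to check the license state from a script, the sketch below extracts the STATUS column from the table. The helper name `license_statuses` is hypothetical; it locates the column by its header, so it works whether or not the NAMESPACE column is printed.

```shell
# Reads `kubectl get license` output on stdin and prints the STATUS value
# of each license, locating the column from the header line.
license_statuses() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "STATUS") col = i; next }
       col { print $col }'
}

# Example (requires cluster access):
#   kubectl get license -n tvk | license_statuses
```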

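The license is managed via a special CRD, namely the License object. You can inspect it by running the following command (this example uses a license named test-license-1; replace it with the name returned by kubectl get license):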

kubectl describe license test-license-1 -n tvk

The output looks similar to the following (notice the Message and Capacity fields, as well as the Edition):

Name:         test-license-1
Namespace:    tvk
Labels:       <none>
Annotations:  generation: 1
              triliovault.trilio.io/creator: kubernetes-admin
              triliovault.trilio.io/instance-id: 46188ee1-8ce1-4c45-96fa-c262f2214ced
              triliovault.trilio.io/updater:
                [{"username":"system:serviceaccount:tvk:k8s-triliovault","lastUpdatedTimestamp":"2022-06-21T10:06:59.796280418Z"}]
API Version:  triliovault.trilio.io/v1
Kind:         License
Metadata:
  Creation Timestamp:  2022-06-21T10:56:14Z
...
  Current Node Count:  3
  Max Nodes:           3
  Message:             Cluster License Activated successfully.
  Properties:
    Active:                        true
    Capacity:                      500
    Company:                       OVHCloud License For Users
    Creation Timestamp:            2022-06-21T00:00:00Z
    Edition:                       Basic
    Expiration Timestamp:          2027-06-21T00:00:00Z
    Kube UID:                      46188ee1-8ce1-4c45-96fa-c262f2214ced
    License ID:                    TVAULT-4ddf3f72-d2ab-11ec-9a22-4b4849af53ee
    Maintenance Expiry Timestamp:  2027-06-21T00:00:00Z
    Number Of Users:               -1
    Purchase Timestamp:            2022-06-21T00:00:00Z
    Scope:                         Cluster
...

The above output will also tell you when the license is going to expire in the Expiration Timestamp field, and the Scope (Cluster based in this case). You can opt for a cluster-wide license type or for a namespace based license. More details can be found on the Trilio Licensing documentation page.

Renewing TVK application license

To renew the license, you will have to request a new one from the Trilio website by navigating to the licensing page. After completing the form, you should receive the License YAML manifest, which can be applied to your cluster using kubectl. The following commands assume that TVK is installed in the default tvk namespace (please replace the <> placeholders accordingly, where required):

kubectl apply -f <YOUR_LICENSE_FILE_NAME>.yaml -n tvk

You can check the new license status as you already learned via:

# List available TVK licenses first from the `tvk` namespace
kubectl get license -n tvk

# Get information about a specific license from the `tvk` namespace
kubectl describe license <YOUR_LICENSE_NAME_HERE> -n tvk

In the next step, you will learn how to define the storage backend for TrilioVault to store backups, called a target.

Step 2 - Creating a TrilioVault target to store backups

TrilioVault needs to know first where to store your backups. TrilioVault refers to the storage backend by using the target term, and it is managed via a special CRD named Target. The following target types are supported: S3™ and NFS. For OVHcloud, and the purpose of the tutorial, it makes sense to rely on the S3™ storage type because it’s cheap and scalable. To benefit from an enhanced level of protection, you can create multiple target types (for both S3™ and NFS), so that your data is kept safe in multiple places, thus achieving backup redundancy.

OVHcloud provides two types of S3™ compatible Object Storage solutions:

To create a Target for the OVHcloud Object Storage using S3 Swift API, use this link.
To create a Target for the S3™ compatible Object Storage, use this link.

Create an Object Storage user in the tab next to the Object Storage Container. Now, from Users and Roles, assign the Administrator privileges to the Object Storage user.

Next, create an Access Key and Secret Key to access the Object Storage Container using the Getting Started with the Swift S3 API tutorial.

If you have created a container with High Performance, follow the Getting Started with Object Storage documentation.

Save the Access Key and Secret Key used in the AWS CLI ~/.aws/credentials file. They are required to create the target secret later. Take note of the Object Storage endpoint URL (s3.endpoint_url) and the region name (region) provided in the AWS CLI ~/.aws/config file. You will need them to create a Target later.

To access Object Storage, each target needs to know the bucket credentials. A Kubernetes Secret must be created as well:

apiVersion: v1
kind: Secret
metadata:
  name: trilio-ovh-s3-target-secret
  namespace: tvk
type: Opaque
stringData:
  accessKey: <YOUR_OVH_OBJECT_STORAGE_BUCKET_ACCESS_KEY_ID_HERE>   # plain text: stringData values are not base64 encoded
  secretKey: <YOUR_OVH_OBJECT_STORAGE_BUCKET_SECRET_KEY_HERE>      # plain text: stringData values are not base64 encoded

Notice that the secret name is trilio-ovh-s3-target-secret.

It’s referenced by the spec.objectStoreCredentials.credentialSecret field of the Target CRD, explained below. The secret can be in the same namespace where TrilioVault was installed (defaults to tvk), or in another namespace of your choice. Just make sure that you reference the namespace correctly. Also, please make sure to protect the namespace where you store TrilioVault secrets via RBAC, for security reasons.
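
A note on encoding: Kubernetes accepts Secret values as plain text under stringData, or base64 encoded under data. If you choose the data form, the sketch below shows one way to encode a value. The helper name `encode_secret_value` is hypothetical, and `kubectl create secret generic --from-literal` performs this encoding for you.

```shell
# Encodes a value for the `data` field of a Secret. printf avoids the trailing
# newline that echo would add, which would otherwise become part of the key.
# tr removes the line wrapping that some base64 implementations apply.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example:
#   encode_secret_value "<YOUR_OVH_OBJECT_STORAGE_BUCKET_ACCESS_KEY_ID_HERE>"
```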

A typical Target definition looks like this:

apiVersion: triliovault.trilio.io/v1
kind: Target
metadata:
  name: trilio-ovh-s3-target
  namespace: tvk
spec:
  type: ObjectStore
  vendor: Other                               # e.g. `AWS` for AWS S3® Storage and `Other` for OVHcloud Object Storage
  enableBrowsing: true
  objectStoreCredentials:
    bucketName: <YOUR_OVH_OBJECT_STORAGE_BUCKET_NAME_HERE>
    region: <YOUR_OVH_OBJECT_STORAGE_BUCKET_REGION_HERE>    # e.g.: `us-east-va` region for OVHcloud Object Storage or `us-east-1` etc. for AWS S3®
    url: "https://s3.<REGION_NAME_HERE>.io.cloud.ovh.us"      # e.g.: `https://s3.us-east-va.io.cloud.ovh.us` for Object Storage Container in `us-east-va` region
    credentialSecret:
      name: trilio-ovh-s3-target-secret
      namespace: tvk
  thresholdCapacity: 10Gi

Explanation for the above configuration:

spec.type: Type of target for backup storage (Object Storage is an object store).
spec.vendor: Third-party storage vendor hosting the target (for OVHcloud Object Storage, you need to use “Other” instead of “AWS”).
spec.enableBrowsing: Enable browsing for the target to browse through the backups stored on it.
spec.objectStoreCredentials: Defines required credentials (via credentialSecret) to access the Object Storage, as well as other parameters such as bucket region and name.
spec.thresholdCapacity: Maximum threshold capacity to store backup data.
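
To avoid editing the placeholders by hand, the manifest can also be generated from two variables. The following is only a sketch built from the fields described above: the helper name `render_target_manifest` and the bucket name `my-bucket` in the example are hypothetical, and the URL pattern assumes the OVHcloud endpoint format shown in the Target definition.

```shell
# Prints a Target manifest for the bucket ($1) and region ($2).
# The field names mirror the Target definition shown in this tutorial.
render_target_manifest() {
  bucket=$1
  region=$2
  cat <<EOF
apiVersion: triliovault.trilio.io/v1
kind: Target
metadata:
  name: trilio-ovh-s3-target
  namespace: tvk
spec:
  type: ObjectStore
  vendor: Other
  enableBrowsing: true
  objectStoreCredentials:
    bucketName: ${bucket}
    region: ${region}
    url: "https://s3.${region}.io.cloud.ovh.us"
    credentialSecret:
      name: trilio-ovh-s3-target-secret
      namespace: tvk
  thresholdCapacity: 10Gi
EOF
}

# Example (requires cluster access):
#   render_target_manifest my-bucket us-east-va | kubectl apply -f -
```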

Steps to create a Target for TrilioVault:

First, change to the directory where the ovh/docs Git repository was cloned on your local machine:

cd docs/pages/public_cloud/containers_orchestration/managed_kubernetes/backup-and-restore-cluster-namespace-and-applications-with-trilio/

Next, create the Kubernetes secret containing your target Object Storage bucket credentials (please replace the <> placeholders accordingly):

kubectl create secret generic trilio-ovh-s3-target-secret \
  --namespace=tvk \
  --from-literal=accessKey="<YOUR_OVH_OBJECT_STORAGE_BUCKET_ACCESS_KEY_HERE>" \
  --from-literal=secretKey="<YOUR_OVH_OBJECT_STORAGE_BUCKET_SECRET_KEY_HERE>"

Then, inspect the Target manifest file provided in the docs repository (to edit it, use an editor of your choice, preferably with YAML lint support, such as VS Code):
```
cat manifests/triliovault-ovh-s3-target.yaml
```
Now, replace the <> placeholders according to your OVHcloud Object Storage Trilio bucket, like: bucketName, region, url, and credentialSecret.
Finally, save the manifest file and create the Target object using kubectl:
```
kubectl apply -f manifests/triliovault-ovh-s3-target.yaml
```

What happens next is that TrilioVault will spawn a worker job named trilio-ovh-s3-target-validator, responsible for validating your Object Storage bucket (like availability, permissions, etc.). If the job finishes successfully, the bucket is considered to be healthy or available, and the trilio-ovh-s3-target-validator job resource is deleted afterwards. If validation fails, the Object Storage target validator job is left up and running so that you can inspect the logs and find the possible issue.

Now, please go ahead and check if the Target resource created earlier is healthy:

kubectl get target trilio-ovh-s3-target -n tvk

The output looks similar to the following (notice the STATUS column value - should be “Available”, meaning it’s in a healthy state):

NAME                   TYPE          THRESHOLD CAPACITY   VENDOR   STATUS      BROWSING ENABLED
trilio-ovh-s3-target   ObjectStore   10Gi                 Other    Available   Enabled

If the output looks like the above, then you have configured the Object Storage target object successfully.
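
Validation can take a little while, so the target may not be Available on the first check. A minimal polling sketch follows; the function name `wait_for_target`, the default of 24 attempts, and the `POLL_INTERVAL` variable are assumptions made for this example.

```shell
# Polls the Target ($1) in the tvk namespace until its status shows Available.
# Tries $2 times (default 24), sleeping POLL_INTERVAL seconds (default 5)
# between attempts. Returns 1 if the target never becomes Available.
wait_for_target() {
  target=$1
  attempts=${2:-24}
  while [ "$attempts" -gt 0 ]; do
    if kubectl get target "$target" -n tvk --no-headers 2>/dev/null | grep -qw Available; then
      echo "target $target is Available"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "target $target did not become Available" >&2
  return 1
}

# Example (requires cluster access):
#   wait_for_target trilio-ovh-s3-target
```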

Hint: In case the target object fails to become healthy, you can inspect the logs from the trilio-ovh-s3-target-validator Pod to find the issue.

First, you need to find the target validator pod:

kubectl get pods -n tvk | grep trilio-ovh-s3-target-validator

The output looks similar to this:

trilio-ovh-s3-target-validator-tio99a-6lz4q 1/1     Running     0          104s

Now, fetch the logs data.

kubectl logs pod/trilio-ovh-s3-target-validator-tio99a-6lz4q -n tvk

The output looks similar to this (notice the exception as an example):

...
INFO:root:2022-06-21 09:06:50.595166: waiting for mount operation to complete.
INFO:root:2022-06-21 09:06:52.595772: waiting for mount operation to complete.
ERROR:root:2022-06-21 09:06:54.598541: timeout exceeded, not able to mount within time.
ERROR:root:/triliodata is not a mountpoint. We can't proceed further.
Traceback (most recent call last):
  File "/opt/tvk/datastore-attacher/mount_utility/mount_by_target_crd/mount_datastores.py", line 56, in main
    utilities.mount_datastore(metadata, datastore.get(constants.DATASTORE_TYPE), base_path)
  File "/opt/tvk/datastore-attacher/mount_utility/utilities.py", line 377, in mount_datastore
    mount_s3_datastore(metadata_list, base_path)
  File "/opt/tvk/datastore-attacher/mount_utility/utilities.py", line 306, in mount_s3_datastore
    wait_until_mount(base_path)
  File "/opt/tvk/datastore-attacher/mount_utility/utilities.py", line 328, in wait_until_mount
    base_path))
Exception: /triliodata is not a mountpoint. We can't proceed further.
...
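
The two steps above (finding the validator pod, then reading its logs) can be combined so that you do not have to copy the generated pod name. This is a sketch under the assumption that the validator pod name contains trilio-ovh-s3-target-validator, as in the output shown earlier; the function name `validator_logs` is hypothetical.

```shell
# Prints the logs of every pod in the tvk namespace whose name contains
# trilio-ovh-s3-target-validator, each preceded by a separator line.
validator_logs() {
  kubectl get pods -n tvk -o name | grep trilio-ovh-s3-target-validator |
    while read -r pod; do
      echo "--- $pod"
      # stdin is redirected so kubectl cannot consume the pod list.
      kubectl logs "$pod" -n tvk </dev/null
    done
}

# Example (requires cluster access):
#   validator_logs
```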

Next, you will discover the TVK web console, which is a really nice and useful addition, to help you manage backup and restore operations very easily.

Step 3 - Getting to know the TVK Web management console

While you can manage backup and restore operations from the CLI entirely via kubectl and CRDs, TVK provides a Web Management Console to accomplish the same operations via the GUI. The management console simplifies common tasks via point-and-click operations, provides better visualization and inspection of TVK cluster objects, and creates disaster recovery plans (DRPs).

The Helm based installation covered in Step 1 - Installing TrilioVault for Kubernetes already took care of installing the required components for the web management console.

Getting access to the TVK Web management console

To access the console and explore its features, you can use either the Load Balancer or the Node Port, or you can port forward the ingress-nginx-controller service for TVK.

First, you need to identify the ingress-nginx-controller service from the tvk namespace:

kubectl get svc -n tvk

The output looks similar to the following (search for the k8s-triliovault-ingress-nginx-controller line, and notice that it listens on port 80 in the PORT(S) column):

NAME                                                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
k8s-triliovault-admission-webhook                               ClusterIP   10.3.241.124   <none>        443/TCP                      45d
k8s-triliovault-ingress-nginx-controller                        NodePort    10.3.183.125   <none>        80:31879/TCP,443:31921/TCP   13d
k8s-triliovault-ingress-nginx-controller-admission              ClusterIP   10.3.20.89     <none>        443/TCP                      13d
k8s-triliovault-web                                             ClusterIP   10.3.56.86     <none>        80/TCP                       45d
k8s-triliovault-web-backend                                     ClusterIP   10.3.236.30    <none>        80/TCP                       45d
triliovault-operator-k8s-triliovault-operator-webhook-service   ClusterIP   10.3.8.249     <none>        443/TCP                      45d

If you are using the Load Balancer for the ingress-nginx-controller, then the output would look like:

NAME                                                            TYPE            CLUSTER-IP     EXTERNAL-IP      PORT(S)                      AGE
k8s-triliovault-admission-webhook                               ClusterIP       10.3.241.124   <none>           443/TCP                      45d
k8s-triliovault-ingress-nginx-controller                        LoadBalancer    10.3.183.125   51.222.45.171    80:31879/TCP,443:31921/TCP   13d
k8s-triliovault-ingress-nginx-controller-admission              ClusterIP       10.3.20.89     <none>           443/TCP                      13d
k8s-triliovault-web                                             ClusterIP       10.3.56.86     <none>           80/TCP                       45d
k8s-triliovault-web-backend                                     ClusterIP       10.3.236.30    <none>           80/TCP                       45d
triliovault-operator-k8s-triliovault-operator-webhook-service   ClusterIP       10.3.8.249     <none>           443/TCP                      45d

TVK is using an Nginx Ingress Controller to route traffic to the management web console services. Routing is host based, and the host name is ovh-k8s-tvk.demo.trilio.io as defined in the Helm values file from the ovh/docs:

# The host name to use when accessing the web console via the TVK ingress controller service
ingressConfig:
  host: "ovh-k8s-tvk.demo.trilio.io"

Having the above information on hand, please go ahead and edit the /etc/hosts file, and add this entry:

127.0.0.1 ovh-k8s-tvk.demo.trilio.io
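
If you script this step, it is worth making it idempotent so that repeated runs do not duplicate the entry. A minimal sketch; the function name `add_hosts_entry` is hypothetical, and the file path is a parameter so that you can try it on a copy before touching /etc/hosts.

```shell
# Appends "127.0.0.1 <host>" to a hosts file unless the host is already listed.
# $1 is the host name, $2 the hosts file (default: /etc/hosts).
add_hosts_entry() {
  host=$1
  hosts_file=${2:-/etc/hosts}
  if grep -qwF -- "$host" "$hosts_file" 2>/dev/null; then
    echo "$host already present in $hosts_file"
  else
    printf '127.0.0.1 %s\n' "$host" >> "$hosts_file"
  fi
}

# Example (writing to /etc/hosts itself requires root privileges):
#   add_hosts_entry ovh-k8s-tvk.demo.trilio.io
```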

Next, create the port forward for the TVK ingress controller service:

kubectl port-forward svc/k8s-triliovault-ingress-nginx-controller 8080:80 -n tvk

Finally, download the kubeconfig file for your OVHcloud Managed Kubernetes Cluster, available under the Service tab as Kubeconfig file. This step is required so that the web console can authenticate you using the kubeconfig file.

After following the above steps, you can access the console in your web browser by navigating to: http://ovh-k8s-tvk.demo.trilio.io:8080 (8080 is the local port opened by the port forward above). When asked for the kubeconfig file, please select the one that you downloaded in the previous step.

Please keep the downloaded kubeconfig file safe because it contains sensitive data.

Exploring the TVK Web Console user interface

Go ahead and explore each section from the left:

Cluster Management: This shows the list, including the primary cluster and other clusters having TVK instances, added to the primary OVHcloud cluster using the Multi-Cluster Management feature.
Backup & Recovery: This is the main dashboard, which gives you a general overview of the whole cluster (e.g., Discovered namespaces, Applications, Backup plans list, Targets, Hooks, Policies, etc.)
- Namespaces:
- Applications:
- Backup Plans:
- Targets:
- Scheduling Policy:
- Retention Policy:
Monitoring: This has two options: TrilioVault Monitoring and Velero Monitoring, if the user has Velero configured on their OVHcloud cluster.
- TrilioVault Monitoring: It shows the backup and restore summary of the Kubernetes cluster.
- Velero Monitoring:
Disaster Recovery: Allows you to manage and perform disaster recovery operations.

You can also see the Object Storage Target created earlier, by navigating to Backup & Recovery > Targets > Select the TVK Namespace from the dropdown on the top (in case of ovh/docs the TVK Namespace is tvk):

Going further, you can browse the target and list the available backups by clicking the Actions button on the right. Then, select the Launch Browser option from the pop-up menu (for this to work, the target must have the enableBrowsing flag set to true):

For more information and available features, please consult the TVK Web Management Console User Interface official documentation.

Next, you will learn how to perform backup and restore operations for specific use cases, like:

Specific namespace(s) backup and restore.
Whole cluster backup and restore.

Step 4 - Helm release backup and restore example

In this step, you will learn how to create a one-time backup for an entire Helm release from your OVHcloud Managed Kubernetes Cluster and restore it afterwards, making sure that all the resources related to the Helm release are recreated. The namespace in question is demo-backup-ns. TVK can also perform backups at other levels than Helm releases: complete namespaces, label-based applications, and Operator-based applications. The multi-namespace case is covered in Step 5.

Next, you will perform the following tasks:

Create the demo-backup-ns namespace and a mysql-qa Helm release for the MySQL database.
Perform a Helm release backup via Backup Plan and Backup CRDs.
Delete the mysql-qa Helm release.
Restore the mysql-qa Helm release via Restore CRD.
Check the mysql-qa Helm release resources restoration.

Creating mysql-qa helm release

kubectl create namespace demo-backup-ns
helm repo add stable https://charts.helm.sh/stable
helm repo update
helm install mysql-qa --set mysqlRootPassword=triliopass stable/mysql -n demo-backup-ns

To verify if the Helm release is deployed correctly, run the following command:

helm ls -n demo-backup-ns

The output should look similar to this:

NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART           APP VERSION
mysql-qa        demo-backup-ns  1               2022-06-21 08:23:01.849247691 +0000 UTC deployed        mysql-1.6.9     5.7.30

Next, verify that the mysql-qa deployment is up and running:

kubectl get deployments -n demo-backup-ns

The output should look similar to this:

NAME       READY   UP-TO-DATE   AVAILABLE   AGE
mysql-qa   1/1     1            1           2m5s

This shows that the mysql-qa Helm release is ready to be backed up.

Creating mysql-qa Helm release backup

To perform backups for a single application at the namespace level (or Helm release), a Backup Plan followed by a Backup CRD is required. A Backup Plan allows you to:

Specify a target where backups should be stored.
Define a set of resources to back up (e.g., namespace or Helm releases).
Set up encryption if you want to encrypt your backups on the target (this is a very nice feature for securing your backups' data).
Define schedules for full or incremental type backups.
Define retention policies for your backups.

TrilioVault for Kubernetes has created a few sample scheduling and retention policies for users. Users can create new policies or utilize the sample policies.

In other words, a Backup Plan is a definition of the “what”, “where”, and “how” of the backup process, but it doesn’t perform the actual backup. The Backup CRD is responsible for triggering the actual backup process, as dictated by the Backup Plan spec.

A typical Backup Plan CRD looks like the example below:

apiVersion: triliovault.trilio.io/v1
kind: BackupPlan
metadata:
  name: mysql-qa-helm-release-backup-plan
  namespace: demo-backup-ns
spec:
  backupConfig:
    target:
      name: trilio-ovh-s3-target
      namespace: tvk
  backupPlanComponents:
    helmReleases:
      - mysql-qa

Explanation for the above configuration:

spec.backupConfig.target.name: Tells TVK what target name to use for storing backups.
spec.backupConfig.target.namespace: Tells TVK in which namespace the target was created.
spec.backupPlanComponents: Defines the resources to back up (in this example, the mysql-qa Helm release).

A typical Backup CRD looks like the example below:

apiVersion: triliovault.trilio.io/v1
kind: Backup
metadata:
  name: mysql-qa-helm-release-full-backup
  namespace: demo-backup-ns
spec:
  type: Full
  backupPlan:
    name: mysql-qa-helm-release-backup-plan
    namespace: demo-backup-ns

Explanation for the above configuration:

spec.type: Specifies the backup type (e.g., Full or Incremental).
spec.backupPlan: Specifies the BackupPlan which this Backup should use.

Steps to initiate the mysql-qa Helm release one-time backup:

First, make sure that the mysql-qa is deployed in your cluster by following these steps.

Next, change to the directory where the docs Git repository was cloned on your local machine:

cd docs/pages/platform/kubernetes-k8s/backup-and-restore-cluster-namespace-and-applications-with-trilio/

Then, inspect the mysql-qa Helm release Backup Plan and Backup manifest files provided in that directory. You can print them with cat, as shown here, or open them in an editor of your choice (preferably one with YAML lint support, such as VS Code):
```
cat manifests/mysql-qa-helm-release-backup-plan.yaml
cat manifests/mysql-qa-helm-release-backup.yaml
```

Create the Backup Plan resource using kubectl:

kubectl apply -f manifests/mysql-qa-helm-release-backup-plan.yaml -n demo-backup-ns

Now, inspect the Backup Plan status (targeting the mysql-qa Helm release) using kubectl:

kubectl get backupplan mysql-qa-helm-release-backup-plan -n demo-backup-ns

The output should look similar to the following (notice the STATUS column value, which should be set to “Available”):

NAME                                TARGET                 ...   STATUS
mysql-qa-helm-release-backup-plan   trilio-ovh-s3-target   ...   Available

Finally, create a Backup resource using kubectl:

kubectl apply -f manifests/mysql-qa-helm-release-backup.yaml -n demo-backup-ns

Next, check the Backup object status using kubectl:

kubectl get backup mysql-qa-helm-release-full-backup -n demo-backup-ns

The output should look similar to the following (notice the STATUS column value, which should be set to “InProgress”, as well as the BACKUP TYPE set to “Full”):

NAME                                BACKUPPLAN                          BACKUP TYPE   STATUS       ...
mysql-qa-helm-release-full-backup   mysql-qa-helm-release-backup-plan   Full          InProgress   ...

After all the mysql-qa Helm release components finish uploading to the Object Storage target, you should get the following results:

kubectl get backup mysql-qa-helm-release-full-backup -n demo-backup-ns

The output should look similar to the following (notice that the STATUS changed to “Available”, and PERCENTAGE is “100”):

NAME                                BACKUPPLAN                          BACKUP TYPE   STATUS      ...   PERCENTAGE
mysql-qa-helm-release-full-backup   mysql-qa-helm-release-backup-plan   Full          Available   ...   100

If the output looks like this, you have successfully backed up the mysql-qa Helm release. You can go ahead and see how TrilioVault stores Kubernetes metadata by listing the TrilioVault Object Storage Bucket contents.
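Instead of re-running kubectl get by hand, you can poll until the backup finishes. The helper below is a sketch under stated assumptions: `wait_for_status` is a hypothetical name, the JSONPath `.status.status` is assumed to hold the value shown in the STATUS column, and `Failed` is assumed to be the terminal error state. Check the actual layout with `kubectl get backup <name> -o yaml` before relying on it.

```shell
# Hypothetical helper: poll a TVK custom resource until it reaches the wanted
# status. Assumption: the state shown in the STATUS column is exposed at
# .status.status, and "Failed" is the terminal error state.
wait_for_status() {
  kind="$1"; name="$2"; ns="$3"; want="$4"
  tries="${5:-60}"   # maximum number of polls
  delay="${6:-10}"   # seconds between polls
  i=0
  while [ "$i" -lt "$tries" ]; do
    status=$(kubectl get "$kind" "$name" -n "$ns" \
      -o jsonpath='{.status.status}' 2>/dev/null) || status=""
    if [ "$status" = "$want" ]; then
      echo "$kind/$name is $want"
      return 0
    fi
    if [ "$status" = "Failed" ]; then
      echo "$kind/$name reported Failed" >&2
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $kind/$name (last status: ${status:-unknown})" >&2
  return 1
}

# Example:
#   wait_for_status backup mysql-qa-helm-release-full-backup demo-backup-ns Available
```

The same function can be pointed at other TVK resources by changing the kind and the wanted status, for example a Restore object that should reach Completed.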

Finally, you can check that the backup is available in the web console as well by navigating to Backup & Recovery > Backup Plans and selecting the demo-backup-ns Namespace from the top drop-down menu (notice that it’s in the “Available” state, and the mysql-qa Helm release was backed up in the “Component Details” sub-view).

Deleting mysql-qa Helm release and resources

Now, go ahead and simulate a disaster by intentionally deleting the mysql-qa Helm release:

helm delete mysql-qa -n demo-backup-ns

Next, check that the namespace resources were deleted (listing should be empty):

kubectl get all -n demo-backup-ns

Restoring mysql-qa Helm release backup

Important notes:

If restoring into the same namespace, ensure that the original application components have been removed. In particular, make sure that the application's PVC is deleted.
If restoring to another cluster (migration scenario), ensure that TrilioVault for Kubernetes is running in the remote namespace/cluster as well. To restore into a new cluster (where the Backup CR does not exist), source.type must be set to location. Please refer to the Custom Resource Definition Restore Section to view a restore by location example.
When you delete the demo-backup-ns namespace, any Load Balancer resource associated with the mysql-qa service is deleted as well. When you restore the mysql-qa service, OVHcloud recreates the Load Balancer, but with a new IP address, so you will need to update the A records of the domains hosted on the cluster.
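The first note can be turned into a quick pre-restore check. This is a minimal sketch; `assert_no_pvcs` is a hypothetical helper name, and it only relies on `kubectl get pvc -o name` printing nothing on standard output when no PVC exists.

```shell
# Hypothetical pre-restore check: fail if any PersistentVolumeClaim is still
# present in the namespace you are about to restore into.
assert_no_pvcs() {
  ns="$1"
  pvcs=$(kubectl get pvc -n "$ns" -o name) || return 1
  if [ -n "$pvcs" ]; then
    echo "PVCs still present in $ns:" >&2
    echo "$pvcs" >&2
    return 1
  fi
  echo "no PVCs left in $ns"
}

# Example:
#   assert_no_pvcs demo-backup-ns && echo "safe to restore into demo-backup-ns"
```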

To restore a specific Backup, you need to create a Restore CRD. A typical Restore CRD looks like this:

apiVersion: triliovault.trilio.io/v1
kind: Restore
metadata:
  name: mysql-qa-helm-release-restore
  namespace: demo-restore-ns
spec:
  source:
    type: Backup
    backup:
      name: mysql-qa-helm-release-full-backup
      namespace: demo-backup-ns
  skipIfAlreadyExists: true

Explanation for the above configuration:

spec.source.type: Specifies what backup type to restore from.
spec.source.backup: Contains a reference to the backup object to restore from.
spec.skipIfAlreadyExists: Specifies whether to skip the restore of a resource if it already exists in the namespace being restored.

Restore allows you to restore the last successful Backup for an application. It is used to restore a single namespace or Helm release protected by the Backup CRD. The Backup CRD is identified by its name: mysql-qa-helm-release-full-backup.

First, inspect the Restore CRD example from the ovh/docs Git repository:

cat manifests/mysql-qa-helm-release-restore.yaml

Then, create the Restore resource using kubectl (the demo-restore-ns namespace referenced in the manifest must already exist):

kubectl apply -f manifests/mysql-qa-helm-release-restore.yaml

Finally, inspect the Restore object status:

kubectl get restore mysql-qa-helm-release-restore -n demo-restore-ns

The output should look similar to the following (notice the STATUS column set to Completed, as well as the PERCENTAGE COMPLETED set to 100):

NAME                            STATUS      DATA SIZE   START TIME             END TIME               PERCENTAGE COMPLETED   DURATION
mysql-qa-helm-release-restore   Completed   0           2022-06-21T15:06:52Z   2022-06-21T15:07:35Z   100                    43.524191306s

If the output looks like the one above, then the mysql-qa Helm release restoration process has completed successfully.

Verifying application integrity after restoration

Check that all the demo-restore-ns namespace resources are in place and running:

kubectl get all -n demo-restore-ns

The output looks similar to:

NAME                                                           READY   STATUS      RESTARTS   AGE
pod/mysql-qa-665f6fb548-m8tnd                                  1/1     Running     0          91m
pod/mysql-qa-helm-release-full-backup-metamover-9w7s0y-x8867   0/1     Completed   0          9m2s

NAME               TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/mysql-qa   ClusterIP   10.3.227.118   <none>        3306/TCP   91m

NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/mysql-qa   1/1     1            1           91m

NAME                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/mysql-qa-665f6fb548   1         1         1       91m

NAME                                                           COMPLETIONS   DURATION   AGE
job.batch/mysql-qa-helm-release-full-backup-metamover-9w7s0y   1/1           28s        9m2s

The next step deals with whole cluster backup and restore, thus covering a disaster recovery scenario.

Step 5 - Backup and restore whole cluster example

In this step, you will simulate a disaster recovery scenario. The whole OVHcloud Managed Kubernetes Cluster will be deleted, and then the important applications will be restored from a previous backup.

Next, you will perform the following tasks:

Create the multi-namespace backup, using a Cluster Backup Plan CRD that targets all important namespaces from your OVHcloud Managed Kubernetes Cluster.
Delete the OVHcloud Managed Kubernetes Cluster, using the OVHcloud Control Panel.
Create a new OVHcloud Managed Kubernetes Cluster, using the OVHcloud Control Panel.
Re-install TVK and configure the OVHcloud Object Storage bucket as Object Storage target (you’re going to use the same Object Storage bucket, where your important backups are stored).
Restore all the important applications by using the TVK web console.
Check the OVHcloud Managed Kubernetes Cluster application integrity.

Creating the OVHcloud Managed Kubernetes cluster backup using the TVK Multi-Namespace backup feature

The main idea here is to back up your OVHcloud Managed Kubernetes Cluster by including all important namespaces that hold your essential applications and configurations. Strictly speaking, this is not a full cluster backup and restore, but a multi-namespace backup and restore operation. In practice, this is usually all that’s needed, because application resources in Kubernetes are namespaced. You will also learn how to perform a cluster restore operation via location from the target. The same flow applies when you need to perform a cluster migration.

A typical Cluster Backup Plan manifest targeting multiple namespaces looks like this:

apiVersion: triliovault.trilio.io/v1
kind: ClusterBackupPlan
metadata:
  name: ovh-multi-ns-backup-plan
  namespace: default
spec:
  backupConfig:
    target:
      name: trilio-ovh-s3-target
      namespace: default
  backupComponents:
    - namespace: default
    - namespace: demo-backup-ns
    - namespace: backend
    - namespace: monitoring

Notice that kube-system (or other OVHcloud Managed Kubernetes Cluster related namespaces) is not included in the list. Usually, this is not required unless there is a special case requiring some settings to be persisted at that level.

A typical Cluster Backup manifest targeting multiple namespaces looks like this:

apiVersion: triliovault.trilio.io/v1
kind: ClusterBackup
metadata:
  name: multi-ns-backup
  namespace: default
spec:
  type: Full
  clusterBackupPlan:
    name: ovh-multi-ns-backup-plan
    namespace: default

Steps to initiate a backup for all of the important namespaces in your OVHcloud Managed Kubernetes Cluster:

First, go to the directory where the ovh/docs Git repository was cloned on your local machine:
```
cd docs/pages/platform/kubernetes-k8s/backup-and-restore-cluster-namespace-and-applications-with-trilio/
```
Then, open and inspect the Cluster Backup Plan and Cluster Backup manifest files provided in the docs repository.
```
cat manifests/multi-ns-backup-plan.yaml
cat manifests/multi-ns-backup.yaml
```

Create the Cluster Backup Plan resource using kubectl:

kubectl apply -f manifests/multi-ns-backup-plan.yaml

Now, inspect the Cluster Backup Plan status using kubectl:

kubectl get clusterbackupplan ovh-multi-ns-backup-plan -n default

The output should look similar to the following (notice the STATUS column value, which should be set to “Available”):

NAME                            TARGET                 ...      STATUS
ovh-multi-ns-backup-plan        trilio-ovh-s3-target   ...      Available

Finally, create the Cluster Backup resource using kubectl:

kubectl apply -f manifests/multi-ns-backup.yaml

Next, check the Cluster Backup status using kubectl:

kubectl get clusterbackup multi-ns-backup -n default

The output should look similar to the following (notice the STATUS column value, which should be set to “Available”, as well as the PERCENTAGE COMPLETE set to “100”):

NAME              BACKUPPLAN                 BACKUP TYPE   STATUS      ...   PERCENTAGE COMPLETE
multi-ns-backup   ovh-multi-ns-backup-plan   Full          Available   ...   100

If the output looks like the one above, then all of your important application namespaces were backed up successfully.

Please bear in mind that it may take a while for the full cluster backup to finish, depending on how many namespaces and associated resources are involved in the process.

You can also open the web console main dashboard and inspect the multi-namespace backup (notice how all the important namespaces that were backed up are highlighted in green, in a honeycomb structure).

Re-creating the OVHcloud Managed Kubernetes cluster and restoring applications

An important aspect to keep in mind is that whenever you destroy an OVHcloud Managed Kubernetes Cluster and then restore it, a new Load Balancer with a new external IP is created as well when TVK restores your ingress controller. So please make sure to update your A records accordingly.

Now, delete the whole OVHcloud Managed Kubernetes Cluster using the OVHcloud Control Panel.

Next, re-create the cluster as described in Creating an OVHcloud Managed Kubernetes Cluster.

To perform the restore operation, you need to install the TVK application as described in Step 1 - Installing TrilioVault for Kubernetes. Please make sure to use the same Helm Chart version - this is important!

After the installation finishes successfully, configure the TVK target as described in Step 2 - Creating a TrilioVault Target to Store Backups, and point it to the same OVHcloud Object Storage bucket where your backup data is located. Also, please make sure that target browsing is enabled.

Next, verify and activate a new license as described in the TrilioVault Application Licensing section.

To get access to the web console user interface, please consult the Getting Access to the TVK Web Management Console section.

Then, navigate to Backup & Recovery > Targets and select the TVK Namespace from the dropdown at the top (in the case of ovh/docs, the TVK Namespace is tvk).

Going further, browse the target and list the available backups by clicking the Actions button on the right. Then, select the Launch Browser option from the pop-up menu (for this to work, the target must have the enableBrowsing flag set to “true”).

Now click on the ovh-multi-ns-backup-plan item from the list, and then click and expand the multi-ns-backup item from the right sub-window, similar to:

To start the restore process, click the Restore button. A progress window will be displayed similar to the one below:

After a while, if the progress window looks like the one below, then the multi-namespace restore operation completed successfully.

Checking OVHcloud Managed Kubernetes cluster application state

First, verify all of the cluster Kubernetes resources (you should have everything in place):

kubectl get all --all-namespaces
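On a cluster with many namespaces, the full listing can be long. As a rough complement, the sketch below lists only pods whose phase is neither Running nor Succeeded, using the standard `status.phase` field selector; `list_unhealthy_pods` is a hypothetical helper name introduced here.

```shell
# Hypothetical helper: list pods in any namespace whose phase is neither
# Running nor Succeeded (for example Pending or Failed).
list_unhealthy_pods() {
  kubectl get pods --all-namespaces \
    --field-selector='status.phase!=Running,status.phase!=Succeeded'
}

# Example:
#   list_unhealthy_pods
```

An empty result is a good sign but not proof of health: a pod in the Running phase can still have containers that are not ready, so check the READY column of the full listing as well.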

In the next step, you will learn how to perform scheduled (or automatic) backups for your OVHcloud Managed Kubernetes Cluster applications.

Step 6 - Scheduled backups

The ability to make backups automatically on a schedule is a really useful feature to have. It allows you to rewind time and to restore the system to a previous working state if something goes wrong. This section provides an example of an automatic backup on a 15-minute schedule (targeting the default, demo-backup-ns, and backend namespaces).

By default, TrilioVault for Kubernetes creates a sample daily, weekly, and monthly scheduling policy after installation. Users can use the same scheduling policies if no changes are required. See the default values of the policies in the TVK UI scheduling policy:

First, you need to create a Policy CRD of the Schedule type that defines the backup schedule in cron format (same as Linux cron). Schedule policies can be used for either Backup Plan or Cluster Backup Plan CRDs. A typical schedule policy CRD looks like this (defines a 15-minute schedule):

kind: Policy
apiVersion: triliovault.trilio.io/v1
metadata:
  name: scheduled-backup-every-15min
  namespace: default
spec:
  type: Schedule
  scheduleConfig:
    schedule:
      - "*/15 * * * *" # trigger every 15 minutes
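To see what a step value in the minute field means, the sketch below expands `*/STEP` into the minutes of the hour at which cron fires, following standard cron semantics. `cron_step_minutes` is a hypothetical helper written for this illustration; it is not something TVK provides.

```shell
# Hypothetical helper: print the minutes of the hour matched by a cron
# minute field of the form "*/STEP" (standard cron step semantics).
cron_step_minutes() {
  step="$1"
  [ "$step" -ge 1 ] || return 1   # guard against a zero or negative step
  m=0
  out=""
  while [ "$m" -le 59 ]; do
    out="${out}${out:+ }${m}"
    m=$((m + step))
  done
  echo "$out"
}

cron_step_minutes 15   # prints: 0 15 30 45
```

A step that does not divide 60 evenly gives uneven gaps: `*/25` matches minutes 0, 25, and 50, so the last gap before the next hour is only 10 minutes.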

Next, you can apply the schedule policy to a Cluster Backup Plan CRD, for example, as seen here:

apiVersion: triliovault.trilio.io/v1
kind: ClusterBackupPlan
metadata:
  name: multi-ns-backup-plan-15min-schedule
  namespace: default
spec:
  backupConfig:
    target:
      name: trilio-ovh-s3-target
      namespace: default
    schedulePolicy:
      fullBackupPolicy:
        name: scheduled-backup-every-15min
        namespace: default
  backupComponents:
    - namespace: default
    - namespace: demo-backup-ns
    - namespace: backend

Looking at the file above, you will notice that it’s a basic Cluster Backup Plan CRD, referencing the Policy CRD defined earlier via the spec.backupConfig.schedulePolicy field. You can have separate policies created for full or incremental backups. Hence, the fullBackupPolicy or incrementalBackupPolicy can be specified in the spec.

Now, please go ahead and create the schedule Policy, using the sample manifest provided by the ovh/docs tutorial (make sure to change to the directory where the ovh/docs Git repository was cloned on your local machine):

kubectl apply -f manifests/triliovault-scheduling-policy-every-15min.yaml

Check that the policy resource was created:

kubectl get policies -n default

The output should look similar to this (notice the POLICY type set to Schedule):

NAMESPACE   NAME                           POLICY     DEFAULT
default     scheduled-backup-every-15min   Schedule   false

Next, create the Cluster Backup Plan resource that references the schedule policy:

kubectl apply -f manifests/triliovault-multi-ns-backup-plan-every-15min.yaml

Check the scheduled backup plan status for default:

kubectl get clusterbackupplan multi-ns-backup-plan-15min-schedule -n default

The output looks similar to the following (notice the FULL BACKUP POLICY value set to the previously created scheduled-backup-every-15min policy resource, as well as the STATUS, which should be “Available”):

NAME                                  TARGET                 ...   FULL BACKUP POLICY             STATUS
multi-ns-backup-plan-15min-schedule   trilio-ovh-s3-target   ...   scheduled-backup-every-15min   Available

Finally, create the Cluster Backup resource, which triggers the scheduled backups:

kubectl apply -f manifests/triliovault-multi-ns-backup-every-15min.yaml

Check the scheduled backup status for default:

kubectl get clusterbackup multi-ns-backup-15min-schedule -n default

The output looks similar to (notice the BACKUPPLAN value set to the previously created backup plan resource, as well as the STATUS, which should be “Available”):

NAME                             BACKUPPLAN                            BACKUP TYPE   STATUS      ...
multi-ns-backup-15min-schedule   multi-ns-backup-plan-15min-schedule   Full          Available   ...

Now you can check that backups are performed on a regular interval (15 minutes) by querying the cluster backup resource and inspecting the START TIME column (kubectl get clusterbackup -n default). It should reflect the 15-minute delta.

In the next step, you will learn how to set up a retention policy for your backups.

Step 7 - Backups retention policy

The retention policy allows you to define the number of backups to retain and the cadence to delete backups as per compliance requirements. The retention policy CRD provides a simple YAML specification to define the number of backups to retain in terms of days, weeks, months, years, latest, etc.

By default, TrilioVault for Kubernetes creates the sample retention policy sample-ret-policy after installation. Users can use the same retention policy if no changes are required. See the default values of the policy in the TVK UI Retention policy:

Using retention policies

Retention policies can be used for either Backup Plan or Cluster Backup Plan CRDs. A typical Policy manifest for the Retention type looks like this:

apiVersion: triliovault.trilio.io/v1
kind: Policy
metadata:
  name: sample-ret-policy
spec:
  type: Retention
  retentionConfig:
    latest: 2
    weekly: 1
    dayOfWeek: Wednesday
    monthly: 1
    dateOfMonth: 15
    monthOfYear: March
    yearly: 1

Explanation for the above configuration:

spec.type: Defines policy type. Can be Retention or Schedule.
spec.retentionConfig: Describes the retention configuration, such as what interval to use for backup retention and how many backups to retain.
spec.retentionConfig.latest: Maximum number of the latest backups to be retained.
spec.retentionConfig.weekly: Maximum number of backups to be retained in a week.
spec.retentionConfig.dayOfWeek: Day of the week to maintain weekly backups.
spec.retentionConfig.monthly: Maximum number of backups to be retained in a month.
spec.retentionConfig.dateOfMonth: Date of the month to maintain monthly backups.
spec.retentionConfig.monthOfYear: Month of the backup to retain for yearly backups.
spec.retentionConfig.yearly: Maximum number of backups to be retained in a year.

The above retention policy translates to:

On a weekly basis, keep one backup each Wednesday.
On a monthly basis, keep one backup on the 15th day.
On a yearly basis, keep one backup every March.
Overall, always keep the 2 most recent backups.
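To make the `latest` rule concrete, the sketch below marks which backups that rule alone would keep. It is a simplified illustration, not TVK's retention logic: `mark_latest` is a hypothetical helper, and a backup it marks as "prune" could still be retained in TVK by the weekly, monthly, or yearly rules.

```shell
# Hypothetical sketch of the `latest: N` rule only: read backup timestamps
# (one ISO 8601 UTC value per line) and mark the N newest as kept.
# Fixed-format ISO 8601 UTC strings sort chronologically as plain text.
mark_latest() {
  n="$1"
  sort -r | awk -v n="$n" '{
    action = (NR <= n) ? "keep" : "prune"
    print action, $0
  }'
}

# Example with three backups taken 15 minutes apart and latest: 2
printf '%s\n' \
  2022-06-21T15:00:00Z \
  2022-06-21T15:30:00Z \
  2022-06-21T15:15:00Z | mark_latest 2
```

With `latest: 2`, the 15:30 and 15:15 backups are marked "keep" and the 15:00 backup is marked "prune".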

The basic flow for creating a retention policy resource goes the same way as with scheduled backups. You need a Backup Plan or a Cluster Backup Plan CRD defined to reference the retention policy, and then you need to have a Backup or Cluster Backup object to trigger the process.

A typical Cluster Backup Plan configuration with retention set looks like this:

apiVersion: triliovault.trilio.io/v1
kind: ClusterBackupPlan
metadata:
  name: multi-ns-backup-plan-15min-schedule-retention
  namespace: default
spec:
  backupConfig:
    target:
      name: trilio-ovh-s3-target
      namespace: default
    retentionPolicy:
      name: sample-ret-policy
      namespace: default
  backupComponents:
    - namespace: default
    - namespace: backend

Once you apply the Cluster Backup Plan, you can check it using the following command:

kubectl get clusterbackupplan -n default

The output should look similar to the example below:

NAME                                            TARGET                 RETENTION POLICY    ...      STATUS
multi-ns-backup-plan-15min-schedule-retention   trilio-ovh-s3-target   sample-ret-policy   ...      Available

Notice that it uses a retentionPolicy field to reference the policy in question. Of course, you can have a backup plan that has both types of policies set, so that it can perform scheduled backups as well as deal with retention strategies.

Using cleanup policies

Having so many TVK resources (each one responsible for various operations such as scheduled backups, retention, etc.), something is likely to go wrong at some point in time. It means that some of the previously enumerated operations might fail due to various reasons, such as inaccessible storage, network issues for NFS, etc.

So, what happens is that your OVHcloud Managed Kubernetes Cluster will get crowded with many Kubernetes objects in a failed state.

You need a way to garbage collect all those objects in the end and release associated resources to avoid trouble in the future. Meet the Cleanup Policy CRD:

apiVersion: triliovault.trilio.io/v1
kind: Policy
metadata:
  name: garbage-collect-policy
  namespace: tvk
spec:
  type: Cleanup
  cleanupConfig:
    backupDays: 5

The above cleanup policy must be defined in the TVK install namespace. A cron job is then created automatically; it runs every 30 minutes and deletes failed backups based on the backupDays value specified in the spec.cleanupConfig field.

This is a very neat feature that TVK provides to help you deal with this kind of situation.

Conclusion

In this tutorial, you learned how to perform one-time and scheduled backups and how to restore from them. Scheduled backups are important because they let you revert to a previous point in time if something goes wrong. You also walked through a disaster recovery scenario. Finally, backup retention matters because storage is finite and can get expensive if too many backups are kept.

All the basic tasks and operations explained in this tutorial are meant to give you a basic introduction and understanding of what TrilioVault for Kubernetes is capable of. You can learn more about TrilioVault for Kubernetes and other interesting (or useful) topics by following the links below:

TVK CRD API documentation
How to Integrate Pre/Post Hooks for Backup Operations, with examples given for various databases
Immutable Backups, which restrict backups on the target storage from being overwritten
Helm Releases Backup, which shows examples for Helm release backup strategies
Backups Encryption, which explains how to encrypt and protect sensitive data on the target (storage)
Disaster Recovery Plan
Multi-Cluster Management
Restore Transforms
Velero Integration to Monitor Velero Backups

For more information and tutorials, please see our other Managed Kubernetes or Platform as a Service guides. You can also explore the guides for other OVHcloud products and services.

If you need training or technical assistance to implement our solutions, contact your sales representative or click on this link to get a quote and ask our Professional Services experts for a custom analysis of your project.

*S3 is a trademark filed by Amazon Technologies, Inc. OVHcloud's service is not sponsored by, endorsed by, or otherwise affiliated with Amazon Technologies, Inc.
