How we migrated our GKE cluster to another region

For a number of business reasons, we decided to move one of our existing Kubernetes clusters to a new geographic region.

What worried us most was the persistent volumes attached to the MySQL instances that our test environments run in k8s.

There isn’t a straightforward way to do this. One common approach is to take a snapshot of etcd, but on GKE the control plane is managed for us and we have no access to etcd, so that was out of the question. Luckily, we found Ark.

Ark is a disaster recovery tool for Kubernetes clusters. It can back up a whole cluster and restore it with a single command. It can also run backups on a schedule, and it takes care of persistent volumes. It has good documentation, so setting it up would have been a breeze if not for a bug with RBAC in GKE.

Download

A simple git clone git@github.com:heptio/ark.git was all it took to download Ark. Its master branch is updated frequently and is not stable, so the maintainers recommend checking out the latest tagged version. At the time of writing, the latest release is v0.9.5.
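As a minimal sketch of pinning the checkout to a tagged release, the helper below clones straight to a tag. The function name fetch_ark is hypothetical, and it assumes HTTPS access to the repository rather than SSH.

```shell
# Clone Ark with the working tree pinned to a tagged release instead of master.
# $1 is the tag to check out; it defaults to the release named in this article.
fetch_ark() {
    tag="${1:-v0.9.5}"
    # --branch also accepts a tag name and leaves the clone on a detached HEAD.
    git clone --branch "$tag" https://github.com/heptio/ark.git ark
}
```

Calling fetch_ark with no arguments gives a local ark directory at v0.9.5.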

Setting it up

Ark works by creating custom resources in k8s for its operations, all conveniently defined in a single yaml file.
I had to kubectl apply that yaml file to both the American cluster and the shiny new European cluster we were moving to.
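A rough sketch of applying the same file to both clusters without switching contexts by hand. The function name and the context names in the example call are hypothetical, and the path examples/common/00-prereqs.yaml is an assumption about the v0.9 repository layout, so check it against your checkout.

```shell
# Apply Ark's prerequisite resources to every cluster given as an argument.
# Each argument is a kubectl context name; run this from the root of the Ark repo.
apply_ark_prereqs() {
    for ctx in "$@"; do
        # Assumed path for the single yaml file holding Ark's custom resources.
        kubectl --context "$ctx" apply -f examples/common/00-prereqs.yaml || return 1
    done
}

# Example (context names are placeholders):
#   apply_ark_prereqs us-cluster-context eu-cluster-context
```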

This is where the RBAC bug on GKE appears:
User "<your-google-account-email>" cannot create clusterrolebindings.rbac.authorization.k8s.io at the cluster scope: No policy matched.

To work around this, I had to grant my Google account the cluster-admin role in both clusters:
kubectl create clusterrolebinding paul-cluster-admin-binding --clusterrole=cluster-admin --user=<your-google-account-email>

Ironically, this command fails with the same error unless your account has the Owner IAM role.

Apparently, this is a known issue on GKE:

Because of the way Container Engine checks permissions when you create a Role or ClusterRole, you must first create a RoleBinding that grants you all of the permissions included in the role you want to create. An example workaround is to create a RoleBinding that gives your Google identity a cluster-admin role before attempting to create additional Role or ClusterRole permissions. This is a known issue in the Beta release of Role-Based Access Control in Kubernetes and Container Engine version 1.6.
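The workaround can be sketched as a small helper that looks up the active gcloud account instead of typing the email by hand. The function name and the default binding name are hypothetical. It assumes gcloud is already authenticated as the account you want to bind, and it still needs sufficient IAM rights to succeed.

```shell
# Bind the currently active gcloud account to the cluster-admin role.
# $1 is an optional name for the ClusterRoleBinding.
grant_cluster_admin() {
    binding="${1:-cluster-admin-binding}"
    # Ask gcloud which account is active rather than hard-coding an email.
    account="$(gcloud config get-value account)" || return 1
    kubectl create clusterrolebinding "$binding" \
        --clusterrole=cluster-admin \
        --user="$account"
}
```

This has to be run once per cluster, with the kubectl context pointing at each cluster in turn.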

Cloud Storage Bucket

Persistent volumes are backed up as disk snapshots. Everything else in a backup is stored in a cloud storage bucket. This bucket should be exclusive to Ark because each backup is stored in its own subdirectory in the bucket’s root. A service account is needed to authorize Ark to upload files into the bucket.

Service account

I created a service account just for Ark to use. It needs read and write access to the bucket. In GKE, persistent volumes are just disks attached to the nodes, so I had to give it permissions for those too. These are the permissions given to the service account:

     compute.disks.get
     compute.disks.create
     compute.disks.createSnapshot
     compute.snapshots.get
     compute.snapshots.create
     compute.snapshots.useReadOnly
     compute.snapshots.delete
     compute.projects.get
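One way to grant these permissions, sketched under a few assumptions: the service account already exists, the custom role ID heptio_ark.server and its title are names picked for illustration, and objectAdmin on the bucket is used as the read and write access mentioned above. The function name grant_ark_access is hypothetical.

```shell
# The permissions listed above, comma-separated as gcloud expects them.
ARK_PERMISSIONS="compute.disks.get,compute.disks.create,compute.disks.createSnapshot,compute.snapshots.get,compute.snapshots.create,compute.snapshots.useReadOnly,compute.snapshots.delete,compute.projects.get"

# $1 = project id, $2 = bucket name, $3 = service account email
grant_ark_access() {
    project="$1"; bucket="$2"; sa_email="$3"
    # Create a custom role carrying only the disk and snapshot permissions.
    gcloud iam roles create heptio_ark.server \
        --project "$project" \
        --title "Heptio Ark Server" \
        --permissions "$ARK_PERMISSIONS" &&
    # Bind that role to the service account at the project level.
    gcloud projects add-iam-policy-binding "$project" \
        --member "serviceAccount:$sa_email" \
        --role "projects/$project/roles/heptio_ark.server" &&
    # Let the service account read and write objects in the backup bucket.
    gsutil iam ch "serviceAccount:$sa_email:objectAdmin" "gs://$bucket"
}
```

A custom role keeps the service account limited to the listed permissions rather than a broader predefined Compute role.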

The Ark server config

At this point the bucket has been created and Ark has been allowed to upload to it. Now Ark needs to know which bucket to use, which is set in the Ark Config (a custom resource defined by Ark):

# examples/gcp/00-ark-config.yaml
...
backupStorageProvider:
  name: gcp
  bucket: neso-cluster-backup
...

The Ark server Deployment

To hand the service account over to Ark, a k8s secret named cloud-credentials containing the service account key has to be created.

# download service account key
gcloud iam service-accounts keys create ark-svc-account \
     --iam-account $SERVICE_ACCOUNT_EMAIL

# create secret
kubectl create secret generic cloud-credentials \
    --namespace heptio-ark \
    --from-file cloud=ark-svc-account

In the Ark Deployment yaml file, there wasn’t anything that needed to be changed. All that’s left to start the server is to kubectl apply the Config and the Deployment.
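A short sketch of that last step. The file name examples/gcp/10-deployment.yaml and the Deployment name ark are assumptions about the v0.9 example manifests, so check them against your checkout. The function name is hypothetical.

```shell
# Apply the Ark Config and Deployment, then wait for the server to roll out.
# Run from the root of the Ark repo with kubectl pointing at the target cluster.
start_ark_server() {
    kubectl apply -f examples/gcp/00-ark-config.yaml &&
    kubectl apply -f examples/gcp/10-deployment.yaml &&
    # Blocks until the Deployment's pods are ready, or fails if the rollout does.
    kubectl --namespace heptio-ark rollout status deployment/ark
}
```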

Generating a backup

With everything set up on both clusters and the Ark client installed locally, it was time to put Ark to the test. After making sure kubectl's context was set to the US cluster, and with fingers crossed, we generated the backup:

$ ark backup create us-cluster --exclude-namespaces kube-system,kube-public,heptio-ark

We gave it a few minutes and then:

$ ark backup get
NAME                           STATUS      CREATED                         EXPIRES   SELECTOR
us-cluster                     Completed   2018-09-21 15:59:35 +0800 +08   30d       <none>
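Instead of re-running the command by hand, the wait can be scripted. This is only a sketch: the function names are hypothetical, and it reads the STATUS column by position, assuming the table layout shown above.

```shell
# Print the STATUS column for one backup, based on the `ark backup get` table.
backup_status() {
    ark backup get | awk -v name="$1" '$1 == name { print $2 }'
}

# Poll until the backup reports Completed or the attempts run out.
# $1 = backup name, $2 = max attempts (default 30), $3 = seconds between attempts (default 10)
wait_for_backup() {
    name="$1"; attempts="${2:-30}"; delay="${3:-10}"
    while [ "$attempts" -gt 0 ]; do
        [ "$(backup_status "$name")" = "Completed" ] && return 0
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}
```

Note that it treats any status other than Completed, including a failed backup or one that is not listed yet, as "keep waiting" until the attempts are used up.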

Restoring the backup

The backup includes all the resources from pods to ingresses. We wanted to keep the IP addresses we used in the old cluster. To free up the IP addresses, down go the ingresses in the old cluster.

Next, we set the kubectl context to the new cluster in Europe. It took a while for the cluster to see the backup, but it did appear eventually:

$ ark backup get
NAME                           STATUS      CREATED                         EXPIRES   SELECTOR
us-cluster                     Completed   2018-09-21 15:59:35 +0800 +08   30d       <none>

$ ark restore create --from-backup us-cluster

Ark was able to restore everything except the persistent volumes, so our applications could not connect to their databases. On closer inspection, Ark had created the disks, but in the region where the backups were taken rather than in the new cluster's region. The maintainers are aware of this issue and added a fix for it in v0.10.0.

We couldn’t wait for that release, though. We had no choice but to move the databases out of k8s. We ultimately decided to spin up a CloudSQL instance and put our test environments’ databases there.
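A quick way to spot disks that landed in the wrong place is to list each disk with its zone and filter by region prefix. The sketch below assumes zonal disks and uses a hypothetical function name. The region is whatever your new cluster runs in.

```shell
# List disks whose zone does not belong to the expected region.
# $1 = expected region, e.g. europe-west1 (zones look like europe-west1-b)
disks_outside_region() {
    region="$1"
    gcloud compute disks list --format="value(name,zone.basename())" |
        # Keep rows whose zone does not start with "<region>-".
        awk -v region="$region" 'index($2, region "-") != 1 { print $1, $2 }'
}
```

An empty result means every disk is in a zone of the expected region.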

Conclusion

Ark is an awesome tool. Although the migration did not go as smoothly as it should have, some good came out of it. It forced us to move our databases out of Kubernetes, where we shouldn’t have been running them in the first place. Also, we now have regular backups of our new cluster.
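For reference, a regular backup can be set up with ark schedule create. This is a sketch, not our exact configuration: the schedule name, the 01:00 daily cron expression and the function name are placeholders, and the excluded namespaces simply mirror the one-off backup above.

```shell
# Create a recurring backup that skips the system and Ark namespaces.
# $1 is an optional name for the schedule.
schedule_daily_backup() {
    name="${1:-daily-cluster}"
    # The cron expression "0 1 * * *" runs once a day at 01:00.
    ark schedule create "$name" \
        --schedule "0 1 * * *" \
        --exclude-namespaces kube-system,kube-public,heptio-ark
}
```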

Paul Iway

Senior Developer