Instructions for multi-region deployment on regional GKE Kubernetes clusters

Hi

Can you please provide instructions on how to deploy CockroachDB across multiple regional GKE clusters? The documentation only covers the older, zonal type of GKE cluster, so it's out of date.

Ref: https://www.cockroachlabs.com/docs/v19.1/orchestrate-cockroachdb-with-kubernetes-multi-cluster.html#main-content

Hi Jared,

Sorry that our docs don’t cover regional clusters yet. Fortunately, I don’t think that the steps for regional GKE clusters should be too different than for zonal GKE clusters.

You may already have regional GKE clusters running. If not, you can use a command like the one below to create one per region:

gcloud container clusters create crdb --region=us-east1 --num-nodes=1

This will create a regional cluster with one node per zone, so this command will actually create a three-node cluster.
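To make the per-region step concrete, here is a minimal Python sketch that only builds (it does not run) one such gcloud command per region and counts the resulting nodes. The region list, the helper names, and the assumption of three zones per regional cluster are illustrative, so check them against your own project before relying on the count.

```python
# Sketch: build one "gcloud container clusters create" command per region.
# REGIONS, ZONES_PER_REGION and the helper names are assumptions for illustration.
REGIONS = ["us-east1", "us-west1", "us-central1"]
ZONES_PER_REGION = 3  # assumed: a regional cluster spans three zones by default


def create_cluster_command(region, name="crdb", nodes_per_zone=1):
    """Return the gcloud argv that creates a regional cluster in `region`."""
    return [
        "gcloud", "container", "clusters", "create", name,
        "--region={}".format(region),
        "--num-nodes={}".format(nodes_per_zone),
    ]


def total_nodes(regions, nodes_per_zone=1, zones_per_region=ZONES_PER_REGION):
    """--num-nodes is counted per zone, so multiply by zones and by regions."""
    return len(regions) * zones_per_region * nodes_per_zone


if __name__ == "__main__":
    for region in REGIONS:
        print(" ".join(create_cluster_command(region)))
    print("total nodes: {}".format(total_nodes(REGIONS)))
```

Printing the commands first makes it easy to review them before pasting them into a shell.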

Starting from step 2 in https://www.cockroachlabs.com/docs/v19.1/orchestrate-cockroachdb-with-kubernetes-multi-cluster.html#step-2-start-cockroachdb, you should change the contexts variable in the setup.py file to look something like the below. Notice that we’ve dropped the zone suffixes from the context keys. You will need to replace gke_cockroach-shared_us-east1_cockroachdb1 and the other values with the appropriate contexts for your Kubernetes clusters.

contexts = {
    'us-east1': 'gke_cockroach-shared_us-east1_cockroachdb1',
    'us-west1': 'gke_cockroach-shared_us-west1_cockroachdb2',
    'us-central1': 'gke_cockroach-shared_us-central1_cockroachdb3',
}
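
If you would rather not type the map by hand, the sketch below derives the region keys from the context names. It assumes the contexts follow GKE's gke_<project>_<location>_<cluster> naming, as the example values above do; region_of and build_contexts are hypothetical helpers, not part of setup.py.

```python
# Sketch: build the region -> context map from GKE-style context names.
# Assumes names look like "gke_<project>_<location>_<cluster>".
# region_of and build_contexts are hypothetical helpers, not part of setup.py.


def region_of(context_name):
    """Extract the location (here, the region) field from a context name."""
    parts = context_name.split("_")
    if len(parts) != 4 or parts[0] != "gke":
        raise ValueError("not a GKE-style context name: %r" % context_name)
    return parts[2]


def build_contexts(context_names):
    """Map each region to its context name, rejecting duplicate regions."""
    contexts = {}
    for name in context_names:
        region = region_of(name)
        if region in contexts:
            raise ValueError("more than one context for region %s" % region)
        contexts[region] = name
    return contexts


if __name__ == "__main__":
    names = [
        "gke_cockroach-shared_us-east1_cockroachdb1",
        "gke_cockroach-shared_us-west1_cockroachdb2",
        "gke_cockroach-shared_us-central1_cockroachdb3",
    ]
    for region, context in sorted(build_contexts(names).items()):
        print("{} -> {}".format(region, context))
```

The input names can be copied from the output of kubectl config get-contexts -o name, which helps avoid a mistyped context name.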

You can leave the regions value set to {}.

In my testing just now, I also needed to add a firewall rule to allow internal traffic before multiregion traffic worked:

gcloud compute firewall-rules create allow-internal --direction=INGRESS --priority=1000 --network=default --action=ALLOW --rules=all --source-ranges=10.0.0.0/8

Now, running the setup.py file should be enough to get you a working multiregion cluster. If you have any difficulties with that, please let us know.

Hi Joel

Thanks for the fast reply. All I was missing was how to correctly configure the Python script to deploy into both of my GKE clusters.

Instead of adding the ingress firewall rule for 10.0.0.0/8, I’ve already added one for both of the regional CIDR ranges that GCP automatically creates as part of the GCP project.
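
For anyone wanting to do the same, here is a rough Python sketch of building the narrower rule. It checks that each range sits inside 10.0.0.0/8 and then assembles the same gcloud flags as the command earlier in this thread. The two CIDR values are placeholders, not my real ranges, and firewall_command is a hypothetical helper. It is an assumption worth verifying that the pod IP ranges of each cluster may need to be listed too, not only the subnet ranges.

```python
# Sketch: build a firewall rule limited to specific internal CIDR ranges.
# The example ranges below are placeholders; use the ranges from your project.
# firewall_command is a hypothetical helper. Needs Python 3.7+ for subnet_of.
import ipaddress

BROAD_RANGE = ipaddress.ip_network("10.0.0.0/8")


def firewall_command(source_ranges, name="allow-internal", network="default"):
    """Return the gcloud argv for an ingress rule limited to source_ranges."""
    checked = []
    for cidr in source_ranges:
        net = ipaddress.ip_network(cidr)  # raises ValueError on a malformed CIDR
        if net.version != 4 or not net.subnet_of(BROAD_RANGE):
            raise ValueError("%s is outside %s" % (net, BROAD_RANGE))
        checked.append(str(net))
    return [
        "gcloud", "compute", "firewall-rules", "create", name,
        "--direction=INGRESS", "--priority=1000",
        "--network={}".format(network),
        "--action=ALLOW", "--rules=all",
        "--source-ranges={}".format(",".join(checked)),
    ]


if __name__ == "__main__":
    example_ranges = ["10.142.0.0/20", "10.138.0.0/20"]  # placeholder values
    print(" ".join(firewall_command(example_ranges)))
```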

I will let you know if I have any further difficulties or questions.

Hi Joel

What I really want to be able to do is:

Deploy a multi-region CockroachDB cluster into an existing namespace. We have a unique namespace per environment, so we would want a unique CockroachDB cluster per environment too.

The dry-run mode I mentioned would be very helpful for us to make this happen.

Hi Joel

Good news.

** Successful deployment **
I was able to stand up CockroachDB in both of my GKE clusters, and it was a fairly quick process that only took 4 minutes to complete.

** Python Scripts **
The Python scripts are fairly basic (one to create and the other to delete the CockroachDB deployment). Do you plan to expand on them further, for example by:

  • allowing size changes to the DB disks, then rolling the nodes one at a time once replication has completed, so that the change involves no data loss and no outage
  • managing CockroachDB version upgrades gracefully, with no outage during deployment
  • being able to add more Kubernetes clusters at a later date
  • supporting Python 3: the script requires Python 2.7 and is not compatible with later versions, e.g. 3.7.4
  • recovering from partial deployments: currently you can’t simply rerun the script; you have to delete the deployed resources (e.g. secrets) from cluster 1 before starting a fresh install. I hit this when the context name for the 2nd cluster was not 100% correct.
  • adding a dry-run mode that creates the certificates and YAML files but doesn’t apply them. We have a pull-based deployment architecture, so it’s not always possible to use kubectl to push YAML files to either cluster.
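
To illustrate what I mean by a dry-run mode, here is a minimal sketch, not the way the current script works. Every command goes through one wrapper that either executes it or only records and prints it. Runner, apply_manifest, and the manifest file name are hypothetical, and the __future__ import is there so the same file runs under both Python 2.7 and Python 3.

```python
# Sketch of a dry-run wrapper; not taken from the existing setup.py.
# Runner, apply_manifest and the manifest file name are hypothetical.
from __future__ import print_function  # keeps the file valid on Python 2.7 and 3

import subprocess


class Runner(object):
    """Run commands, or only record and print them when dry_run is True."""

    def __init__(self, dry_run=False):
        self.dry_run = dry_run
        self.planned = []  # every command seen, in order

    def run(self, argv):
        self.planned.append(list(argv))
        if self.dry_run:
            print("DRY RUN: " + " ".join(argv))
            return
        subprocess.check_call(argv)


def apply_manifest(runner, context, manifest_path):
    """Apply a generated manifest to the cluster behind `context`."""
    runner.run(["kubectl", "apply", "-f", manifest_path, "--context", context])


if __name__ == "__main__":
    runner = Runner(dry_run=True)
    apply_manifest(
        runner,
        "gke_cockroach-shared_us-east1_cockroachdb1",
        "cockroachdb-statefulset.yaml",  # hypothetical file name
    )
```

With dry_run=True the YAML files could still be generated on disk and handed to a pull-based deployment tool, while nothing is pushed with kubectl.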

If this is already answered, feel free to point me in the right direction, as I haven’t read all of the documentation or the CockroachDB forum :)

** Multi region readme **
I will create a PR on the README listing my updates and recommended changes. Wow, 304 open PRs already.

Thanks
Jared

Hi Jared,

Glad it worked for you. I agree that the script is fairly basic, and I think you’ve provided a lot of really good feedback. Improving this script is definitely on my radar post-19.2 release, and I’ll take all of that into consideration. Another thing I’ve been thinking about is how to make this script work across different cloud providers, such as AWS or Azure.

I’ll also pass your feedback along to the docs team, who I’m sure are interested in how they can help with documenting disk resizes and version upgrades within a Kubernetes deployment.