Multi-Datacenter Fast Read Setup

We’re considering crdb for a workload where we desire fast, consistent reads but slow writes are perfectly acceptable, I’m planning to test crdb but I’d like to make sure create a good setup for a successful test. I’ve outlined what I believe to be the behavior of crdb in this scenario, but please correct me if I’m wrong.

There will be one datacenter in each of three continents (Singapore, Miami and Amsterdam). Read rate will be 10x, 100x and 100x write rate (for production use, we’ll be in the 100x range, but as long as I’m testing, might as well test a range of performance).

I plan to start 2 nodes in each datacenter, with a replication factor of 3. Each node will have a --locality={SIN,SEA,AMS} set. My understanding is that crdb will place one copy of each data in each datacenter. We’ll also set the raft-tick-interval flag to 150% of the 95th percentile ping time in the cluster.

So when reading, the read should be fully satisfied from the crdb instance in the local datacenter, providing a fast read. Writes, on the other hand, will be no faster than the slowest inter-datacenter round trip time.

Are there any other settings which will improve this deployment? Is my assertion that a local node can satisfy the read without having to go out of the datacenter correrct (that is, it may need to talk to other nodes in the same datacenter, but it shouldn’t have to make a call to another datacenter)? Finally, in production, we’ll need more datacenters (5 at the moment and growing). Do we need to keep the number of datacenters odd (potentially by running “MIA-A”/“MIA-B”, for example)?

Reads are served by the node that holds the range lease for the given piece of data. We’ve recently introduced some optimizations to ensure that these leases are held by nodes that are close to the source of traffic, so as long as your traffic patterns are such that certain pieces of data are accessed from certain locations (or if your access patterns follow the sun), you should see most of your reads served from local replicas. This is all a little cutting edge and we’re still tweaking the heuristics.

Do we need to keep the number of datacenters odd (potentially by running “MIA-A”/“MIA-B”, for example)?

No, the number of datacenters does not need to be odd. However, as you expand beyond 3 datacenters you may find that it gets more complicated to configure things in a way that both balances the load and keeps the data where you want it. As long as you always have three regions then it should do the right thing (set the --locality flag to something like --locality=region=eu,dc=ams and it will recognize the hierarchy), but this is not something that has seen much testing so far.