Durability and replication zone configurations/tuning

I’ve been reading over the replication zone doc.

Suppose I use the three datacenter example but I have 100 nodes in each datacenter - does this mean that the raft group participants will always be 3 nodes - one in each datacenter? Would node n in datacenter 1 necessarily own the same ranges as node n in datacenter 2, or would the range assignment be more or less random within a datacenter?

Suppose I have one datacenter with two availability zones and a second datacenter that is 100ms away. Would it be possible to configure synchronous replication to occur within the single datacenter and tune the second datacenter to lag by up to some threshold via asynchronous replication?

If you recall the interesting white paper on Cloud Spanner, they essentially claim to be able to build a CA system because they control the network.

How would you recommend deploying CRDB for low latency in the scenario I described above (two datacenters that are far apart)?

That’s what will happen if you attach metadata to each node so that it knows which datacenter it’s in: with three datacenters and the default of three replicas, each range will get one replica in each datacenter. To do so, use the --locality flag when starting each process, as described in our docs. If you don’t let each server know its locality, you’ll get a fairly random distribution of replicas across the datacenters.
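For example (a minimal sketch; the hostnames, join addresses, and datacenter names are placeholders, and other flags like certificates are elided):

```
# Tag each node with the datacenter it lives in so that replicas
# of each range get spread one-per-datacenter.
cockroach start --locality=datacenter=dc1 --join=node1.dc1.example.com:26257
cockroach start --locality=datacenter=dc2 --join=node1.dc1.example.com:26257
cockroach start --locality=datacenter=dc3 --join=node1.dc1.example.com:26257
```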

If you’d like a different sort of behavior, you can try configuring replication zones.
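For instance, here’s a sketch of pinning one table’s replicas to a single datacenter so its writes never wait on the remote one (`mydb.users` and the datacenter name are hypothetical, and the exact zone-config syntax varies across versions):

```
# Constrain all replicas of mydb.users to datacenter dc1.
echo 'constraints: [+datacenter=dc1]' | cockroach zone set mydb.users -f -
```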

It’ll be more or less random within a datacenter. Node n in one datacenter certainly won’t own the same ranges as node n in another.

Writes can return as soon as a majority of replicas have persisted them to disk. If two of your replicas are close together, then writes to ranges whose raft leader is in one of those two datacenters can commit at low latency, with the remote replica essentially following along asynchronously. If the raft leader is in the distant datacenter, commit times will be correspondingly higher.
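As a rough back-of-the-envelope model (a simplification that ignores disk sync and processing time): with a three-replica group, the leader needs one acknowledgment besides its own to reach a majority, so

```
t_{\text{commit}} \approx \min_{r \neq \ell} \mathrm{RTT}(\ell, r)
```

where \ell is the raft leader. With the two nearby replicas about 1ms apart and the remote one 100ms away, that works out to roughly 1ms when the leader is in the close pair and roughly 100ms when it’s in the distant datacenter.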

We’ve recently added functionality to move leases for each raft group toward where the most requests are coming from in high-latency clusters. For now the feature is guarded by an environment variable, but it will soon be made the default. You can read about the feature in its RFC, and if you’d like to try it, just set the environment variable COCKROACH_ENABLE_LOAD_BASED_LEASE_REBALANCING="true" before starting up your servers.
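Concretely, on each node (a sketch; start flags elided as above):

```
# Opt in to load-based lease rebalancing before starting the process.
export COCKROACH_ENABLE_LOAD_BASED_LEASE_REBALANCING="true"
cockroach start --locality=datacenter=dc1 --join=node1.dc1.example.com:26257
```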

I’d recommend assigning appropriate localities to each node and enabling load-based lease rebalancing, both as described above. If that doesn’t work well for your use case, I’d love to hear more about it. There’s also a tuning parameter you can experiment with if the default behavior doesn’t suit you (though hopefully you’ll still give us feedback to help improve it).

Best,
Alex
