Is it possible to configure async replication?

From my understanding after reading the docs, CockroachDB uses synchronous replication, and that causes quite a bit of delay when updating tables on the “secondary” nodes (the nodes except for the one on which cockroach init was run on).

For my use case, I don’t need data to be synchronized immediately, and I have five nodes in separate continents. Is there a way to configure async replication?

Hi @stevewatson301,

You’re right that CockroachDB uses synchronous replication. More specifically, for writes, CockroachDB uses a consensus protocol called Raft to guarantee that, for a given piece of data, a majority of replicas of that data (2 of 3, by default) agree before the change is committed. Since each replica is always on a different node (for resiliency to failure), network latency is a factor. However, there are various SQL and operational techniques to optimize performance. You might be interested in our Performance Tuning tutorial for more insights along those lines.

Replication, by the way, has nothing to do with which node received the cockroach init request to initialize the cluster. For each piece of data, rather, there is a replica “leader” and replica “followers”. That perf tuning tutorial explains this more. The Replication Layer page in our architecture docs might also be helpful.

To address your use case, there is currently no way to configure CockroachDB to be “eventually consistent” like some NoSQL databases. It’s not designed for that. Instead, you have two options, I think. You can optimize the performance of reads and writes as described in that perf tuning tutorial and retain the fault tolerance benefits of CockroachDB. Or you can disable replication entirely by running this command:

Insecure cluster:

cockroach zone set .default --insecure --disable-replication --host=<address of any node>

Secure cluster:

cockroach zone set .default --certs-dir=<path to certs directory> --disable-replication --host=<address of any node>

It’s important to note that, if you disable replication, and have only 1 copy of each piece of data, you completely lose the fault tolerance benefits of CockroachDB. For example, in this case, if the node containing data for a specific table goes down, that data becomes completely unavailable, whereas in a replicated scenario, as long as a majority of the nodes with that table’s data are online, all is well.

I hope this helps. Please let me know if you have follow-up questions.

Best,
Jesse

Hi @jesse,

Thank you for the reply. My use case is for an API where users are credited with a certain number of API calls/month, and there are geographically distributed nodes. A request to any one of the nodes causes a decrement in the credit. So, as you can tell, turning off replication won’t be appropriate at all :stuck_out_tongue:

Is there some documentation which I can use to get the best performance in such a scenario?

Hey @stevewatson301,

So there are a couple things to keep in mind: first, as Jesse noted, the fault tolerance benefits we provide require a cluster of at least three nodes. Second, due to the nature of how we guarantee that tolerance, you need nodes to reach consensus before writes are committed.

One option that might be worth pursuing if you want fault tolerance, lower latency, local data storage, and eventual consistency across regions would be to run five separate clusters and use change data capture. You could also increase the number of nodes within each DC and use localities as outlined in Jesse’s performance doc above.

Hope that helps. Let us know if you have further questions.

Hello, are there any changes on this subject since 2018? I have a similar, but even simpler case: I need to maintain a read only replica in another country (for legal purposes) and some replication lag is acceptable but the “slave cluster” must be capable of disconnecting from the “master cluster” and becoming independent. The CDC based solution is complicated (e.g. it is rather hard to monitor failures and restart after failure). Are there other options? Also can you please tell more about the “increasing number of nodes and using localities” ? Thanks.