Performance vs replication factor

Hi, is there a matrix of replication factor vs. write performance? I’m guessing a higher replication factor has a bigger negative impact on performance, but I’m not sure by how much.

Another question: some of my data doesn’t strictly need to be replicated 3 times. I can afford to lose it on ‘rare’ occasions because it can be recreated. For this kind of data, does it buy me anything to configure a replication factor of only 2 or even 1?

Thanks.

The performance impact depends on your workload and resources. Data is written to all replicas in parallel, so the impact on latency is usually small. However, more data has to be written, so if you’re doing enough writes to saturate your network and/or disk, you can expect a reduction in throughput proportional to the replication factor. With 5 replicas, you’ll do roughly 67% more disk writes than with 3 replicas (5 copies instead of 3) and twice as much network traffic (the leader sends each write to 4 other replicas instead of 2).
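
To make the arithmetic concrete, here is a rough Python sketch. It assumes a simplified model in which every replica persists each write once and the leader forwards one copy to each other replica; the function names are made up for illustration and are not part of CockroachDB.

```python
from typing import Tuple


def write_cost(replicas: int) -> Tuple[int, int]:
    """Return (disk_writes, network_copies) for one logical write.

    Simplifying assumption: each replica writes the data to disk once,
    and the leader sends one copy to every other replica.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return replicas, replicas - 1


def relative_cost(new: int, old: int) -> Tuple[float, float]:
    """Ratio of disk writes and network copies when going from old to new replicas."""
    new_disk, new_net = write_cost(new)
    old_disk, old_net = write_cost(old)
    if old_net == 0:
        # A single replica sends nothing over the network, so the ratio is undefined.
        raise ValueError("no replication traffic with a single replica")
    return new_disk / old_disk, new_net / old_net


disk_ratio, net_ratio = relative_cost(5, 3)
print(f"disk writes: {disk_ratio:.2f}x, network traffic: {net_ratio:.2f}x")
# prints "disk writes: 1.67x, network traffic: 2.00x"
```

These ratios only translate into lost throughput once the disk or network is actually saturated; below that point the extra copies mostly happen in parallel.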

At this time read performance is unaffected by the replication factor, since all reads are served by the leader of each range. In the future it may be possible to use other replicas for reads, in which case increasing the replication factor could increase read performance.

CockroachDB does not currently tolerate data loss well. Even if you could recreate some data stored with a replication factor of 1 or 2, the cluster would get into a bad state if the table were lost, and it wouldn’t let you drop the table and recover. So for now it’s important to set the replication factor on all your tables high enough to avoid data loss.
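
As a rough guide to what "high enough" means, the sketch below assumes majority-quorum replication (CockroachDB replicates each range with Raft, which needs a majority of replicas to make progress). The helper name is hypothetical, and the numbers describe how many replicas can be down while the range stays available, not a guarantee about durability.

```python
def tolerated_failures(replicas: int) -> int:
    """Replicas that can be down while a majority is still reachable.

    Assumes majority-quorum replication: a range needs
    floor(replicas / 2) + 1 live replicas to keep serving writes.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    quorum = replicas // 2 + 1
    return replicas - quorum


for n in (1, 2, 3, 5):
    print(f"replication factor {n}: survives {tolerated_failures(n)} failure(s)")
```

Under this assumption, a replication factor of 2 survives no more failures than a replication factor of 1, which is one reason even-numbered factors are rarely worth configuring.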
