Design considerations for hosting thousands of range replicas per node

Hi, I’d like to reason about the choice of small (MB) vs. large (GB) ranges at the per-host level, and it would be great if you could share your thoughts.

Suppose we have a fixed number of hosts. Since CockroachDB has a small range size (64MB), a single host can have thousands of ranges, which form thousands of raft groups with other nodes. With a small range size, range repair and movement are faster, especially if snapshots are streamed over the network. However, what is the performance (throughput) implication of having small ranges in this scenario? Our preliminary experimental results with raft are that the WAL sync is expensive, and running multiple raft groups per node does not improve per-node throughput at all. I’m not sure if you had a similar experience when benchmarking CRDB. If so, can I say that hosting thousands of replicas per node is mostly about reducing the total number of nodes? This affects many important design decisions for me, such as multi-raft.

The biggest concern with range size is that sometimes (e.g. while receiving snapshots) we need to hold an entire range’s data in memory at once, so having ranges that are too big means you have bigger swings in memory consumption. However, we have already converted some of these operations (e.g. sending snapshots) to use a streaming interface, and once we can use streaming interfaces everywhere it will be feasible (and probably faster) to use fewer, larger ranges.

Note that we have batching mechanisms in rocksDBBatch.Commit that allow multiple ranges to be flushed in a single fsync, so multiple ranges shouldn’t compete too much for the same disk I/O resources.
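As a sketch of what this kind of batching looks like in general (hypothetical names, not CockroachDB’s actual implementation): writers from many ranges enqueue their raft-log batches, and a single syncer goroutine makes everything queued durable with one simulated fsync, so the number of fsyncs can be far smaller than the number of per-range writes.

```go
package main

import (
	"fmt"
	"sync"
)

// pendingWrite is one range's raft-log batch waiting to become durable.
type pendingWrite struct {
	done chan struct{}
}

// groupCommitter is a hypothetical sketch of group commit: many ranges
// enqueue writes, and one syncer goroutine flushes everything pending
// with a single (simulated) fsync.
type groupCommitter struct {
	mu     sync.Mutex
	cond   *sync.Cond
	queue  []pendingWrite
	fsyncs int // simulated fsync calls issued by the syncer
	writes int // per-range writes made durable
}

func newGroupCommitter() *groupCommitter {
	g := &groupCommitter{}
	g.cond = sync.NewCond(&g.mu)
	go g.syncLoop()
	return g
}

// commit enqueues one range's write and blocks until it has been fsynced.
func (g *groupCommitter) commit() {
	done := make(chan struct{})
	g.mu.Lock()
	g.queue = append(g.queue, pendingWrite{done: done})
	g.cond.Signal()
	g.mu.Unlock()
	<-done
}

func (g *groupCommitter) syncLoop() {
	for {
		g.mu.Lock()
		for len(g.queue) == 0 {
			g.cond.Wait()
		}
		batch := g.queue
		g.queue = nil
		g.mu.Unlock()

		// One fsync covers every write that queued up while the previous
		// sync was in flight, regardless of which range it came from.
		g.fsyncs++
		g.writes += len(batch)
		for _, w := range batch {
			close(w.done)
		}
	}
}

// run issues one raft-log write per range for n ranges (one goroutine each)
// and reports how many writes became durable and how many fsyncs that took.
func run(n int) (writes, fsyncs int) {
	g := newGroupCommitter()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.commit()
		}()
	}
	wg.Wait()
	return g.writes, g.fsyncs
}

func main() {
	writes, fsyncs := run(1000)
	// writes is always 1000; fsyncs is typically far smaller.
	fmt.Printf("writes=%d fsyncs=%d\n", writes, fsyncs)
}
```

The exact fsync count depends on scheduling, but every write is covered by some fsync, and concurrent writers from different ranges share syncs instead of each paying for their own.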

Suppose a host has 100k ranges and is receiving writes for each range. It seems to me that the batching CRDB does for raft log writes happens within a single range. So it seems like you would have to do 100k fsyncs to write the raft log for these 100k requests?

For example, suppose there are two writes, one for range 1 and one for range 2. Those writes map to two distinct etcd/raft Nodes. When you do the batch commit I linked to, those will be separate invocations: one for the batch with the range 1 prefix and one for the batch with the range 2 prefix. If this is true, how do you avoid the performance hit of all these ranges invoking fsync independently of each other?



As an experiment, I did the following:

echo "CREATE DATABASE benchmark;" | ./cockroach sql
echo "CREATE TABLE kv_store (key BYTES, value BYTES, PRIMARY KEY(key));" | ./cockroach sql
for i in $(seq 0 255); do
  tmpi=$(printf '%02x' $i)
  for j in $(seq 0 255); do
    tmpj=$(printf '%02x' $j)
    echo "ALTER TABLE benchmark.kv_store SPLIT AT VALUES (b'\x$tmpi\x$tmpj');" | ./cockroach sql
  done
done

I then did a lot of inserts of keys drawn from a uniform distribution, and I noticed a throughput drop after a certain point (it looks like throughput went down by more than 80% after creating 40k ranges). I would expect that introducing 2^16 ranges under a random key distribution would create the fsync problem I alluded to. Can you explain how you are able to batch raft log commits with fsync across different etcd/raft Node instances?

Also, I should note that I explicitly set:

SET CLUSTER SETTING kv.raft_log.synchronize = true;

It’s only with this setting that I would expect much worse performance with 2^16 ranges.

Here’s a graph showing how throughput decreases as a function of the number of ranges. Originally the QPS is 2700/s with 10 ranges; with 2^16 ranges, throughput has decreased to 560/s, a drop of ~80%.

This code in rocksDBBatch.Commit batches writes across ranges.

For clarity: rd.Entries will only contain the entries within a single replicated range (a.k.a. split), right, since you are using one etcd rawNode per replicated range? So each new rocksDBBatch only contains entries from a single split?

Can you point me to the code where it’s clear that rocksDBBatch actually picks up entries to fsync from other splits? Looking at the code, it seems like each batch deals only with entries in a single split, rather than fsyncing across splits.

Each rocksDBBatch contains data from a single range, but in the code I linked above, the actual commit to disk is coordinated through r.parent, which is shared across all ranges on the store.
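In other words, the structure is roughly the following (a simplified sketch with hypothetical names, standing in for the rocksDBBatch/r.parent relationship): each batch belongs to exactly one range, but Commit only hands its data off to a parent engine shared by every replica on the store, and it is the parent’s sync that pays for the fsync, once, covering every range’s batch handed off in that window.

```go
package main

import "fmt"

// parentEngine plays the role of the shared parent (r.parent in the linked
// code): every range's batch on a store hands its data here, and durability
// is achieved with a single simulated fsync for all of them.
type parentEngine struct {
	pending []string // data handed off by per-range batches
	fsyncs  int      // simulated fsync calls
}

// sync makes everything handed off so far durable with one fsync.
func (p *parentEngine) sync() {
	if len(p.pending) == 0 {
		return
	}
	p.fsyncs++ // one fsync covers batches from many ranges
	p.pending = nil
}

// rangeBatch holds writes for exactly one range.
type rangeBatch struct {
	rangeID int
	parent  *parentEngine // shared across all ranges on the store
	writes  []string
}

func (b *rangeBatch) put(key string) { b.writes = append(b.writes, key) }

// Commit hands the batch to the shared parent; durability is deferred to
// the parent's next sync, so many ranges end up sharing it.
func (b *rangeBatch) Commit() {
	b.parent.pending = append(b.parent.pending, b.writes...)
}

func main() {
	store := &parentEngine{}
	b1 := &rangeBatch{rangeID: 1, parent: store}
	b2 := &rangeBatch{rangeID: 2, parent: store}
	b1.put("/range1/key")
	b2.put("/range2/key")
	b1.Commit()
	b2.Commit()
	store.sync() // both ranges' batches become durable with a single fsync
	fmt.Printf("fsyncs=%d\n", store.fsyncs)
}
```

So even though each batch is per-range, the fsync boundary is per-store, which is what keeps 100k ranges from turning into 100k independent fsyncs.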