How to accelerate rebalance speed for new nodes

I have a cluster of 3 nodes. When I add 1 new node to the cluster, rebalancing
is slow. The details are:

  • CockroachDB v2.0.2
  • 3 nodes with 100,000,000 rows, about 1900 ranges
  • add node 4
  • it takes 10 hours until all 4 nodes are balanced

How can I speed up this rebalance?

Once all 4 nodes were balanced, I killed node 4 to test
the rebalance speed when the cluster shrinks. It took
about 1 hour.

Why is rebalancing so much faster when the cluster shrinks?

Hey @haomiao,

There are two cluster settings that govern the rebalancing of data across the cluster:

  • kv.snapshot_recovery.max_rate, and
  • kv.snapshot_rebalance.max_rate.

The first limits the rate at which under-replicated data is up-replicated, and defaults to 8 MiB/s. The second limits the rate at which data is rebalanced onto a new node, and defaults to 2 MiB/s. Killing a node leaves its ranges under-replicated, so that case is governed by the higher recovery limit, which is why rebalancing sped up substantially after you killed node 4. More details on cluster settings are available in the documentation here.

Both settings can be safely increased if there isn’t a heavy workload running in the cluster. It’s worth testing to see the impact on performance when increasing the limits.
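
As a rough sketch of what that might look like from a SQL shell: the statements below check the current limits and then raise them. The values '8MiB' and '16MiB' are illustrative assumptions only, not recommendations, so pick numbers that suit your hardware and workload, and change them while watching foreground latency.

```sql
-- Inspect the current limits before changing anything.
SHOW CLUSTER SETTING kv.snapshot_rebalance.max_rate;
SHOW CLUSTER SETTING kv.snapshot_recovery.max_rate;

-- Raise the limit that applies when data is rebalanced onto a new node.
-- '8MiB' is an example value (assumption), up from the 2 MiB/s default.
SET CLUSTER SETTING kv.snapshot_rebalance.max_rate = '8MiB';

-- Optionally raise the limit that applies to under-replicated data.
-- '16MiB' is an example value (assumption), up from the 8 MiB/s default.
SET CLUSTER SETTING kv.snapshot_recovery.max_rate = '16MiB';

-- Once rebalancing has finished, put the defaults back explicitly.
SET CLUSTER SETTING kv.snapshot_rebalance.max_rate = '2MiB';
SET CLUSTER SETTING kv.snapshot_recovery.max_rate = '8MiB';
```

Cluster settings apply to the whole cluster, so you only need to run these against one node.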