Couple questions about CRDB


#1

Hi there.

I’ve started looking into CRDB and I’m loving it.
In the process, two questions came to my mind:

  • Why do I need to provide the addresses of 3-5 other nodes when adding another node to a cluster using --join (according to the documentation)? Why not just one?
  • Does automatic rebalancing only look at replica count, or does it also take disk space and current load into account (imagine I have servers with different disk capacities)?

(Jesse) #2

Hi @marigonzes,

Great to hear that you’re enjoying CockroachDB!

  • On your first question about the --join flag, we know our docs are missing details there, and we have this issue to fill the gap. As you’ll read, providing multiple join addresses makes it much more likely that at least one of the target nodes will be reachable and, thus, the new node will be able to join the cluster successfully. If you use only one join address, and that target node happens to be down, your new node won’t be able to successfully join the cluster.

  • On your second question about rebalancing: in 2.0, rebalancing is based purely on replica count. However, in 2.1, rebalancing of replicas will factor in load and disk space as well. Range leaseholders will also be rebalanced by load. Our docs will be updated to reflect these changes within the next few weeks. You can track that work here, if you like.
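To put a rough number on the first point about --join, here is a minimal sketch of the reasoning, not anything CockroachDB itself computes. It assumes each listed join target is down independently with the same probability; the function name and the 5% figure are made up for illustration.

```python
def join_success_probability(num_join_addresses: int, p_node_down: float) -> float:
    """Chance that at least one listed join target is reachable.

    Assumption (illustrative only): every target is down independently
    with the same probability p_node_down.
    """
    if num_join_addresses < 1:
        raise ValueError("need at least one join address")
    if not 0.0 <= p_node_down <= 1.0:
        raise ValueError("p_node_down must be between 0 and 1")
    # The join fails only if every listed target is down at once.
    return 1.0 - p_node_down ** num_join_addresses


if __name__ == "__main__":
    # Hypothetical 5% chance that any given target is down.
    for n in (1, 3, 5):
        print(n, join_success_probability(n, 0.05))
```

Under these assumptions, going from one address to three already shrinks the chance of a failed join from 5% to well under 0.1%, which is the intuition behind listing several nodes.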

Hope that helps.

Best,
Jesse


#3

Thank you for your answer, @jesse.
Looking forward to the 2.1 release :smiley:


(Jesse) #4

My pleasure, @marigonzes. I was a bit too quick, however. v2.1 will have load-based leaseholder rebalancing and very likely load-based replica rebalancing, but not size-based replica rebalancing.
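To illustrate why the choice of signal matters, here is a toy sketch contrasting count-based and load-based selection. It is not CockroachDB's actual allocator: the `Store` type, the `qps` metric, and the "move from the highest to the lowest" rule are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Store:
    name: str
    replica_count: int
    qps: float  # hypothetical load metric (queries per second)


def pick_move(stores: List[Store], key: Callable[[Store], float]) -> Tuple[str, str]:
    """Return (source, target): the stores ranked highest and lowest by `key`."""
    source = max(stores, key=key)
    target = min(stores, key=key)
    return source.name, target.name


# Made-up cluster: s2 holds the most replicas, but s1 serves the most traffic.
stores = [
    Store("s1", replica_count=100, qps=900.0),
    Store("s2", replica_count=120, qps=200.0),
    Store("s3", replica_count=80, qps=400.0),
]

by_count = pick_move(stores, key=lambda s: s.replica_count)
by_load = pick_move(stores, key=lambda s: s.qps)
```

In this made-up cluster, the count-based rule moves work off s2, while the load-based rule moves it off s1, so the two signals can disagree about which node is "too full".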