A couple of questions:

  1. I have 3 servers with identical locality. Replication factor is set to 3. I added a 4th server with different locality. Should some of the data be moving onto the 4th server? From what I can tell, this is not happening.

  2. Can the join address be the haproxy load balancer of the cluster?

@pooper

For (1), it sounds to me like your data should be moving. It can take a
while to finish rebalancing but the process should start within a few
minutes. Do you see all 4 nodes in the Admin UI? How much data do you have
in the cluster? If it’s small, it may already be within our tolerances for
“sufficiently balanced”. Have you set any replication zone constraints
(https://www.cockroachlabs.com/docs/stable/configure-replication-zones.html)?

For (2), we initially hoped something like that could work (and it tends to
in small clusters), but in large clusters there is a chance of ending up
with disconnected islands of nodes that never find each other. The best
practice is to pick 3-5 nodes and use that same list as the join targets on
all nodes.
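
As a rough sketch of the join-target advice above, the snippet below builds one `--join` value from a fixed list of nodes so that every node can be started with the same list. The hostnames are made up for illustration, and 26257 is assumed to be the inter-node port; substitute your own addresses.

```python
# Hypothetical hostnames: replace with 3-5 stable nodes from your own cluster.
JOIN_TARGETS = [
    "crdb-1.example.internal",
    "crdb-2.example.internal",
    "crdb-3.example.internal",
]
PORT = 26257  # assumed inter-node port


def join_flag(targets, port=PORT):
    """Return a --join flag listing every target as host:port, comma-separated."""
    if not targets:
        raise ValueError("need at least one join target")
    return "--join=" + ",".join(f"{host}:{port}" for host in targets)


# The same flag value would be passed to `cockroach start` on every node.
print(join_flag(JOIN_TARGETS))
```

The point of generating the flag once is that all nodes, including a newly added 4th one, end up with an identical list of real node addresses rather than the load balancer address.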

I see all 4 nodes in the admin UI. I haven’t changed any replication zone settings from the default.

I have not added much data so that’s probably it.

Here’s the output of node status: https://pastebin.com/raw/BZh3Zg2y

@pooper can you also include the contents of visiting
/_status/allocator/node/1 from one of your nodes? Thanks!
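
If a browser is not handy, the same page can be fetched from a script. This is only a sketch: it assumes the node serves its HTTP endpoints on port 8080 over plain `http` (an insecure test cluster), and the helper names are made up here. A secure cluster would need `https` and most likely authentication, which this does not handle.

```python
import urllib.request


def allocator_url(host, node_id, http_port=8080, scheme="http"):
    """Build the URL for the allocator status page of one node.

    http_port and scheme are assumptions; adjust them to match your setup.
    """
    return f"{scheme}://{host}:{http_port}/_status/allocator/node/{node_id}"


def fetch_allocator_report(host, node_id, http_port=8080, timeout=5.0):
    """Return the raw response body as text, without interpreting it."""
    url = allocator_url(host, node_id, http_port)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

For example, `fetch_allocator_report("localhost", 1)` would request the path mentioned above from a node running on the local machine.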

Sure, here you go: https://pastebin.com/raw/zhvABz0q

I figured out why the data was not being rebalanced. Port 26257 was closed! After opening it in the firewall, data started being moved over.
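
For anyone hitting the same thing, a quick way to spot a blocked port is to try a plain TCP connection to port 26257 on each of the other nodes. The sketch below does that with Python's standard `socket` module; the function names are made up for this example, and the host list is whatever your own node addresses are.

```python
import socket


def can_connect(host, port=26257, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_peers(hosts, port=26257, timeout=3.0):
    """Return the hosts that could not be reached on the given port."""
    return [h for h in hosts if not can_connect(h, port, timeout)]
```

Run it from each node toward the others, since firewall rules can block one direction only. A successful connection only shows that the port is open, not that the node is otherwise healthy.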