Rolling join of new nodes to a cluster?

Hi, I am new to CockroachDB.
I’m wondering whether there is a practice for building a cluster that starts with one node and then joins a second and third node later.

It looks like when I start a new node, I need to provide the other nodes’ addresses, as below.

cockroach start --insecure --store=node2 --listen-addr=localhost:26258 --http-addr=localhost:8081 --join=localhost:26257,localhost:26258,localhost:26259 --background

I wonder whether it can be done by joining nodes one by one if we need more scalability later.
Thanks for your input.

Hi @itplayer,

The --join flag is required when you want a new node to join an existing cluster.

When you want to add new nodes to your cluster, just point the flag to a few of the nodes already in your cluster.
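For example, a fourth node could be started like this (just a sketch: the node4 store name, ports, and insecure flags are assumptions based on your example above, and the --join list points at two nodes that are already running):

cockroach start --insecure --store=node4 --listen-addr=localhost:26260 --http-addr=localhost:8083 --join=localhost:26257,localhost:26258 --background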

Thanks @mattvardi
So the first node doesn’t need to specify the --join flag or the other nodes’ addresses?

Hey @itplayer,

For starting a single-node cluster, I think the guidance here really sums up what is required (no --join flags are needed).

When you are ready to scale out to multiple nodes, you can provide the other nodes.
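As a rough sketch based on your insecure local example, the very first node can be started without --join at all (note that depending on your CockroachDB version you may instead use the dedicated cockroach start-single-node command, or run cockroach init after starting):

cockroach start --insecure --store=node1 --listen-addr=localhost:26257 --http-addr=localhost:8080 --background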

Thank you.
However, let’s say I have the first node, and then I set up a second node and start it with the --join flag pointing at node one’s address; that part is simple and clear.
Now, if I add a third node, I clearly need to bring it up with the --join flag listing the current nodes’ addresses as well. At that point, do I need to restart the previous two nodes so they know there are now three nodes forming the cluster? Or, after the third node joins, will all the nodes be notified and keep the cluster info even after any of them restart?

Thanks

After the third node joins, its gossip information will enter the cluster and all nodes will be aware of it 🙂
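If you want to double-check without restarting anything, you can ask any node for the current membership; this sketch assumes the insecure local ports from your example:

cockroach node status --insecure --host=localhost:26257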

When we start with 1 node and then add more nodes later, will rebalancing happen automatically?
Also, is there a restriction on the number of nodes in the cluster?
For example: can it go from 1 to 2 to 3 and so on, or does it have to go from 1 to 3 to 5 nodes? If 2 is allowed, how is fault tolerance achieved?

Thanks in advance.

Hi @bobk,

When we start with 1 node and then add more nodes later, will rebalancing happen automatically?

Yes, rebalancing will happen automatically.
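If you want to watch it happen, one way (assuming the insecure local cluster from earlier; the --ranges flag just adds range-count columns) is to check per-node range counts as nodes join and see them even out over time:

cockroach node status --ranges --insecure --host=localhost:26257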

Also, is there a restriction on the number of nodes in the cluster?

There is no restriction on the number of nodes in a cluster.

For example: can it go from 1 to 2 to 3 and so on, or does it have to go from 1 to 3 to 5 nodes? If 2 is allowed, how is fault tolerance achieved?

You can scale one node at a time; both even and odd numbers of nodes are supported. Generally, we discuss fault tolerance at the replica level. We recommend at minimum a three-node cluster to benefit from CockroachDB’s fault tolerance. Two nodes are allowed but not recommended: quorum requires a majority of replicas, which is why an odd number of replicas is needed. On a two-node cluster you will have to stay with 1x replication, and the loss of one node can lead to an unavailable cluster.

This section should explain what you’re looking for.

A three-node cluster will have a 3x replication factor, which means you can lose one node and still be able to serve reads and writes.
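If you’d like to confirm the replication factor your data ranges are using, something along these lines should work (a sketch against the insecure local setup; exact syntax can vary by version):

cockroach sql --insecure --host=localhost:26257 --execute="SHOW ZONE CONFIGURATION FOR RANGE default;"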

Thanks Matt. The link you posted lists some DB commands that need to be executed to enable replication after going from 1 node to 3 nodes.
Example:
ALTER RANGE default CONFIGURE ZONE USING num_replicas = 3;
ALTER RANGE system CONFIGURE ZONE USING num_replicas = 5; <===
ALTER database system CONFIGURE ZONE USING num_replicas = 5; <===
and so on.

Why 5 replicas (used in these examples) for a 3-node cluster?

Thanks in advance.

Five replicas is the default number of replicas for the system information, so there is no adjustment needed when you scale up to 5 nodes.

Users can scale up to 5 nodes for added storage and compute but don’t necessarily have to change the replication factor. The reason is that increasing the replication factor improves resiliency to failures but also introduces additional latency when writing to the database; see this overview of reads and writes.

However, system information is important, and we want to be able to survive a two-node failure if we have 5+ nodes, which is why 5x is the default. I believe 5x replication of system information on a cluster with fewer than 5 nodes is a special case, and these ranges won’t appear as under-replicated.
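If you’d like to see that 5x default for yourself, a quick check along these lines (again a sketch against the insecure local setup; syntax can vary by version) should report num_replicas = 5 for the system range:

cockroach sql --insecure --host=localhost:26257 --execute="SHOW ZONE CONFIGURATION FOR RANGE system;"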