Network - router redundancy?


(Benny) #1

A quick question.

Is it possible with CockroachDB to include more than one network range in the --join parameter?

For example:

Node 1: Public IP 200.x.x.x, Private physical network 10.x.x.1
Node 2: Public IP 200.x.x.x, Private physical network 10.x.x.2

Now, if the router for the private network fails, none of the nodes can communicate, and de facto all the servers will go out of sync or stop responding (I'm not sure what happens in a situation like that, but it sounds bad).

Our idea was to have a secondary physical router that connects to more physical network cards on the servers.

Node 1: Public IP 200.x.x.x, Private 10.x.x.1, Private 11.x.x.1
Node 2: Public IP 200.x.x.x, Private 10.x.x.2, Private 11.x.x.2

If we lose the 10.x.x.x router, the system falls back on the secondary 11.x.x.x network. It also serves as a backup if a motherboard or network card fails, since the node can fall back on another physical network port.

But it's unclear from the manual whether nodes can only --join a single network range, or whether they can fall back on multiple ranges and still figure out that Node 1 is reachable at both 10.x.x.1 and 11.x.x.1. So if a router fails or needs to be replaced, would the other nodes know they can find Node 1 on either of the two, or potentially more, IPs?


(Ben Darnell) #2

If each node has two network addresses that you want to use, you can include both addresses in a DNS name for each machine (and then use that DNS name in the --join and --advertise-host flags for each node).
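
To illustrate the idea of one DNS name carrying both addresses, here is a minimal Python sketch of how a client could resolve a name to all of its addresses and fall back to the next one when the first is unreachable. It is not how CockroachDB connects internally; the function names, the example addresses, and the injectable `getaddrinfo`/`connect` parameters are assumptions made for illustration. Port 26257 is CockroachDB's default port.

```python
import socket


def resolve_all(hostname, port, getaddrinfo=socket.getaddrinfo):
    """Return every distinct IP the name resolves to, in resolver order."""
    seen = []
    for _family, _type, _proto, _canon, sockaddr in getaddrinfo(
        hostname, port, type=socket.SOCK_STREAM
    ):
        ip = sockaddr[0]
        if ip not in seen:
            seen.append(ip)
    return seen


def first_reachable(addresses, port, timeout=2.0, connect=socket.create_connection):
    """Try each address in order and return the first that accepts a TCP
    connection, or None if none of them do."""
    for ip in addresses:
        try:
            conn = connect((ip, port), timeout=timeout)
        except OSError:
            continue  # this network path is down; try the next address
        conn.close()
        return ip
    return None
```

With a hypothetical name such as node1.example.internal that has records for both the 10.x and 11.x addresses, `first_reachable(resolve_all("node1.example.internal", 26257), 26257)` would return the 11.x address whenever the 10.x path is down.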

Note that you don’t need to give each node two physical connections to be able to survive the failure of a router. You can set up multiple routers for high availability with protocols like VRRP. Switches are another story: you do need multiple physical connections to survive switch failures. But as long as your machines are distributed across different switches, you may not need to handle this kind of failure - just let CRDB’s built-in fault tolerance handle it. Just include enough detail in your --locality flag to tell CRDB which nodes share a switch.
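
As a rough sketch of what "enough detail" in --locality could look like, the Python helper below builds the flag value from key=value tiers ordered from broadest to narrowest. The tier names (`datacenter`, `rack`, `switch`) and values are hypothetical; CockroachDB accepts arbitrary keys, so pick whatever matches your topology.

```python
def locality_flag(tiers):
    """Build a --locality flag from (key, value) pairs, broadest tier first."""
    parts = []
    for key, value in tiers:
        # Keys and values are joined with '=' and ',', so reject those
        # characters (and spaces) to keep the flag unambiguous.
        if not key or not value or any(c in key + value for c in ",= "):
            raise ValueError(f"invalid locality tier: {key!r}={value!r}")
        parts.append(f"{key}={value}")
    return "--locality=" + ",".join(parts)


# Hypothetical node behind switch "sw-a" in rack "r1" of datacenter "dc1".
print(locality_flag([("datacenter", "dc1"), ("rack", "r1"), ("switch", "sw-a")]))
# prints: --locality=datacenter=dc1,rack=r1,switch=sw-a
```

Nodes behind the same switch would then share the same `switch` value, which is the information CRDB needs to avoid placing all replicas of a range behind one switch.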