Cluster is failing to start with 2/3 nodes okay

Hi, I need some urgent help.
I was attempting to scale down when I accidentally deleted one of the persistent volumes. I restarted the node expecting the DB to start up normally and copy over the replicated data, but I am just getting the following errors in the two pods:
W210510 23:51:24.957149 153 storage/replica_range_lease.go:554 can't determine lease status due to node liveness error: node not in the liveness table
github.com/cockroachdb/cockroach/pkg/storage.init.ializers
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/node_liveness.go:44
runtime.main

W210510 23:51:23.929602 213 storage/node_liveness.go:484  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s:
  - aborted in distSender: context deadline exceeded

W210510 23:51:20.742986 162 storage/store.go:1530  [n1,s1,r6/1:/Table/{SystemCon…-11}] could not gossip system config: [NotLeaseHolderError] r6: replica (n1,s1):1 not lease holder; lease holder unknown

Hi! Did you manage to resolve this?
Like you say, deleting a persistent volume generally should not be a problem - it’s equivalent to one of the nodes disappearing. New nodes are supposed to be able to join the cluster and data should be upreplicated to them.
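
One rough way to watch that upreplication happen from SQL (the crdb_internal tables are internal and their schemas vary between versions, so treat this as a sketch):

-- Ranges that have not yet been brought back up to three replicas.
SELECT range_id, array_length(replicas, 1) AS replica_count
  FROM crdb_internal.ranges
 WHERE array_length(replicas, 1) < 3;

The count should drop to zero once the replacement node has received copies of everything.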

The “can’t determine lease status due to node liveness error: node not in the liveness table” error should be transient; it should clear a few seconds after the new node is started. You’re probably running 20.1 or an older version, since I believe that starting in 20.2 this error should not be possible any more. The other errors could have different causes, but they’re most likely related to the first.
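
If it helps, you can also check the liveness table directly from SQL to see whether the restarted node ever registered. Again, this is an internal table and only a sketch:

-- The restarted node should appear here within a few seconds of starting;
-- if it never does, the "node not in the liveness table" error will persist.
SELECT * FROM crdb_internal.gossip_liveness;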

I am not running the latest version. I completely failed to recover from the error, even after allowing a long time for self-correction. I eventually ended up locating and manually copying over data from another PV that was part of the original cluster I had scaled down.

The cluster came back, but I have had 23 under-replicated ranges for many days now. Looking at the diagnostics, the ranges seem to have a leader and 2 followers, i.e. a quorum. Here is an example, in case you can help me figure out what's wrong (or whether I'm misinterpreting it) and how to correct the under-replication.

So regarding the 23 under-replicated ranges, I had to decommission the nodes in addition to scaling down.

I’m confused about exactly what happened to you, but I believe the under-replicated report you got is due to the fact that for system ranges (as opposed to user ranges), the default desired replication factor is 5x (not 3x). As long as the cluster has only 3 nodes (or 4, I think), this doesn’t result in any range being reported as under-replicated. But if the cluster gets a 5th node (even a dead one), then those ranges start being reported as under-replicated until the number of nodes in the cluster goes back down below 5 (through decommissioning).
You can set the replication factor for system ranges with ALTER RANGE system CONFIGURE ZONE USING num_replicas = 3;.
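
To spell that out a bit (and hedging from memory: the meta and liveness ranges carry their own zone configs, which I believe also default to 5x), you can inspect the current settings and lower the other system ranges to match a 3-node cluster too:

-- Show every zone config, including the built-in system ranges.
SHOW ALL ZONE CONFIGURATIONS;

-- Lower the desired replication factor for the other 5x system ranges as well.
ALTER RANGE meta CONFIGURE ZONE USING num_replicas = 3;
ALTER RANGE liveness CONFIGURE ZONE USING num_replicas = 3;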

This is all fairly confusing, and the UI doesn’t help enough. We might do better in the future.

So regarding the 23 under-replicated ranges, I had to decommission the nodes in addition to scaling down.

As far as CRDB is concerned, decommissioning is how you scale down. If you simply kill a node, that node is still considered to be part of the cluster for various purposes.
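
For a concrete check (once more an internal, version-dependent table): after a plain kill, the node still shows up as a known member of the cluster, whereas cockroach node decommission <node ID> marks it for removal and lets its replicas be moved elsewhere.

-- A node that was merely killed still appears here and is still treated as a
-- cluster member (e.g. toward the 5-node threshold mentioned above); a
-- decommissioned node stops being counted, even if it lingers in the list.
SELECT * FROM crdb_internal.gossip_nodes;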