We’ve now seen this situation in our cluster twice, most recently right after an automatic Kubernetes upgrade. We run a 3-region CockroachDB cluster on Kubernetes with 7 nodes total, and the affected region hasn’t been part of the cluster for some time.
The 3 nodes in us-west2 (n13, n15, n16) flip back and forth between “Dead Nodes” and “Decommissioned Nodes” every few minutes. Last time, we resolved it by cycling through the nodes in the cluster one at a time until what looked like the “bad” node was cycled and the behavior stopped. It would be nice to get a concrete answer as to why this is happening.