Remove decommissioned nodes that don't exist anymore


(Ivan Kresic) #1

Hi there,

I have two decommissioned nodes: one with an IP that is no longer available for use (old server) and one duplicate (same IP and port as a currently live node). The quit command obviously doesn’t help. Is there any way to remove decommissioned nodes manually, from the database, from files on disk, or with an extra command?

Thanks in advance


(Jesse) #2

Hi @kresic.ivan,

What version of CockroachDB are you running? If you’re on 2.0, decommissioning nodes should remove them from the web UI and from the output of the cockroach node commands. For both cases, I’d suggest following our docs on decommissioning dead nodes.

If you have any trouble, please let us know here.

Best,
Jesse


#3

I’m not sure about OP, but I have the same problem with v2.0.3.

I destroyed and re-created each of the 3 CockroachDB servers, one by one, while allowing the new one (with a new IP address) to connect to the cluster before moving on to the next.

I was left with 3 dead nodes, so I followed your guide to decommission them.

Now I have 3 “decommissioned” nodes in the dashboard and no way to remove them since, as OP pointed out, they no longer exist so the “quit” command is unhelpful.

Also, I’m not sure if this is related, but it seems wrong. It says “decommissioned since” and gives the time they became dead, which was actually yesterday for 2 of them, even though I only decommissioned them a few minutes ago.


(Ivan Kresic) #4

Hi @jazoom,

thanks for revisiting the issue, I forgot about it. I simply gave up, since no harm is done; it’s just a bit annoying. I have two decommissioned nodes, one decommissioned 3 months ago and the other 4 months ago. Any instructions for manual removal, since the quit command does not apply, would be helpful and appreciated @jesse.

Thanks


(Raphael 'kena' Poss) #5

You can decommission a node that is not alive any more.

To do this use the command:

cockroach node decommission <nodeid> <nodeid> <nodeid>...

The cockroach node decommission command can do the work by connecting to any of the remaining nodes. Use --host and --port.
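As a rough sketch of the invocation described above: the host address, port, and node IDs below are placeholders rather than values from this thread, and --insecure is an assumption that only fits a cluster started without certificates (a secure cluster would pass --certs-dir instead). The helper only prints the command so it can be reviewed before being run.

```shell
# Sketch only: build (and print) a decommission command aimed at a live node.
# LIVE_HOST, LIVE_PORT and the node IDs used below are placeholder values.
LIVE_HOST="10.0.0.5"   # hypothetical address of any node that is still running
LIVE_PORT="26257"      # default CockroachDB port

decommission_cmd() {
  # "$@" holds the IDs of the dead nodes, as reported by `cockroach node status`.
  # --insecure is an assumption; a secure cluster would use --certs-dir instead.
  echo "cockroach node decommission $* --host=${LIVE_HOST} --port=${LIVE_PORT} --insecure"
}

# Print the command for review before running it by hand.
decommission_cmd 4 5 6
```

The point is that the command is addressed to a node that is still running, not to the dead nodes themselves, so it works even when the old IP addresses are gone.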

Does this help?


(Ivan Kresic) #6

Hi @knz,

not really, we want to remove already decommissioned nodes from the list. I, for example, have 2 decommissioned nodes, one using a duplicate IP and port of a currently live node, and the other using an old IP, so I cannot connect to those nodes anymore. It’s just an aesthetic issue concerning the dashboard.


(Raphael 'kena' Poss) #7

Ivan,

Maybe there was a misunderstanding. I believe that if the node still appears in the list in the UI, that means the node is considered dead (terminated) but not decommissioned.

The word “decommission” does not mean “stop the node”; instead, it means “remove the node from the list of nodes”. It is possible to decommission a node that is already stopped. Is that not what you want?


(Ivan Kresic) #8

Hi @knz,

I literally want to remove nodes from the UI list that are listed in the “decommissioned nodes” section. They are not listed as “dead”, but rather “decommissioned”. I just don’t want to see them in the list anymore because they have been unavailable and non-existent for a long time.


(Jesse) #9

@kresic.ivan, we did make a change to remove decommissioned and dead nodes from timeseries graphs: https://github.com/cockroachdb/cockroach/issues/23110. And it looks like we intended to remove decommissioned and dead nodes from the nodes list page as well: https://github.com/cockroachdb/cockroach/issues/20639. However, I can’t tell whether that work actually got done. From your experience, it seems like it didn’t.

@tschottdorf, @marc, do either of you know whether there’s a way to get dead and decommissioned nodes to stop appearing on the nodes list page?


#10

Exactly as @kresic.ivan says. And I guess since they’re in the UI they’re also still in the database somewhere.

As far as I can tell, they’re not causing any trouble, but it’s silly to have them there forever and not be able to do anything about it.


(Raphael 'kena' Poss) #11

All right, now I understand better. Thanks for explaining.

There was a bug about the display, which I think I recently fixed: https://github.com/cockroachdb/cockroach/pull/26821. This will be available in crdb 2.1; hopefully you can test it in the July 30 alpha release.

Cheers


(Timothy Haggerty) #12

I am on v2.1.1 and still see obliterated hosts under “Decommissioned Nodes”; one goes back 5 days.
I ran cockroach node decommission <node_nbr>, and it did not hang.
In node status, the node went from showing is_available=is_live=false to not appearing in the listing at all, and it moved from Live Nodes to Decommissioned Nodes.


(Raphael 'kena' Poss) #13

Hi Timothy,
thank you for your inquiry. For now, decommissioned nodes will indeed remain in the UI, albeit only on that one screen. You can discuss this feature further here: https://github.com/cockroachdb/cockroach/issues/24636


(Andrew Dona-Couch) #14

To follow up here with the same comment I made on the above-linked issue: that issue is strictly related to the display of nodes which have been decommissioned but which the cluster still remembers. I have opened a new issue for the suggestion to let the cluster completely forget some nodes: https://github.com/cockroachdb/cockroach/issues/33542