Cannot load certificates

I have a CockroachDB cluster running in Kubernetes and have run into a strange situation.
If I ssh into one of my nodes I can list the status of the nodes with the command:

[root@test-worker-1 cockroach]# ./cockroach node status --certs-dir cockroach-certs
Enter password: 
  id |       address       |     sql_address     |  build  |            started_at            |            updated_at            | locality | is_available | is_live
-----+---------------------+---------------------+---------+----------------------------------+----------------------------------+----------+--------------+----------
   1 | test-worker-1:26257 | test-worker-1:26257 | v20.2.7 | 2021-04-23 15:02:49.457054+00:00 | 2021-04-23 19:55:01.52308+00:00  |          | true         | true
   2 | test-worker-3:26257 | test-worker-3:26257 | v20.2.7 | 2021-04-23 15:02:49.922554+00:00 | 2021-04-23 19:55:01.973917+00:00 |          | true         | true
   3 | test-worker-2:26257 | test-worker-2:26257 | v20.2.7 | 2021-04-23 17:13:28.077606+00:00 | 2021-04-23 19:55:01.126078+00:00 |          | true         | true

I am asked for the password of the user root (which I have changed), and then I can see the list of nodes.

Next I tried to decommission one node with the following command:

[root@test-worker-1 cockroach]# ./cockroach node decommission --self --certs-dir /cockroach/cockroach-certs --host=test-worker-2
ERROR: cannot load certificates.
Check your certificate settings, set --certs-dir, or use --insecure for insecure clusters.

failed to connect to the node: problem with client cert for user root: not found
Failed running "node decommission"

But this command fails with a message saying that it cannot load the certificates.

Why does this happen? What is the difference from the status command?

Thanks for any hints

===
Ralph

Hi Ralph, for the command to decommission a node, is the absolute path to the certs directory correct?

In the first command you used a relative path for the certs directory (--certs-dir cockroach-certs), and in the second you used an absolute path (--certs-dir /cockroach/cockroach-certs).
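One way to compare the two directories is to check whether each one contains the files a certificate-based root connection needs. The sketch below assumes CockroachDB's usual file naming (ca.crt, client.<user>.crt, client.<user>.key); the function name check_client_certs is made up for illustration. The "problem with client cert for user root: not found" message suggests that client.root.crt may be missing from the directory that was passed.

```shell
# Hypothetical helper: report which client certificate files are missing
# from a certs directory. Assumes the standard CockroachDB file names.
check_client_certs() {
  dir="$1"
  user="${2:-root}"   # SQL user whose client cert is expected; defaults to root
  missing=0
  for f in ca.crt "client.$user.crt" "client.$user.key"; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  # Exit status 0 means all three files were found, 1 means at least one is missing.
  return "$missing"
}
```

For example, running check_client_certs /cockroach/cockroach-certs on the node would list any of the three files that are not there.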

Yes. In the meantime I figured out that I should always run the commands from the cockroachdb-client pod and not from a cluster member pod.

So I now use the following command to open a shell in the pod:

$ kubectl exec -it -n cockroach cockroachdb-client-secure -- bash

and from there, to get the status I run:

$ cockroach node status --certs-dir=/cockroach-certs --host=10.0.0.3

where 10.0.0.3 is the IP address of one of my cluster nodes. This way I can now call any of the cockroach subcommands.

This was not entirely clear to me after reading the documentation. I know this is not easy to document, as Kubernetes is only one possible runtime environment for CockroachDB.
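The two steps above (open a shell in the client pod, then run cockroach) could also be combined into a small wrapper. This is only a sketch: the function name crdb and the CRDB_* variables are hypothetical, and the defaults are simply the namespace, pod name, certs directory and host used in this thread.

```shell
# Hypothetical wrapper: run any cockroach subcommand inside the client pod.
# Defaults are the values from this thread; override them for another cluster.
KUBECTL="${KUBECTL:-kubectl}"
CRDB_NAMESPACE="${CRDB_NAMESPACE:-cockroach}"
CRDB_CLIENT_POD="${CRDB_CLIENT_POD:-cockroachdb-client-secure}"
CRDB_CERTS_DIR="${CRDB_CERTS_DIR:-/cockroach-certs}"
CRDB_HOST="${CRDB_HOST:-10.0.0.3}"

crdb() {
  # -it keeps the session interactive, as in the kubectl exec command above.
  "$KUBECTL" exec -it -n "$CRDB_NAMESPACE" "$CRDB_CLIENT_POD" -- \
    cockroach "$@" --certs-dir="$CRDB_CERTS_DIR" --host="$CRDB_HOST"
}
```

With this in place, crdb node status would run the same status command as above from outside the pod.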

I suggest that on this page:
https://www.cockroachlabs.com/docs/v20.2/orchestrate-cockroachdb-with-kubernetes

a separate step, ‘Installing the Cockroach Client’, should be added between steps 2 and 3. This would help users understand that the client pod, with its root client certificate, is very important for later administration.

Glad that you were able to resolve the issue. I’ll pass on your feedback to the docs team. Thanks!