Hi!
Problem: init container not in “Completed” state
I am trying out the locality functionality.
DC == data center == locality
DC1: 3 crdb nodes
DC2: 3 crdb nodes
DC3: 3 crdb nodes
All DCs are running in the same lab in k8s, in three different namespaces.
All nodes try to join:
dc1node 0, 1, 2
dc2node 0
dc3node 0
Step 1: DC1 (“tampere”) was started up.
Step 2: DC2 (“oslo”) was started up.
The init container in “oslo” never reaches the “Completed” state:
`oslo crdboslo-cockroachdb-0 1/1 Running 0 4m23s 192.168.224.91 neo0025node11 <none> <none>`
`oslo crdboslo-cockroachdb-1 1/1 Running 0 4m23s 192.168.184.215 neo0025node07 <none> <none>`
`oslo crdboslo-cockroachdb-2 1/1 Running 0 4m23s 192.168.190.83 neo0025node05 <none> <none>`
`oslo crdboslo-cockroachdb-init-r97bc 1/1 Running 0 4m23s 192.168.215.17 neo0025node09 <none> <none>`
DC1 is ok:
`tampere crdbtampere-cockroachdb-0 1/1 Running 0 5m12s 192.168.99.30 neo0025node12 <none> <none>`
`tampere crdbtampere-cockroachdb-1 1/1 Running 0 5m12s 192.168.113.218 neo0025node06 <none> <none>`
`tampere crdbtampere-cockroachdb-2 1/1 Running 0 5m12s 192.168.79.210 neo0025node08 <none> <none>`
`tampere crdbtampere-cockroachdb-init-hwmn5 0/1 Completed 0 5m12s 192.168.224.90 neo0025node11 <none> <none>`
Error message in the init container of the “oslo” locality:
+ sleep 5
+ /cockroach/cockroach init --insecure --host=crdbwarsaw-cockroachdb-0.crdbwarsaw-cockroachdb --port 26257
*
* ERROR: rpc error: code = Unknown desc = already connected to cluster
*
E191007 06:15:36.579453 1 cli/error.go:229 rpc error: code = Unknown desc = already connected to cluster
Error: rpc error: code = Unknown desc = already connected to cluster
Failed running "init"
+ sleep 5
+ /cockroach/cockroach init --insecure --host=crdbwarsaw-cockroachdb-0.crdbwarsaw-cockroachdb --port 26257
*
* ERROR: rpc error: code = Unknown desc = already connected to cluster
*
E191007 06:15:41.644741 1 cli/error.go:229 rpc error: code = Unknown desc = already connected to cluster
Error: rpc error: code = Unknown desc = already connected to cluster
Failed running "init"
+ sleep 5
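One possible way to let the init job finish in the second and third localities is to treat this particular error as success, presumably because the targeted node already belongs to the cluster that was initialized from “tampere”. This is only a minimal sketch: `init_result_ok` is a hypothetical helper, not part of the chart, and it assumes the error text stays exactly as in the log above.

```shell
# Hypothetical helper (not part of CockroachDB or the chart):
# decide whether the result of `cockroach init` means the job can stop retrying.
#   $1 = exit code of `cockroach init`
#   $2 = combined stdout/stderr of that command
# Returns 0 when init succeeded or the node reports it is already in a cluster.
init_result_ok() {
  exit_code="$1"
  output="$2"
  if [ "$exit_code" -eq 0 ]; then
    return 0
  fi
  case "$output" in
    # Assumption: this exact message means the node has already joined a cluster.
    *"already connected to cluster"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Inside the retry loop this could be used as `out=$(/cockroach/cockroach init --insecure --host=... 2>&1); init_result_ok "$?" "$out" && exit 0`, with the host left as it is in the existing job.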
The nodes are joined together successfully:
kubectl exec -it crdbtampere-cockroachdb-0 -ntampere -- ./cockroach node status --insecure
id | address | build | started_at | updated_at | is_available | is_live
+----+---------+-------+------------+------------+--------------+---------+
1 | crdbtampere-cockroachdb-0.crdbtampere-cockroachdb.tampere.svc.cluster.local:26257 | v19.1.5 | 2019-10-07 06:05:22.639332+00:00 | 2019-10-07 06:13:19.695771+00:00 | true | true
2 | crdbtampere-cockroachdb-1.crdbtampere-cockroachdb.tampere.svc.cluster.local:26257 | v19.1.5 | 2019-10-07 06:05:26.15803+00:00 | 2019-10-07 06:13:18.713603+00:00 | true | true
3 | crdbtampere-cockroachdb-2.crdbtampere-cockroachdb.tampere.svc.cluster.local:26257 | v19.1.5 | 2019-10-07 06:05:30.007338+00:00 | 2019-10-07 06:13:18.06478+00:00 | true | true
4 | crdboslo-cockroachdb-1.crdboslo-cockroachdb.oslo.svc.cluster.local:26257 | v19.1.5 | 2019-10-07 06:06:06.457908+00:00 | 2019-10-07 06:13:18.473494+00:00 | true | true
5 | crdboslo-cockroachdb-2.crdboslo-cockroachdb.oslo.svc.cluster.local:26257 | v19.1.5 | 2019-10-07 06:06:08.337818+00:00 | 2019-10-07 06:13:15.898984+00:00 | true | true
6 | crdboslo-cockroachdb-0.crdboslo-cockroachdb.oslo.svc.cluster.local:26257 | v19.1.5 | 2019-10-07 06:06:11.560959+00:00 | 2019-10-07 06:13:19.078033+00:00 | true | true
So CockroachDB is otherwise working fine.
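To check liveness without reading the table by eye, the `is_live` column can be counted. A rough sketch, assuming the table layout shown above, where columns are separated by `|` and `is_live` is the last one; `count_live_nodes` is a hypothetical name:

```shell
# Hypothetical helper: reads `cockroach node status` table output on stdin
# and prints how many rows have a numeric id and is_live == true.
# Assumption: is_live is the last "|"-separated column, as in the output above.
count_live_nodes() {
  awk -F'|' '
    {
      id = $1; live = $NF
      gsub(/[ \t]/, "", id)      # strip padding around the id
      gsub(/[ \t]/, "", live)    # strip padding around is_live
      if (id ~ /^[0-9]+$/ && live == "true") n++
    }
    END { print n + 0 }
  '
}
```

For example, piping the `kubectl exec ... node status --insecure` output into `count_live_nodes` should print 6 for the table above, since all six rows end in `true`.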
The same happens with the third locality (“warsaw”).
The join command looks like the following, e.g. in crdboslo-cockroachdb-0:
+ exec /cockroach/cockroach start --logtostderr --insecure --advertise-host crdboslo-cockroachdb-0.crdboslo-cockroachdb.oslo.svc.cluster.local --http-host 0.0.0.0 --http-port 8080 --port 26257 --cache 25% --max-sql-memory 25% --locality=datacenter=oslo --join crdbtampere-cockroachdb-0-svc.tampere.svc.cluster.local:26257,crdboslo-cockroachdb-0-svc.oslo.svc.cluster.local:26257,crdbwarsaw-cockroachdb-0-svc.warsaw.svc.cluster.local:26257
Version: 19.1.5