Helm install with security is not working

(Jagan Kondapalli) #1

Hi - I’m trying to use the stable Helm charts to stand up a 3-node CockroachDB cluster in a Kubernetes (k8s 1.11.6) environment. I was successful in doing so without security and was able to log into the shell with --insecure.

I’m trying to spin up one more POC cluster, this time with security enabled. I tried this two different ways:

The pods are stuck in the Init state and there are no useful logs to troubleshoot the issue. Can you please help point me in the right direction?

(Jagan Kondapalli) #2

Here are the events I see when I describe one of the pods:

```
  Type    Reason                  Age   From                                   Message
  ----    ------                  ----  ----                                   -------
  Normal  Scheduled               15m   default-scheduler                      Successfully assigned ck/ck-poc-cockroachdb-1 to ip-10-x-x-x.ec2.internal
  Normal  SuccessfulAttachVolume  15m   attachdetach-controller                AttachVolume.Attach succeeded for volume "pvc-u-u-i-d"
  Normal  Pulling                 14m   kubelet, ip-10-x-x-x.ec2.internal      pulling image "cockroachdb/cockroach-k8s-request-cert:0.4"
  Normal  Pulled                  14m   kubelet, ip-10-x-x-x.ec2.internal      Successfully pulled image "cockroachdb/cockroach-k8s-request-cert:0.4"
  Normal  Created                 14m   kubelet, ip-10-x-x-x.ec2.internal      Created container
  Normal  Started                 14m   kubelet, ip-10-x-x-x.ec2.internal      Started container
```
(Ron Arévalo) #3

Hey @jkondapalli,

Can you run the following command?

```
kubectl get csr
```

Just want to confirm whether the CSRs have been approved; if they haven’t, you’ll see Pending in the CONDITION column.

If you haven’t approved them, you’ll want to follow these steps, specifically the approval of the CSRs, which is part 4 of step 2. You’ll need to repeat those steps for each pod.
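For example, with the CSR names from this thread, the check-and-approve sequence looks roughly like this (run against the cluster where the chart is deployed):

```
# List all certificate signing requests; pending ones show Pending in CONDITION
kubectl get csr

# Approve the root client CSR and one node CSR per pod in the StatefulSet
kubectl certificate approve ck.client.root
kubectl certificate approve ck.node.ck-poc-cockroachdb-0
kubectl certificate approve ck.node.ck-poc-cockroachdb-1
kubectl certificate approve ck.node.ck-poc-cockroachdb-2
```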

Thanks,

Ron

(Jagan Kondapalli) #4

I already approved all the CSRs, and yet the pods are still stuck in the Init state:

```
ck.client.root                 32m   system:serviceaccount:ck:cockraochdb-sa   Approved
ck.node.ck-poc-cockroachdb-0   32m   system:serviceaccount:ck:cockraochdb-sa   Approved
ck.node.ck-poc-cockroachdb-1   32m   system:serviceaccount:ck:cockraochdb-sa   Approved
ck.node.ck-poc-cockroachdb-2   32m   system:serviceaccount:ck:cockraochdb-sa   Approved
```
(Jagan Kondapalli) #5

Here are the details from describing the pod:

```
Namespace:          ck
Priority:           0
PriorityClassName:  <none>
Node:               ip-10-x-x-x.ec2.internal/10.x.x.x
Start Time:         Thu, 09 May 2019 08:48:53 -0500
Labels:             chart=cockroachdb-2.1.3
                    component=ck-poc-cockroachdb
                    controller-revision-hash=ck-poc-cockroachdb-5689f7d48f
                    heritage=Tiller
                    release=ck-poc
                    statefulset.kubernetes.io/pod-name=ck-poc-cockroachdb-1
Annotations:        cni.projectcalico.org/podIP: 10.x.x.x/32
Status:             Pending
IP:                 10.x.x.x
Controlled By:      StatefulSet/ck-poc-cockroachdb
Init Containers:
  init-certs:
    Container ID:  docker://0c895dc0455cb85e2a4f6e436d2323613bc9046338a2600252d3e5dbc8a84cf5
    Image:         cockroachdb/cockroach-k8s-request-cert:0.4
    Image ID:      docker-pullable://cockroachdb/cockroach-k8s-request-cert@sha256:d512bc05c482a1c164544e68299ff7616d4a26325ac9aa2c2ddce89bc241c792
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/ash
      -ecx
      /request-cert -namespace=${POD_NAMESPACE} -certs-dir=/cockroach-certs -type=node -addresses=localhost,127.0.0.1,$(hostname -f),$(hostname -f|cut -f 1-2 -d '.'),ck-poc-cockroachdb-public,ck-poc-cockroachdb-public.$(hostname -f|cut -f 3- -d '.') -symlink-ca-from=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    State:          Running
      Started:      Thu, 09 May 2019 08:49:03 -0500
    Ready:          False
    Restart Count:  0
    Environment:
      POD_NAMESPACE:  ck (v1:metadata.namespace)
    Mounts:
      /cockroach-certs from certs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cockraochdb-sa-token-9tdjp (ro)
Containers:
  ck-poc-cockroachdb:
    Container ID:
    Image:         cockroachdb/cockroach:v19.1.0
    Image ID:
    Ports:         26257/TCP, 8080/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /bin/bash
      -ecx
      exec /cockroach/cockroach start --logtostderr --certs-dir /cockroach/cockroach-certs --advertise-host $(hostname).${STATEFULSET_FQDN} --http-host 0.0.0.0 --http-port 8080 --port 26257 --cache 25% --max-sql-memory 25%  --join ${STATEFULSET_NAME}-0.${STATEFULSET_FQDN}:26257,${STATEFULSET_NAME}-1.${STATEFULSET_FQDN}:26257,${STATEFULSET_NAME}-2.${STATEFULSET_FQDN}:26257
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Liveness:       http-get https://:http/health delay=30s timeout=1s period=5s #success=1 #failure=3
    Readiness:      http-get https://:http/health%3Fready=1 delay=10s timeout=1s period=5s #success=1 #failure=2
    Environment:
      STATEFULSET_NAME:   ck-poc-cockroachdb
      STATEFULSET_FQDN:   ck-poc-cockroachdb.ck.svc.cluster.local
      COCKROACH_CHANNEL:  kubernetes-helm
    Mounts:
      /cockroach/cockroach-certs from certs (rw)
      /cockroach/cockroach-data from datadir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cockraochdb-sa-token-9tdjp (ro)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  datadir:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  datadir-ck-poc-cockroachdb-1
    ReadOnly:   false
  certs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  cockraochdb-sa-token-9tdjp:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cockraochdb-sa-token-9tdjp
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason                  Age   From                                   Message
  ----    ------                  ----  ----                                   -------
  Normal  Scheduled               34m   default-scheduler                      Successfully assigned ck/ck-poc-cockroachdb-1 to ip-10-x-x-x.ec2.internal
  Normal  SuccessfulAttachVolume  33m   attachdetach-controller                AttachVolume.Attach succeeded for volume "pvc-u-u-i-d"
  Normal  Pulling                 33m   kubelet, ip-10-x-x-x.ec2.internal      pulling image "cockroachdb/cockroach-k8s-request-cert:0.4"
  Normal  Pulled                  33m   kubelet, ip-10-x-x-x.ec2.internal      Successfully pulled image "cockroachdb/cockroach-k8s-request-cert:0.4"
  Normal  Created                 33m   kubelet, ip-10-x-x-x.ec2.internal      Created container
  Normal  Started                 33m   kubelet, ip-10-x-x-x.ec2.internal      Started container
```
(Ron Arévalo) #6

Hey @jkondapalli,

Can you provide us the logs from your init container? Here’s the command.

Thanks,

Ron

(Jagan Kondapalli) #7

So, in the init pod, I grepped the logs of the init-certs container. In spite of the approved certificates, I see log lines like these:

```
2019-05-09 14:23:19.629631157 +0000 UTC m=+2064.979724560: waiting for 'kubectl certificate approve ck.client.root'
2019-05-09 14:23:49.629977867 +0000 UTC m=+2094.980073787: waiting for 'kubectl certificate approve ck.client.root'
2019-05-09 14:24:19.630575071 +0000 UTC m=+2124.980668057: waiting for 'kubectl certificate approve ck.client.root'
```
(Ron Arévalo) #8

Hey @jkondapalli,

Thanks for the logs; you can ignore my last message asking for them.

(Jagan Kondapalli) #9

@ronarev - looks like this might be our issue

Reasons:

```
2019/05/09 15:17:40 Looking up cert and key under secret ck.client.root
W0509 15:17:40.769940       1 client_config.go:529] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2019/05/09 15:17:40 Secret ck.client.root not found, sending CSR
Sending create request: ck.client.root for 
Request sent, waiting for approval. To approve, run 'kubectl certificate approve ck.client.root'
CSR approved, but no certificate in response. Waiting some more
2019-05-09 15:18:31.006957856 +0000 UTC m=+50.241427581: waiting for 'kubectl certificate approve ck.client.root'
```

How do I set this during helm install?
I thought RBAC would take care of this?

(Ron Arévalo) #10

Hey @jkondapalli,

Looks like there is no certificate signer configured on your cluster, so approved CSRs never get a certificate issued. Most provisioning systems enable a signer by default.

How are you running k8s? Did you create the cluster yourself, or did you use GKE/EKS/etc.?

Thanks,

Ron

(Jagan Kondapalli) #11

It is a Rancher-managed k8s cluster.

(Ron Arévalo) #12

Hey @jkondapalli,

If you’re using Rancher 2.0, it looks like this will solve your issue.

Thanks,

Ron

(Jagan Kondapalli) #13

@ronarev - thank you for your help on this. You were right. We added the kube-controller extra_args, and that let the init pod get Approved & Issued certs.

Again, thank you for your help!
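For anyone else hitting this on a Rancher/RKE-provisioned cluster: the fix above amounts to giving kube-controller-manager the signing cert and key it needs to issue certificates for approved CSRs. A sketch of the extra_args in the RKE cluster config follows; the file paths are assumptions based on RKE's default CA location, so adjust them to wherever your cluster CA actually lives:

```yaml
services:
  kube-controller:
    extra_args:
      # Assumed paths -- point these at your cluster CA certificate and key
      cluster-signing-cert-file: /etc/kubernetes/ssl/kube-ca.pem
      cluster-signing-key-file: /etc/kubernetes/ssl/kube-ca-key.pem
```

Once the cluster is updated with this config, the controller manager restarts with a signer configured, and CSRs that were stuck at Approved should transition to Approved,Issued, letting the init-certs container finish.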