"pq: no inbound stream connection"

I’m just testing out CRDB for the first time, and I have a question about a behavior I’m seeing.
First of all I have 2 machines, A and B. I start A with:
cockroach start --http-port=8084 --host=0.0.0.0 --advertise-host=10.48.19.20 --background --insecure
(The reason for 0.0.0.0 is that I want the DB to serve requests from localhost as well as from external clients. For the purpose of clustering, the host is 10.48.19.20.)
I then start B with:
cockroach start --http-port=8084 --host=0.0.0.0 --join=10.48.19.20 --insecure --background
(This is mostly the same as before, but I use “join” instead of “advertise-host” so that B can join A in one cluster).

At first everything looks good. They’re in a cluster (the Admin UI confirms that). I can create a DB, or a table in one node, and “show tables” in the other node will show it.

  • If I insert a row in A, I can see it in A (“select * from …”)
  • If I insert a row in B, I can also see it in A. So “INSERT” and “SHOW” work on both nodes.

However, I cannot run “SELECT” in B. It throws this error: "pq: no inbound stream connection". I also get this error when I type “\q” to quit the SQL shell in B.

The only way I can avoid this error and make things fully work is by binding B to a particular IP address:
cockroach start --http-port=8084 --host=10.48.19.16 --join=10.48.19.20 --insecure --background
However, this means that when I connect locally, I still have to specify the IP as if I were coming in from outside. I can no longer use “localhost” or the loopback address.

I know it is a small price to pay. But I’m still interested in knowing why. A apparently had no problem binding to “0.0.0.0”, so why can’t B?

I thought about this a bit, and it looks like this is an issue of reachability.
When I start B with this command:
cockroach start --http-port=8084 --host=0.0.0.0 --join=10.48.19.20 --insecure --background

Then it tells A that “my hostname is 0.0.0.0”. So when A needs to get back to B, it uses 0.0.0.0 as the destination, which fails because 0.0.0.0 is a wildcard bind address, not an address another machine can connect to.
Why does the failure only show up on B’s side? I don’t know. I thought it would fail when running “SELECT” from either side.

In any case, the way to get past this is to specify --advertise-host for B as well. Basically, in a cluster, any time you have --host=0.0.0.0, you need to provide --advertise-host=[reachable IP of that node].
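
To make that rule concrete, here is a minimal sketch. `start_cmd` is a hypothetical helper, not part of CockroachDB: it only prints a `cockroach start` command line built from the flags already used in this thread, and it refuses to build one when --host=0.0.0.0 is given without an advertised address. The IPs are the ones from my two machines.

```shell
# Hypothetical helper: prints (does not run) a `cockroach start` command line.
# Usage: start_cmd BIND_HOST JOIN_ADDR [ADVERTISE_HOST]
start_cmd() {
  bind_host=$1
  join_addr=$2
  advertise_host=${3:-}

  # A wildcard bind address is not something peers can dial, so require an
  # explicit advertised address in that case.
  if [ "$bind_host" = "0.0.0.0" ] && [ -z "$advertise_host" ]; then
    echo "error: --host=0.0.0.0 needs --advertise-host=<reachable IP>" >&2
    return 1
  fi

  cmd="cockroach start --http-port=8084 --host=$bind_host"
  if [ -n "$advertise_host" ]; then
    cmd="$cmd --advertise-host=$advertise_host"
  fi
  echo "$cmd --join=$join_addr --insecure --background"
}

# Node B from this thread: listen on all interfaces, advertise its own IP.
start_cmd 0.0.0.0 10.48.19.20 10.48.19.16
```

With this, B keeps listening on every interface (so localhost works again), while the other nodes are told to reach it at 10.48.19.16.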

You’re correct about using --advertise-host in this way. It is needed for exactly these kinds of situations.

On K8s, where we’re currently using:
container:
  command:
  - "/bin/bash"
  - "-ecx"
  - "exec /cockroach/cockroach start --advertise-host (hostname).{STATEFULSET_NAME}…"

Any suggestions/recommendations on what to switch to that would solve this issue?

Hi @jimlambrt! We have a guide on how to set up CockroachDB on Kubernetes, check it out: https://www.cockroachlabs.com/docs/stable/orchestrate-cockroachdb-with-kubernetes.html

Thanks Dan. FYI, "exec /cockroach/cockroach start --host $(hostname -f)" didn’t work for us. We could init the cluster, but every time we tried to connect, the connection was refused. That’s why we changed to --advertise-host (hostname).{STATEFULSET_NAME}

Hi, can you share what Kubernetes Services you created? Are you trying to connect from a pod or outside the cluster?

$ cat service.yaml 
##---
# Source: id-cockroachdb/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
  # This service only exists to create DNS entries for each pod in the stateful
  # set such that they can resolve each other's IP addresses. It does not
  # create a load-balanced ClusterIP and should not be used directly by clients
  # in most circumstances.
  name: "cockroachdb"
  namespace: "identity-dev"
  labels:
    heritage: "helm-template"
    release: "cockroachdb"
    chart: "id-cockroachdb-0.2.1"
    component: "cockroachdb"
    backup: "backup1d-keep90d"
  annotations:
    # This is needed to make the peer-finder work properly and to help avoid
    # edge cases where instance 0 comes up after losing its data and needs to
    # decide whether it should create a new cluster or try to join an existing
    # one. If it creates a new cluster when it should have joined an existing
    # one, we'd end up with two separate clusters listening at the same service
    # endpoint, which would be very bad.
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
    # Enable automatic monitoring of all instances when Prometheus is running in the cluster.
    prometheus.io/scrape: "true"
    prometheus.io/path: "_status/vars"
    prometheus.io/port: "8080"
spec:
  ports:
  - port: 26257
    targetPort: 26257
    name: grpc
  - port: 8080
    targetPort: 8080
    name: http
  clusterIP: None
  selector:
    component: "cockroachdb"
jimlambrt@jims-mbp-4.bose.com ~/vagrant-projects/core-helm/id-cockroachdb/out/identity-dev/id-cockroachdb/templates [01:09:20] 
$ cat service-public.yaml 
##---
# Source: id-cockroachdb/templates/service-public.yaml
apiVersion: v1
kind: Service
metadata:
  # This service is meant to be used by clients of the database. It exposes a ClusterIP that will
  # automatically load balance connections to the different database pods.
  name: "cockroachdb-public"
  namespace: "identity-dev"
  labels:
    heritage: "helm-template"
    release: "cockroachdb"
    chart: "id-cockroachdb-0.2.1"
    component: "cockroachdb"
    backup: "backup1d-keep90d"
spec:
  ports:
  # The main port, served by gRPC, serves Postgres-flavor SQL, internode
  # traffic and the cli.
  - port: 26257
    targetPort: 26257
    name: grpc
  # The secondary port serves the UI as well as health and debug endpoints.
  - port: 8080
    targetPort: 8080
    name: http
  selector:
    component: "cockroachdb"

We are connecting only from within the cluster.

You noted that --host $(hostname -f) didn’t work for you, but did you try --advertise-host $(hostname -f)? That’s what we use in our official stateful set config, so that looks like the discrepancy.

I’ll also check with others internally.

@nate is right, you want to use --advertise-host $(hostname -f) so that you’re guaranteed to get the full cluster DNS name for the pod.

Hum… that didn’t quite work for us. Seems to be some discrepancy in how our DNS is resolved. The (hostname).{STATEFULSET_NAME} is working. Time to go dig into some cluster DNS I guess. Ty
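
For the DNS digging, a minimal sketch of a check that could be run inside one of the pods. It assumes `getent` is available in the container image, which has not been confirmed in this thread, and `check_name` is a hypothetical helper name. It only reports whether a given name resolves from where it is run, so the names other nodes would dial can be compared.

```shell
# Hypothetical helper: report whether a host name resolves from this machine.
# Assumes `getent` exists; if it is missing, every name is reported as FAIL.
check_name() {
  if getent hosts "$1" >/dev/null 2>&1; then
    echo "OK $1"
  else
    echo "FAIL $1"
  fi
}

# Inside a pod, compare the candidates discussed in this thread, e.g.:
#   check_name "$(hostname -f)"
#   check_name "<the exact value passed to --advertise-host>"

# Sanity check of the helper itself: the reserved .invalid TLD should never
# resolve, so this is expected to report FAIL.
check_name no-such-host.invalid
```

Running the same check from a second pod against the first pod's advertised name shows whether peers can actually resolve it, which is what --advertise-host depends on.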

@jimlambrt can you clarify a couple things for me?

  • Are you working with @lexthang or is this a separate issue?
  • Can you share the full config file you used to set up the cluster?
  • Can you share the logs from one or more of the cockroach pods?