"select ...limit 0" can't get any response

version: v2.0.6
cluster: 9 nodes

Sometimes, running the command "select 1 from db.tb limit 0;" gets no response, as if the cluster were blocked. I even waited an hour and still got no result. Can anyone tell me what to do in this situation?

Hey @edgar,

Do you see any errors in the logs? Does the overall health of the cluster seem to be degrading?

If you try running something like EXPLAIN select 1 from db.tb limit 0; do you get a result? Also, are you running this from an app, or directly from the built-in SQL client?

Thanks,

Ron

I use the cockroach built-in SQL client. If it happens again, I’ll try what you suggested.

Hey @ronarev,

It happened again. I ran “EXPLAIN select 1 from db.tb limit 0;” and it also seemed to block; I got nothing back. I can’t tell whether the cluster’s health is degrading. I also can’t see any errors in the logs: there is one warning, and everything else is informational.

W190624 04:07:17.804468 9655023881 vendor/google.golang.org/grpc/server.go:625  grpc: Server.Serve failed to create ServerTransport: connection error: desc = "transport: http2Server.HandleStreams failed to receive the preface from client: EOF"
I190624 04:07:18.598920 4977 server/status/runtime.go:219  [n5] runtime stats: 71 GiB RSS, 522 goroutines, 2.8 GiB/2.7 GiB/8.5 GiB GO alloc/idle/total, 51 GiB/64 GiB CGO alloc/total, 153387.14cgo/sec, 4.12/0.26 %(u/s)time, 0.00 %gc (1x)
I190624 04:07:18.628087 6406 storage/replica_proposal.go:202  [n5,s9,r120737/5:/Table/3188/1/9{49364…-66183…}] new range lease repl=(n5,s9):5 seq=0 start=1561349238.419211483,0 epo=30 pro=1561349238.419231850,0 following repl=(n2,s2):2 seq=0 start=1560659630.300843753,0 epo=80 pro=1560659630.300846603,0
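
To double-check a summary like "one warning, the rest informational", an excerpt like the one above can be tallied by severity. This is a minimal Python sketch, not an official tool. It assumes each line starts with a severity letter (I, W, E or F) followed by a yymmdd date, a time, a goroutine id and a file:line, as the three lines above do. The names `parse_line` and `count_by_severity` are made up for this example, and the sample messages are truncated.

```python
import re
from collections import Counter
from typing import Iterable, Optional, Tuple

# Assumed layout, inferred from the lines in this thread:
#   <severity letter><yymmdd> <hh:mm:ss.uuuuuu> <goroutine id> <file:line>  <message>
LOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{6}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6}) "
    r"(?P<goroutine>\d+) (?P<source>\S+:\d+)\s+(?P<message>.*)$"
)
SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}


def parse_line(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (severity, source, message), or None if the line does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return SEVERITY[m.group("sev")], m.group("source"), m.group("message")


def count_by_severity(lines: Iterable[str]) -> Counter:
    """Count lines per severity; lines that do not match are counted as UNPARSED."""
    counts: Counter = Counter()
    for line in lines:
        parsed = parse_line(line)
        counts[parsed[0] if parsed else "UNPARSED"] += 1
    return counts


# Two lines from the excerpt above, with the messages cut short.
SAMPLE = [
    "W190624 04:07:17.804468 9655023881 vendor/google.golang.org/grpc/server.go:625  grpc: Server.Serve failed to create ServerTransport",
    "I190624 04:07:18.598920 4977 server/status/runtime.go:219  [n5] runtime stats: 71 GiB RSS, 522 goroutines",
]

if __name__ == "__main__":
    for severity, n in count_by_severity(SAMPLE).items():
        print(severity, n)
```

Running the same function over a whole log file (one line per list entry) would show quickly whether any E or F lines are hiding among the informational ones.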

Hey @edgar,

It would be great if you could send over the entire debug zip. If it isn’t too large you could email it to me here. If it is too large, let me know and I’ll create a private drive folder for you to upload it to.

Thanks,

Hey @ronarev,

The zip file is too large; it’s more than 200 MB. Please give me another way to upload it to you.

Thanks a lot.

Hey @ronarev

The debug zip file has been uploaded. If you need anything else, please let me know.

Thanks.

Hey @edgar,

Can you send over the network latency matrix, which can be found at <HOST>:<PORT>/#/reports/network?

I just want to make sure that all nodes can talk to each other. Based on the logs, there do seem to be some connection errors, but nothing besides that.

Are you still experiencing the problem?

Also, it’s worth noting that upgrading to the latest release, 19.1, would provide greater stability.

Thanks,

Ron

Hey @ronarev

Look at the information below. Node 9 and node 10 have the same IP. If you see anything about node 9 in the zip file, you can ignore it.

+----+---------------------+--------+---------------------+---------------------+---------+
| id |       address       | build  |     updated_at      |     started_at      | is_live |
+----+---------------------+--------+---------------------+---------------------+---------+
|  1 | 172.18.151.38:3359  | v2.0.6 | 2019-06-27 01:24:57 | 2018-10-31 05:01:36 | true    |
|  2 | 172.18.150.200:3359 | v2.0.6 | 2019-06-27 01:24:51 | 2018-10-31 04:21:40 | true    |
|  3 | 172.18.150.238:3359 | v2.0.6 | 2019-06-27 01:24:53 | 2018-10-31 04:56:12 | true    |
|  4 | 172.18.150.201:3359 | v2.0.6 | 2019-06-27 01:24:47 | 2018-10-31 04:40:46 | true    |
|  5 | 172.18.150.207:3359 | v2.0.6 | 2019-06-27 01:24:49 | 2018-10-31 04:36:08 | true    |
|  6 | 172.18.151.13:3359  | v2.0.6 | 2019-06-27 01:24:50 | 2018-10-31 04:45:59 | true    |
|  7 | 172.18.150.173:3359 | v2.0.6 | 2019-06-27 01:24:57 | 2019-05-04 11:46:42 | true    |
|  8 | 172.18.150.197:3359 | v2.0.6 | 2019-06-27 01:24:42 | 2018-10-31 04:27:19 | true    |
| 10 | 172.18.150.167:3359 | v2.0.6 | 2019-06-27 01:24:48 | 2019-06-04 01:20:54 | true    |
+----+---------------------+--------+---------------------+---------------------+---------+

This situation has happened many times. Is it related to one of the cluster settings?

Hi @edgar,

Could you send over the network latency matrix, which can be found at <HOST>:<PORT>/#/reports/network? This would give us a better sense of whether this is a network issue or something else.

Also, could you turn tracing on and run the select query again? Let it run for at least a minute.

We have docs on how to set tracing on here.
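
The tracing request above could look roughly like the following when typed into the built-in SQL client. This is a sketch only: it assumes the session-level `SET tracing` variable and the `SHOW TRACE FOR SESSION` statement are available in the version you are running, so please check the docs for your version before relying on it. The helper name `traced_session_statements` is invented for this example, and it only prints the statements; it does not connect to the cluster.

```python
from typing import List


def traced_session_statements(query: str) -> List[str]:
    """Wrap one query in the statements assumed to record and show a session trace."""
    stmt = query.strip().rstrip(";")
    return [
        "SET tracing = on;",        # assumption: session variable that starts recording
        stmt + ";",                 # the query under investigation
        "SET tracing = off;",       # stop recording
        "SHOW TRACE FOR SESSION;",  # assumption: shows what was recorded
    ]


if __name__ == "__main__":
    # Print the statements so they can be pasted into the SQL client one by one.
    for line in traced_session_statements("select 1 from db.tb limit 0;"):
        print(line)
```

All four statements need to run in the same SQL session. If the select never returns, the statements after it will not run until it is interrupted, so note how long it was left running.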

It’s also worth noting that you may wish to upgrade to a newer, more stable version of CockroachDB.

Thanks,

Ron