Not all workers are being hit for SQL queries

I am running a cluster of 5 workers on GKE as a StatefulSet with host networking.
Sometimes, on the admin dashboard, I see that only 3-4 workers have non-zero service latencies; other times all 5 do, as in the following screenshot:

Is this a bug?

Hi Jiale,

We do have some suspected bugs in our graphs; this one in particular seems to match your description, where some graphs are unexpectedly zero-valued.

However, I believe in that bug the tooltips disagree with the graph line; in the screenshot you posted, the zero value appears to agree with the graph line. That being the case, I think it’s possible that there are no SQL queries reaching those two nodes (which would result in a 0ns latency).

One way to investigate this would be to select one of the nodes with zero latency in the “Graph: Cluster” dropdown on the overview page, which will constrain the graph to that single node. Next, compare the SQL queries count graph to the SQL latency graph; if there are zero SQL queries reaching that node, then you would also expect zero values for latency.

Thanks,
Matt

How many clients are connected to the cluster? If there are only a few clients, it’s very possible that Kubernetes’s service routing connected them all to the same few StatefulSet pods.
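To get a feel for how likely that is, here is a rough sketch in Python. It assumes each client connection is routed to one of the 5 pods independently and uniformly at random, which is a simplification of what the service routing actually does, and it assumes connections are long-lived. The function name is made up for illustration.

```python
from math import comb


def prob_all_pods_used(clients: int, pods: int = 5) -> float:
    """Probability that every pod receives at least one connection.

    Assumes each client picks a pod independently and uniformly at random
    (a simplifying assumption, not a description of kube-proxy internals).
    Computed with inclusion-exclusion over the set of empty pods.
    """
    return sum(
        (-1) ** k * comb(pods, k) * ((pods - k) / pods) ** clients
        for k in range(pods + 1)
    )


# Chance that at least one pod gets no connections, for a few client counts.
for n in (3, 5, 10, 20):
    print(n, "clients:", round(1 - prob_all_pods_used(n), 3))
```

Under those assumptions, with exactly 5 clients the chance that all 5 pods get a connection is only 5!/5^5 = 120/3125, or about 3.8%, so seeing one or two idle nodes with a small number of clients would not be surprising.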

The SQL dashboard’s top graph shows the number of connections to the system. If you look at the per-node views as Matt suggested, you should be able to get a breakdown of how many clients are connected to each pod.
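If you would rather check this from a script than from the UI, something like the following sketch could work. It parses Prometheus-style metrics text of the kind each node serves over HTTP, and pulls out a connection count and a query count. The metric names `sql_conns` and `sql_query_count` are my assumptions here, so please check them against what your nodes actually export; the sample text is made up and is not real output from any cluster.

```python
def read_metrics(text: str, names: set) -> dict:
    """Sum the values of the named metrics in Prometheus text-format output.

    Assumes each sample line looks like `name value` or `name{labels} value`
    with no trailing timestamp.
    """
    found = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        key, _, value = line.rpartition(" ")
        metric = key.split("{", 1)[0]  # drop any label set
        if metric in names:
            found[metric] = found.get(metric, 0.0) + float(value)
    return found


# Made-up sample for one node; metric names are assumptions, not verified.
SAMPLE = """\
# HELP sql_conns Number of active sql connections
# TYPE sql_conns gauge
sql_conns 0
# TYPE sql_query_count counter
sql_query_count 0
"""

stats = read_metrics(SAMPLE, {"sql_conns", "sql_query_count"})
if stats.get("sql_conns", 0) == 0:
    print("no clients connected to this node, so zero latency is expected")
```

Running this against the text fetched from each pod in turn would give the same per-node breakdown as the dashboard: a node with no connections and no queries should also show no SQL latency.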