Large record sizes crash node (Kubernetes)

CockroachDB: 19.2
Kubernetes: 1.15.3 (running on Ubuntu 18.04.3 LTS)

I know that CockroachDB is not meant to handle large record sizes, and that storing files inside CockroachDB is frowned upon, but I think I have come across a possible bug. As I understand it, large record sizes should only impact performance, not bring down an entire node. What we have done is build a file store inside CockroachDB by splitting files into 100k chunks and storing each chunk in a record. Since most of the files we store are smaller than 100k, this seemed like a workable solution; only every now and then would we store something larger, say 5-12MB. There is a single table called file, and it looks like this:

CREATE TABLE file
(
    account_id INT8 NOT NULL,
    user_id INT8 NOT NULL,
    path STRING NOT NULL,
    "offset" INT8 NOT NULL,
    data BYTES NULL,
    size INT8 NOT NULL,
    CONSTRAINT "primary" PRIMARY KEY (account_id ASC, user_id ASC, path ASC, "offset" ASC),
    FAMILY "primary" (account_id, user_id, path, "offset", data, size)
)
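For completeness, reading a file back is just a matter of fetching the chunks in "offset" order and concatenating them, with a query along these lines (parameter placeholders illustrative):

SELECT data
FROM file
WHERE account_id = $1 AND user_id = $2 AND path = $3
ORDER BY "offset" ASC;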

We are using PHP 7's pg_* functions to push and pull data out of this table, and DBeaver to administrate it. The cluster is very active, with roughly 100 records flowing into other tables (NOT this one) every 10 seconds, and everything had been healthy and running well for months, until we started inserting data into this file table. Sometimes it works, and sometimes it doesn't. Sometimes it crashes while we are inserting data, and other times it crashes when we are reading data from the table. Sometimes it happens when we interact with it using PHP, and other times when we use DBeaver. We have also seen that increasing the chunk size, to say 500k, makes it crash sooner. With 100k chunks we can push and pull files a handful of times (about 10) before a crash. But lowering the chunk size to, say, 10k starts to make the whole solution less feasible, because storing a 12MB file in 10k chunks would spawn 1,200 records, which becomes clunky to manage.
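To make that concrete, here is a minimal sketch of the kind of chunked insert described above, assuming the pg_* extension; the connection string, file path, and IDs are illustrative, not our exact code:

<?php
// Minimal sketch: split a file into 100k chunks and insert one row per
// chunk, keyed by its byte offset within the file.
$conn = pg_connect("host=127.0.0.1 port=26257 dbname=app user=app")
    or die("connect failed");

$chunkSize = 100 * 1024;              // 100k chunks
$accountId = 1;                       // illustrative IDs
$userId    = 1;
$path      = '/docs/report.pdf';      // logical path, part of the primary key
$data      = file_get_contents('/tmp/report.pdf');
$size      = strlen($data);

for ($offset = 0; $offset < $size; $offset += $chunkSize) {
    $chunk = substr($data, $offset, $chunkSize);
    pg_query_params(
        $conn,
        'INSERT INTO file (account_id, user_id, path, "offset", data, size)
         VALUES ($1, $2, $3, $4, $5, $6)',
        array($accountId, $userId, $path, $offset, pg_escape_bytea($conn, $chunk), $size)
    ) or die(pg_last_error($conn));
}
?>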

When it does crash, the node goes offline completely, but it seems to ONLY crash the node that we are connecting to (via Kubernetes' external IP). So I am not 100% sure whether this is a Kubernetes bug or a CockroachDB bug. It looks as if the network simply goes away and that node is no longer able to communicate with the rest of the cluster. The only recovery after that is to restart the entire server that node lives on. And there is still about 80% of RAM free before, during, and after the crash (±6GB free). Here are some logs:

Everything is still all fine and dandy over here:

[screenshot]

Then this happens:

[screenshot]

And then this:

[screenshot]

Since the node crashes completely, I am not able to pull log dumps via Kubernetes, which is why I could only capture screenshots.

Any idea what could be causing this, or what other logs I could look at to track down the problem?

Hey @steven,

Looks like you can get the logs from the previous pod by running the commands from this doc.

If you could grab those logs and also run cockroach debug zip, you can attach both the logs and the debug zip here.
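For reference, that usually amounts to something like the following; the pod name and host are illustrative, and --insecure assumes a non-TLS cluster:

kubectl logs cockroachdb-0 --previous > cockroachdb-0-previous.log
cockroach debug zip ./cockroach-debug.zip --insecure --host=<node-ip>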

Thanks,

Ron

Thank you, Ron!

I have sent you the logs and debug zip files. In the meantime I have also tried the following, but the problem persists:

- upgrading to the latest kubectl, kubeadm, and kubelet versions
- upgrading to CockroachDB 19.2.1

I have discovered something very strange. There are a total of 6 nodes in the Kubernetes cluster, and 5 of them are serving port 26257. If I point the PHP code that inserts the records at node 4, it works perfectly, but when I connect to node 5 it crashes after 1 or 2 tries. I have tested this repeatedly and the pattern is consistent. All nodes have identical hardware and software setups. Also note that only writing records seems to trigger the issue; if I read records through node 4's or node 5's IP address, everything works perfectly.

I have also noticed that if I delete the kube-router pod on the same node as the CockroachDB pod (Kubernetes re-initializes kube-router automatically), the CockroachDB pod recovers after a minute or two. Other than that, the only way to recover after the crash is to restart the entire node.
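For reference, that recovery boils down to deleting the node's kube-router pod and letting its DaemonSet recreate it, something like the following; the pod name is hypothetical, and kube-router is assumed to run in the kube-system namespace:

kubectl -n kube-system get pods -o wide | grep kube-router   # find the pod on the affected node
kubectl -n kube-system delete pod kube-router-x7k2p          # the DaemonSet re-creates it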

Attached is a kubectl describe pod dump of the CockroachDB pod after the crash.

Apologies that it's a PNG. I was not allowed to paste the output here because link spamming was detected, and attaching text files is also not allowed.

OK! I seem to have resolved the issue, finally, after almost a week! It was in fact a Kubernetes problem and NOT a CockroachDB problem. But for some odd reason it only affected CockroachDB and not any other services, even when the other services were put under heavy load. Even inserting 40 million records per day wasn't enough load to trigger the issue; only larger records, 50kb and bigger, were able to trigger the crash.

It turns out that kubelet (the agent that runs on each node and manages it for Kubernetes) had auto-updated to the latest version, 1.16.3, while the API / control-plane server did not auto-update (since you have to issue the kubeadm upgrade apply command yourself). I had assumed kubelet would be backwards compatible with an older API / control-plane server, but apparently that isn't the case; in fact, Kubernetes' version-skew policy does not support a kubelet that is newer than the API server. It would work in general, but seems to crash under heavy network load. Most likely some network / iptables issue, I would presume.
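For anyone who hits the same skew, the fix was essentially to bring the control plane up to the kubelet's version. On a Debian/Ubuntu install that looks roughly like this (versions and package pins illustrative):

apt-get update && apt-get install -y kubeadm=1.16.3-00   # upgrade kubeadm first
kubeadm upgrade plan                                     # check what can be upgraded
kubeadm upgrade apply v1.16.3                            # upgrade the control plane
apt-get install -y kubelet=1.16.3-00 kubectl=1.16.3-00   # then on each node
systemctl restart kubelet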

But now that everything is running 1.16.3, it's all healthy and working as expected!

Hey @steven,

Glad you were able to find a resolution!
