DELETE statement hangs indefinitely with automatic retries

Hi All,

We are exploring CockroachDB in a 3 node Kubernetes cluster. Each has 2 CPU, 3GB memory.

I’m very confused with DELETE statement performance and probably doing something wrong.

On a table of about 7 millions record it can take an hour to delete 25K rows (attempt on the screenshot below). Biggest try was 5 hours running statement after which I just restarted the cluster.

What seems off to me is why one statement on idle cluster can’t manage its execution without locks and retries. And why there’s no limit on number of automatic retries.

I have a trace to share
This time it took 236840.902ms to delete 10K rows. In the trace you can see a lot of locks and retries.

Is this normal behavior for CockroachDB? Anything I’m doing wrong?

Kind Regards.

An update here.

To prevent CRDB from indefinite retries I had to use explicit transactions. That way I get an error after first failed attempt.

The reason for the failure seems to be something else, internal, writing to the same table. My statement was the only one from external clients on idle cluster. I’m not sure how to confirm but it may be range split/merge operations that won’t let me update any significant number of records at a time.

After exploring this more I learned that on my setup I’m unable to update/delete more than a few thousand records at the time when something else is writing to the table. When I make sure I’m the only one writing the table I can basically update as much as CRDB is able to process in 5 minutes which is about 2 millions rows for me.

I guess my updated question is the same. This kind of write performance and sensitivity to concurrent updates is normal for Cockroach?

Hi Max
I am sorry to hear about your issue. Delete performance on most relational databases in my experience is not great when you start to delete large amounts of data. Best practice therefore is to do the following

  1. Create a table as select what you dont want to delete
  2. Drop the original table
  3. Rename the temp table created in step one to be the original table.

Be careful with this, cockroach looses column definitions, constraints and indexes in this process so make sure you have tested this fully on a dev environment and have created the script to fully recreate the original table.
We are continuing to work on improving the performance of delete at cockroach but I would recommend this process on any relational database that I have worked on in the past 20+ years to help execute a large delete…

I hope this helps