Detecting cause of serialization error


(Ivan Dubrov) #1

I’m getting serialization errors while running transactions in parallel (pretty big ones). Is there a way to get some insight into which ranges are getting contention?


(Ron Arévalo) #2

Hi @idubrov,

We will be adding visualization for this down the road, but for now we have a script that you can run; it can be found here.

You’ll want to do the following:

  1. Start your test workload.
  2. Pull the /_status/raft endpoint and save it to a file.
  3. Run the hottest_ranges.py script against that file (see the sketch below).
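If it helps, here’s a rough sketch of steps 2 and 3, assuming an insecure single-node cluster with the admin HTTP endpoint on the default localhost:8080; the exact hottest_ranges.py invocation shown in the comment is an assumption, so check the script’s usage:

```python
# Rough sketch: fetch the raft status from a local node and save it to a file
# for hottest_ranges.py. Assumes an insecure single-node cluster with the
# admin HTTP endpoint on the default localhost:8080; adjust the URL otherwise.
import urllib.request

RAFT_STATUS_URL = "http://localhost:8080/_status/raft"
OUTPUT_FILE = "raft_status.json"

with urllib.request.urlopen(RAFT_STATUS_URL) as resp:
    data = resp.read()

with open(OUTPUT_FILE, "wb") as f:
    f.write(data)

print(f"Saved {len(data)} bytes to {OUTPUT_FILE}")

# Then run the script against the saved file, for example:
#   python hottest_ranges.py raft_status.json
# (the exact arguments may differ; see the script's usage)
```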

Thanks,

Ron


(Ivan Dubrov) #3

I see the “hot” range, which spans /Table/55 to /Table/56. I don’t know how to map that table ID to the table name, but I think I know which table it is (we only write to 3 tables :slight_smile: ).
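(For reference, one way to check that mapping — a rough sketch, assuming crdb_internal.tables exposes table_id and name columns and a local insecure node on the default SQL port 26257 — would be something like this:)

```python
# Rough sketch: map a table ID (the 55 in /Table/55) to a table name via
# crdb_internal. Assumes a local insecure node on the default SQL port 26257
# and that crdb_internal.tables exposes table_id and name columns.
import psycopg2

conn = psycopg2.connect(
    host="localhost", port=26257, user="root", dbname="defaultdb", sslmode="disable"
)
with conn.cursor() as cur:
    cur.execute(
        "SELECT database_name, name FROM crdb_internal.tables WHERE table_id = %s",
        (55,),
    )
    print(cur.fetchone())  # e.g. ('mydb', 'mytable')
conn.close()
```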

I should have clarified a bit more. Here is the problem I’m observing:

  1. I run a load script that submits some amount of data in parallel. As far as I can tell, this data should not create contention (it writes to different key ranges).
  2. However, I’m still seeing some serialization errors.
  3. I would be interested to know which key or key range caused these errors.
  4. We execute a lot of statements per transaction; these transactions are somewhat heavy.

So it’s not a high-load scenario, but rather a low-load scenario with unexpected conflicts and large transactions.

This is CRDB running locally, one node, not a real setup. It would be fine even to log every conflict or something like that. Is that possible?


(Ivan Dubrov) #4

Actually, forget about that. User error. The transactions were rolled back for another reason :smiley:

P.S. I would still be interested to know if there is a way to pinpoint conflicts.