Transaction aborts

Hey, so I have an application design whereby, for specific partitions of data, I only want one process to be able to write to the database. To enforce this I have created a locking table:

CREATE TABLE partition_lock (
    partition_id INT8 PRIMARY KEY,
    last_command INT8 NOT NULL
);

The situation is that I want to use transaction locking to ensure that only a single process updates this table:

  1. A process P1 starts and updates the last_command value to 1 for partition_id=1.
  2. P1 then starts the next command and, as part of its transaction, again updates the last_command value to 2 for partition_id=1; however, it gets stuck or takes a long time for some reason.
  3. A new process P2 is created and begins processing command 2, so it also tries to update the last_command value to 2 for partition_id=1 (see the sketch after this list).
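
A minimal sketch of the two competing sessions, assuming both transactions start at the default priority:

-- P1: transaction for command 2, left open because the work stalls.
BEGIN;
UPDATE partition_lock SET last_command = 2 WHERE partition_id = 1;
-- ... P1 is stuck here, its write intent on the row still pending ...

-- P2: concurrently, in its own session, attempts the same update.
BEGIN;
UPDATE partition_lock SET last_command = 2 WHERE partition_id = 1;
-- P2 blocks on P1's write intent until P1 commits, rolls back, or is pushed.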

So we now have a competing update transaction. I would like P1 to terminate. To arrange this, the first command a newly activated process runs is done with a transaction priority of HIGH, so P1 gets a transaction abort exception, at which point P1 self-terminates. This is the behaviour I am expecting and am happy with.
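
For illustration, P2’s first command might look something like this (BEGIN TRANSACTION PRIORITY HIGH is how CockroachDB starts a high-priority transaction):

-- P2's first command transaction runs at high priority.
BEGIN TRANSACTION PRIORITY HIGH;
UPDATE partition_lock SET last_command = 2 WHERE partition_id = 1;
COMMIT;
-- Because P2's priority is higher, it can push aside P1's open
-- normal-priority transaction; P1 then receives a retryable
-- transaction-aborted error on its next statement or at COMMIT.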

However, in the situation where P1 gets stuck performing its initial, high-priority command transaction, and then P2 also runs a high-priority command transaction, the transactions just seem to sit there, seemingly forever.

I would like to know what I can do to ensure that when two transactions are competing at the same priority, one of them is aborted.

Thanks

Hi Mark,

If I understand correctly, you’d like to use CockroachDB as a distributed lock service for your job processing system. An invariant you’d like to uphold is that only one process ever successfully writes the result of a job to the database. You’d also like it to be possible for a new worker to take over running a job that has already been started by another worker. In this case, it’d be ideal if the original worker could (if it were still alive) detect this preemption and exit.

If I understand correctly, the approach you’re considering to achieve this goal has each worker hold a transaction open for the duration of the work: at the very beginning it reads the lock table and then updates it. After the work completes, the worker writes the outcome to other tables and then commits the transaction. The commit may fail if another transaction (from a new worker) has updated the lock table in the meantime. It sounds from your question like you’d also like it to be the case that if a later transaction is able to push the earlier one (say, because it has a higher priority), then the original transaction is notified that it has been aborted.
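
In SQL terms, a sketch of that approach (the job_result table here is hypothetical, just to show where the outcome would be written):

BEGIN;
SELECT last_command FROM partition_lock WHERE partition_id = 1; -- read the lock
UPDATE partition_lock SET last_command = 2 WHERE partition_id = 1;
-- ... the worker performs the actual job while this transaction stays open ...
INSERT INTO job_result (partition_id, command, output) VALUES (1, 2, 'done');
COMMIT; -- fails if another worker's transaction touched the lock row meanwhile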

There are certainly cases where the earlier transaction will be notified of its failure immediately, but it’s a tenuous mechanism to rely on directly. Furthermore, holding long-running transactions open can cause other problems.

Fortunately, I think your goals can be achieved using separate transactions and the new EXPERIMENTAL CHANGEFEED FOR feature in 19.1, which lets you watch for changes on a row. Imagine code that, when it begins, writes to the lock table and commits immediately. It then watches that table for somebody else removing its lock, which is its signal to cancel its work. Later, whenever it goes to write anything to the database, it reads the lock table in the same transaction to ensure that it still holds the lock.
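
Very roughly, and assuming a hypothetical worker_id column added to the lock table so a worker can tell whose lock it is:

-- 1. Acquire the lock in a short transaction of its own, then commit.
BEGIN;
UPDATE partition_lock SET worker_id = 'p2', last_command = 2 WHERE partition_id = 1;
COMMIT;

-- 2. In a separate session, watch the lock table for changes (19.1 core changefeed).
EXPERIMENTAL CHANGEFEED FOR partition_lock;
-- If a change arrives showing a different worker_id, cancel the in-flight work.

-- 3. Whenever writing a result, verify lock ownership in the same transaction.
BEGIN;
SELECT worker_id FROM partition_lock WHERE partition_id = 1;
-- the application issues ROLLBACK here unless worker_id is still 'p2'
INSERT INTO job_result (partition_id, command, output) VALUES (1, 2, 'done');
COMMIT;

The key difference is that the lock lives as committed state rather than as an open transaction, so no transaction needs to stay open for the duration of the work.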

Do I understand your use case correctly?

Best,
Andrew