Updating an arbitrary row matching a condition

I was reading through Quick and Easy, Exactly-Once, Distributed Work Queues Using Serializable Transactions - DEV Community (excellent article, btw) because it was the closest thing I’d been able to find to my desired use-case. What I want is a bit different though, so I figured I’d ask if there was a different best-practice approach.

I have a table where each row corresponds to an item that can be acquired by an owner. Something like this:

id INT,
flavor STRING,
owner INT,

My workers need to be able to acquire an unowned item of a particular flavor and a NULL owner. It does this by writing in an owner id into the owner column. This is pretty easy to do with something like this:

UPDATE items SET owner = X WHERE owner IS NULL and flavor = Y LIMIT 1 RETURNING id;

An issue here is that it seems like this update ends up locking the entire range of eligible items not just the one that’s actually written to. I know because the order matters to the db here (even if it doesn’t to me) it makes sense that it wouldn’t want anything else inserting something before the row it’s going to return, although I am a bit surprised that it ends up locking everything after too.

The other issue is that workers doing updates in parallel are going to all be contending for the first eligible row, whereas I really don’t care which they pick. This is what makes it different from the message queue example, where getting the first one does matter. From looking around it seems like some other dbs have a SKIP LOCKED which would address this, but isn’t implemented in crdb right now.

Is there a best-practice way to approach this?

Is the idea that you’d like to not commit that UPDATE until you do more work on the same transaction? That is definitely not well supported today in cockroach. If you want to just go ahead and commit that claiming transaction, that seems like it can be ~fine. It won’t be particularly high throughput, but it’ll be something. If the IDs are uniformly distributed, one option could be to have the claiming transactions start at a random place. If there’s a lot to claim, this should allow you to avoid some contention.

    id UUID,
    flavor STRING,
    owner INT,
    PRIMARY KEY (flavor, id) -- important for avoiding contention between flavors
  FROM (
         SELECT * FROM items WHERE id > $1
         UNION ALL SELECT * FROM items WHERE id <= $1
 WHERE owner IS NULL AND flavor = $2

I’ll admit it’s not optimal in terms of throughput, but, then again, if throughput matters, then just locking individual rows at all is probably not the most ideal thing.

At the end of the day, you are getting at something rather deep. Building queuing on top of the database is super valuable and also super tricky to get right. We’ve been thinking increasingly that this is an area where we need to provide opinionated frameworks to make this simple. If you can provide insights you can provide on your use case in terms of scales, latency, topologies of producers, consumers and database, language of client implementation, size of messages, processing duration etc, that would be super helpful to help us guide our thinking. If you feel comfortable, sharing more about use cases at a higher level will also be helpful.

1 Like

Interesting. That query will certainly shuffle up what the first row is, but does it actually help with the locking situation? Maybe I don’t understand the impact of the subquery here.

In this situation there is no actual work - I was only comparing the work/message queue example because it was the closest thing I found. It’s merely recording ownership of items from a pool of resources, so doing the update is all the work that is required. The issue is that we are worried about optimizing performance here, hence my probable preoccupation on locks.

Let me do the best I can to answer your questions. The load profile here is that usually items are acquired rather slowly, but then sometimes a bunch of owners want to switch to a different “flavor”, letting their old ownerships expire and grabbing new ones. To be clear the “flavor” of an item never changes and the set of items is fixed, the owner just releases their current item and grabs another that matches their new preference. This makes the traffic very spiky. Even still, I’d call good performance here something like 100qps on the acquisitions out of O(million) items, with roughly O(10k) items per flavor. We aren’t super latency sensitive, so something like O(100ms) would be fine.

We are multiregion over three regions, clients are implemented in C++, size of the row (is that what you mean by message size?) is very small, and as I mentioned above there is no processing time outside the database operation itself.

Was that helpful?