Concurrent transactional counter with in-place (blind) updates

sql

(kbespalov) #1

Hi!

Is it possible to avoid transactions lock invalidation in simple counting query like that?

UPDATE visitors SET count +=1 WHERE counter_id=a

(Raphael 'kena' Poss) #2

No it’s not possible. (The fact there is no “lock” in CockroachDB, just transaction retries due to contention, notwithstanding.)

However you can also operate using either of the following two techniques:

  1. “de-normalized” counter:

    • in the common case make an “increment” using an INSERT (not UPDATE) of a new row with value count=1, and “read” the value of a counter using sum(count) where counter_id=a.
    • occasionally (eg one per hour/day), delete all the rows and add a new row with val equal to the sum.
  2. “distributed” counter:

    • make your counter table with a UUID primary key(randomly pseudo generated) and a secondary index on (counter_id, uuid).
    • make multiple rows with the same counter_id but differnt uuids
    • “read” the counter value with select sum(count) where counter_id=a
    • “update” the counter by first selecting a uuid (select uuid where counter_id=a) randomly, then update set count +=1 where counter_id=a and uuid=b

    If you combine this with range splitting so that the same counter has its value spread across different ranges, this will spread the write hotspot and reduce txn contention.

Whether these solutions are valid depends a bit on the balance between reads and writes and the performance you’re expecting. If reads of a single counter are relatively frequent compared to writes, this will likely not work well. (But then if that was true, you wouldn’t observe txn contention as much, hence my suggestion.)

Note however that there’s no free lunch: if you’re expecting an accurate counter with serializable isolation, you’re going to pay the price of these writes “all going to the same place” somehow, regardless of which SQL database you’re using. The way other SQL databases “cheat” on performance for this use case is to give up serializable isolation and providing eventual consistency instead.

I hope this helps? Let us know!