We’re good, thanks!
CRDB does not implement the techniques described in that paper. However, we do achieve (at least to some degree) the objectives of that work, and the timestamps of our transactions are somewhat dynamic. Throughout its lifetime, a transaction maintains a read timestamp and a write timestamp. The read ts is the timestamp at which the transaction reads data; the write ts is the timestamp at which it writes. If the txn ultimately commits, it commits at the write timestamp - meaning that the MVCC records it persists are marked with the write timestamp, and so that is the timestamp that dictates the transaction’s serial ordering with respect to other transactions. SERIALIZABLE isolation (which CRDB implements) effectively dictates that a transaction can only commit if its read timestamp is equal to its write timestamp. So, although we allow the two timestamps to temporarily diverge, a txn is only allowed to commit if we can collapse them back together. Here’s how it works:
The two timestamps start out equal. When a txn runs into a Read-Write conflict (i.e. it tries to write at timestamp t1 a key that has already been read at timestamp t2 >= t1), the write timestamp advances past t2 (and the read timestamp stays as it was). The same happens on a Write-Write conflict (i.e. when the txn tries to write below a more recent write): the write timestamp advances past the timestamp of that existing write.
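To make the pushing rule concrete, here is a minimal sketch in Python. It is a toy model, not CRDB's code: `ToyStore`, `last_read` and `last_write` are hypothetical names, timestamps are plain integers, "advancing past t2" is modeled as `t2 + 1`, and the model does not distinguish a txn's own reads from other txns' reads.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    read_ts: int
    write_ts: int


class ToyStore:
    """Hypothetical per-key bookkeeping; not CRDB's actual data structures."""

    def __init__(self):
        self.last_read = {}   # key -> highest timestamp at which the key was read
        self.last_write = {}  # key -> timestamp of the most recent write

    def read(self, txn, key):
        # Reads happen at the read timestamp; remember the highest one seen.
        self.last_read[key] = max(self.last_read.get(key, 0), txn.read_ts)

    def write(self, txn, key):
        # Read-Write conflict: the key was already read at t2 >= our write ts.
        # Write-Write conflict: the key was already written at t2 >= our write ts.
        # Either way, push the write ts past t2; the read ts is left untouched.
        floor = max(self.last_read.get(key, 0), self.last_write.get(key, 0))
        if txn.write_ts <= floor:
            txn.write_ts = floor + 1
        self.last_write[key] = txn.write_ts


store = ToyStore()
reader = Txn(read_ts=200, write_ts=200)
writer = Txn(read_ts=100, write_ts=100)
store.read(reader, "k")   # "k" has now been read @200
store.write(writer, "k")  # writing @100 conflicts with that read
print(writer)             # Txn(read_ts=100, write_ts=201)
```

After the write, `writer` still reads @100 but writes @201, which is exactly the diverged state discussed next.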
Quick aside: what does it really mean for a txn to have read timestamp 100 and write timestamp 200? Well, it means that it’s operating on a certain snapshot of the data (dictated by timestamp 100), but writing on a “different snapshot”, leaving a gap between the two in which other transactions may have written. If such a txn were allowed to commit as is, it would allow an anomaly called “write skew” - imagine two transactions reading @100, the first one writing @150 and the second one writing @200. The second one is producing its changes without taking into account the first one’s changes, and that can be a violation of serializability. Imagine a hospital shift scheduling system where a doctor can take a day of vacation if there is at least one other doctor that is not on vacation on the same day. Now, if you allow write skew, then two doctors trying to schedule their vacation can both read the same calendar, see that the other one is not on vacation, and both proceed to schedule their vacation. That’s no good.
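The doctors scenario can be replayed with a small sketch. This is an illustrative toy multi-version store in Python, not CRDB's implementation; `MVCCStore`, `on_call` and `take_vacation` are hypothetical names, and the store deliberately performs no check at commit time so that the anomaly shows up.

```python
class MVCCStore:
    """Toy multi-version store: each key keeps a list of (timestamp, value)."""

    def __init__(self):
        self.versions = {}

    def put(self, key, ts, value):
        self.versions.setdefault(key, []).append((ts, value))
        self.versions[key].sort()

    def get(self, key, ts):
        # Return the newest value written at or below ts (a snapshot read).
        value = None
        for version_ts, version_value in self.versions.get(key, []):
            if version_ts <= ts:
                value = version_value
        return value


def on_call(store, ts):
    return [d for d in ("alice", "bob") if store.get(d, ts) == "on_call"]


def take_vacation(store, doctor, read_ts, write_ts):
    # Read the calendar at read_ts, but write at write_ts with no further check.
    others = [d for d in on_call(store, read_ts) if d != doctor]
    if not others:
        return False
    store.put(doctor, write_ts, "vacation")
    return True


store = MVCCStore()
store.put("alice", 50, "on_call")
store.put("bob", 50, "on_call")

# Both txns read @100; the first writes @150, the second @200.
take_vacation(store, "alice", read_ts=100, write_ts=150)
take_vacation(store, "bob", read_ts=100, write_ts=200)

print(on_call(store, 200))  # [] : nobody is left on call
```

Each transaction saw the other doctor on call in its @100 snapshot, so both requests were approved and the invariant is broken.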
On various occasions (at the very least, just before committing), CRDB tries to reconcile the read timestamp and the write timestamp by checking if the read ts can be moved forward to the write ts. The read ts can be moved from 100 to 200 if everything the txn read is identical @100 and @200. This is implemented by a) keeping track of all the key spans that a txn has read and b) checking if there has been any intervening write on those key spans with a ts between the read ts and the write ts. We call this verification “refreshing the read spans”. If it is successful, the read timestamp is ratcheted up to the write timestamp, and the txn is allowed to commit. If it is not successful, the transaction needs to restart and perform all its reads (and generally all its logic) again.
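As a rough sketch of that check, the Python below refreshes a txn's reads before commit. It is a simplification under stated assumptions, not CRDB's code: `can_refresh` and `try_commit` are hypothetical names, individual keys stand in for key spans, `committed_writes` only holds other transactions' writes, and the interval is treated as `read_ts < ts <= write_ts`.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    read_ts: int
    write_ts: int
    read_keys: set = field(default_factory=set)  # stand-in for read spans


def can_refresh(committed_writes, txn):
    # The refresh fails if any key the txn read was written by someone else
    # at a timestamp between the txn's read ts and its write ts.
    for key in txn.read_keys:
        for ts in committed_writes.get(key, []):
            if txn.read_ts < ts <= txn.write_ts:
                return False
    return True


def try_commit(committed_writes, txn):
    if txn.read_ts != txn.write_ts:
        if not can_refresh(committed_writes, txn):
            return "restart"
        txn.read_ts = txn.write_ts  # ratchet the read ts up to the write ts
    return "commit"


# The two doctors again: both read the whole calendar @100.
committed_writes = {"alice": [50], "bob": [50]}

first = Txn(read_ts=100, write_ts=150, read_keys={"alice", "bob"})
print(try_commit(committed_writes, first))   # commit
committed_writes["alice"].append(150)        # first txn's write is now visible

second = Txn(read_ts=100, write_ts=200, read_keys={"alice", "bob"})
print(try_commit(committed_writes, second))  # restart
```

The second transaction read "alice" @100, and that key changed @150, so its reads cannot be moved forward to 200 and it has to restart. On the retry it would see that alice is already on vacation.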
Does that make sense? For more details, you can check our recent SIGMOD paper.