Read-only transactions, consistency, and rollback

Suppose you want a transaction with these properties:

  1. The transaction allows changes, and within the transaction you read back any changes you have made.
  2. The transaction is read-only, in the sense that you can make changes but any changes made during the transaction are discarded when the transaction ends.
  3. The transaction presents an up-to-date, fully consistent view of the database.

The traditional way to get this in SQL is to create the transaction with the necessary isolation level (e.g. SERIALIZABLE), do whatever including possibly making changes, and then rollback the transaction instead of committing it.

You get the consistency of #3 pretty much automatically from lock-based databases, and from databases that are not clustered.

But is it possible to get such a transaction with CockroachDB?

The reason I ask is that in some distributed databases (e.g., FoundationDB was one), the data read from a transaction is not guaranteed to have been up-to-date (even in a read-only transaction) unless the transaction was committed and the commit() operation completed successfully.

But in SQL, I don’t know of any way to say “commit this transaction but discard all the changes”. This implies that either CockroachDB doesn’t support this kind of transaction, or else even a transaction that has gets rolled back guarantees up-to-date views.

Which is it?

Thanks.

In cockroachdb, transactions are assigned a timestamp. If you do a read, followed by a write, and then run a read again on the data just written, even if you haven’t yet committed the transaction, you will see the latest written data originating from the same transaction (that hasn’t yet committed).

This is a subtle issue. Officially, CockroachDB only guarantees that you will see consistent results if your transaction successfully commits, so you should not use results read in transactions that were abandoned.

Currently, data from an uncommitted transaction will be consistent in certain situations, although I’m not sure whether we want to document these as guarantees or if they’re just implementation details. Read-only transactions always see a consistent view even if they don’t commit (and this seems unlikely to change, although I suppose it’s possible we make some changes in the future to clock uncertainty-based restarts that would affect this). Read-write transactions like the ones you’re talking about, though, are trickier. We go to some lengths to avoid exposing inconsistent views of the data in this case (see https://github.com/cockroachdb/cockroach/issues/2231 and the issues linked to it), but I’m unsure if that is comprehensive enough that you could rely on it.

What are you really trying to do? Staging writes that you do not intend to commit is expensive and can limit your concurrency. There may be a better way to achieve your goals here. It would be architecturally possible for us to add some sort of “can this transaction commit?” command that you could use to validate the transaction before rolling it back, but this seems like a pretty esoteric need.

I’m only asking so I know what to expect. For my application, adding a shim layer that captures all mutations (allowing the underlying cockroach transaction to be truly read-only, and always committed) is actually easy, so it’s not a problem for me if I need to do that. I just want to know whether it’s necessary before doing it.

So I’ll take your answer as “Yes, you need to do that for now until if/when we make it our official policy to guarantee consistency of data read from non-committed transactions” (this is actually the outcome I expected).

In any case, I think it is important to point out this this subtlety on the web site, because it probably differs from what an average SQL developer expects.

Thanks.

1 Like