Transactions don't seem to be fully isolated

I was playing around with isolation levels, in particular with regards to write skew anomalies, and ran into some odd interference between transactions. It took me a while to get the exact reproduction steps down, but this now semi-reliably works on beta-20161013. Is anyone able to explain what’s going on here?

I was working with a code version of the write skew anomaly example from the CMU class here: https://www.youtube.com/watch?v=-7VjPVmw4vM&t=16m23s

Data Setup:

root@:26257> CREATE TABLE colors (id SERIAL PRIMARY KEY, color STRING);
CREATE TABLE
root@:26257> INSERT INTO colors (color) VALUES ('white'), ('white'), ('black'), ('black');
INSERT 4
root@:26257> SELECT * FROM colors;
+--------------------+-------+
|         id         | color |
+--------------------+-------+
| 186857290596581377 | white |
| 186857290596646913 | white |
| 186857290596679681 | black |
| 186857290596712449 | black |
+--------------------+-------+
(4 rows)

Operation ordering to produce the anomaly follows. Note that this doesn’t happen for me 100% of the time. Other times, one of the two transactions is aborted, as I’d expect, but this happens with moderate frequency:

Terminal 1:

root@:26257> BEGIN TRANSACTION;
BEGIN
root@:26257> SAVEPOINT cockroach_restart;
OK
root@:26257> UPDATE colors SET color = 'black' WHERE color = 'white';
UPDATE 2

Terminal 2:

root@:26257> BEGIN TRANSACTION;
BEGIN
root@:26257> SAVEPOINT cockroach_restart;
OK
root@:26257> UPDATE colors SET color = 'white' WHERE color = 'black';

This call hangs for me, and doesn’t proceed until I commit the other transaction. From what I understand about the isolation of transactions in cockroach, everything inside terminal 2’s transaction should be existing inside of its own snapshot, and should not be affected by any other transactions, except for potential conflicts on COMMIT, is that right?

Continuing:

Terminal 1:

root@:26257> COMMIT;
COMMIT
root@:26257> SELECT * FROM colors;
+--------------------+-------+
|         id         | color |
+--------------------+-------+
| 186865464373280769 | black |
| 186865464373346305 | black |
| 186865464373379073 | black |
| 186865464373411841 | black |
+--------------------+-------+
(4 rows)

At this point, the stuck UPDATE statement in terminal 2 completes, and produces the output “UPDATE 4”. Again, according to snapshot isolation, these two transactions should exist in their own worlds, and not be able to influence each other, unless I am misunderstanding something.

Terminal 2:

UPDATE 4
root@:26257> COMMIT;
COMMIT
root@:26257> SELECT * FROM colors;
+--------------------+-------+
|         id         | color |
+--------------------+-------+
| 186865464373280769 | white |
| 186865464373346305 | white |
| 186865464373379073 | white |
| 186865464373411841 | white |
+--------------------+-------+
(4 rows)

As I mentioned above, this behavior isn’t reproducible 100% of the time. But if you re-run the BEGIN, SAVEPOINT, and UPDATE commands together in one terminal and then a second, it should reproduce at some point. If not I just rollback the transaction and try again.

Hi, and thank you for using CockroachDB!

Can you explain what you expected to happen and how it differs from what happened? Everything you posted is expected under “full” SERIALIZABLE isolation, which is the default mode. Since you mentioned that the behavior is not reproducible 100% of the time, I’m also wondering what results you got the other times.

First question: “This call hangs for me…”. Under serializability, if this transaction was started after the other one, we need to read the results of the first transaction as part of the UPDATE. So the transaction needs to wait for the other one.

The final results look good to me. The point is that after these transactions, under full serializability, the result can only be “all white” or “all black”. The half-white-half-black result that Andy talks about in the lecture is an “anomaly”, something that cannot happen if we guarantee serializability.

If you want to produce the anomaly, you need to use the snapshot isolation mode; see info here on how to set it (note that it needs to be set on both transactions): https://www.cockroachlabs.com/docs/transactions.html

Hope this helps.

Hey Radu, Thanks for getting back to me!

If you want to produce the anomaly, you need to use the snapshot isolation mode

Sorry, I should have been more clear, I had already done that, and was exploring how the same scenario behaves at the SERIALIZABLE isolation level.

I think part of my confusion is I thought that Cockroach still used snapshot isolation at the SERIALIZABLE level (all transactions operate on a consistent snapshot of the database at the time it’s started), just with the additional constraint that every operation be able to be serially ordered. Tho reading the docs more closely this doesn’t appear to be the case.

Since you mentioned that the behavior is not reproducible 100% of the time, I’m also wondering what results you got the other times.

Maybe the other results are the incorrect ones, even tho they were more what I expected, or maybe both are expected and I’m missing something about how transactions interact. In other runs (and just reproduced locally to confirm), the second UPDATE operation to run doesn’t hang on the first transaction to commit. Both produce the output UPDATE 2, and proceed along. However, when the transactions are committed, one completes fine, and the other gets a restart error:

root@:26257> COMMIT TRANSACTION;
pq: restart transaction: retry txn "sql/executor.go:544 sql txn" id=17ad18e0 key=/Table/84/1/186870620817653761/0 rw=true pri=0.04268927 iso=SERIALIZABLE stat=PENDING epo=1 ts=1477153544.498321406,1 orig=1477153544.498321406,1 max=1477153517.935023336,0 wto=false rop=false

This made more sense to me, as it seemed odd that a transaction could cause another to hang indefinitely, but I guess this needs to be the case to support long-running transactions to be able to complete, is that right?

Indeed, there is no anomaly here. The non-determinism you’re seeing has to do with how transaction conflicts are handled in Cockroach; rather than last-in-wins, there is some randomness involved.

Consider the case which behaves as you expect:
txn1: set black: OK
txn2: set white: OK
txn2: commit: OK
txn1: commit: restart transaction …

What happened here is that txn2 aborted txn1, and all is well (and expected). In the case which doesn’t behave as you expect, the following is observed:
txn1: set black: OK
txn2: set white: "hang"
txn1: commit: OK
txn2: "unhangs"
txn2: commit: OK

What really happened here is that when you sent set white in txn2, it ran into txn1’s intents, and started trying to abort them; because you are in a SQL shell, these attempts to abort block you, but under the covers, the system is still polling and attempting to abort txn1 (it’s just losing dice rolls and is subject to backoff, so it may take a long time to succeed, but it will eventually succeed). Once txn1 commits, the intents are gone, and the next attempt by txn2 to abort txn1 simply proceeds and writes its intents, and then commits.

Does that answer your question?

Hey there Tamir,

That does clear it up, thanks :slight_smile:

One more follow up question - if the write intents are written for each key/value entry, at what point would the second transaction be hitting the intents of the first? These operations are working on non-overlapping sets of rows, so it doesn’t seem that either should run into the intents of the other. My first guess was that it would be seeing the intents as it does a table scan to evaluate the WHERE clause, but I also tried adding an index on color, such that rows can be accessed without necessitating a full table scan, and it still recognizes the conflict and aborts. An explain shows that this is working as expected:

+-------+--------+----------------------------------------+
| Level |  Type  |              Description               |
+-------+--------+----------------------------------------+
|     0 | update |                                        |
|     1 | scan   | colors@color_idx /"white"-/"white\x00" |
+-------+--------+----------------------------------------+

Having the conflict be noticed and a transaction aborted is expected and awesome, I’m just curious about which write intents are actually being read in this case, where neither transaction should be writing to, or even reading from the data changed by the other.

Even in the presence of an index, it is not possible for the transactions
to entirely avoid overlap; remember that both transactions mutate the
index, too, and so intents are bound to be encountered on the index itself.

1 Like

In addition to write intents, we also keep track of when other transactions read keys and ranges of keys. In this example (with the index), the first transaction has to scan the range of keys that start with the prefix .../white. We would remember that this transaction tried to read this range at a certain timestamp. When the second transaction tries to add new keys inside that range, it detects the other transaction.

If we didn’t do this, we wouldn’t guarantee serializability: the anomaly you are looking at would be possible.

1 Like

Got it. Thanks for clearing that up!