I am trying to understand whether there are situations in which CockroachDB may block when nodes fail at an unfortunate moment during the commit of a multi-range transaction. I would appreciate your help in understanding CockroachDB's behavior in the following scenario:
- There are two ranges in the system, 1 and 2, each with replication factor 3
- There are 7 nodes: 3 for each range plus a transaction gateway node (N1_1, N1_2, N1_3, N2_1, N2_2, N2_3, N_TX)
- A transaction is started on the gateway node
- A “PENDING” tx record is written to the range 1 nodes, since the first updated key belongs to this range
- Some writes are performed to both ranges 1 and 2
- Finally, we lose both the gateway node and a majority of the range 1 nodes, e.g. N1_1, N1_2 and N_TX drop out of the topology.
At this point range 1 is unavailable, so we have no way to tell whether the tx was committed, and range 2 has some dangling write intents. Could you please explain whether these locks will be released? If they are released at some point, how is it guaranteed that the transaction is not in the COMMITTED state on the unavailable range 1, and that nobody was able to read the updated values from range 1 before it became unavailable?
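
To make the failure concrete, here is a minimal Python sketch of the topology as I picture it. It is a toy model, not CockroachDB code: the `has_quorum` helper and the dictionaries are names I made up for illustration, and the only assumption is that a range needs a strict majority of its replicas alive to serve requests.

```python
# Toy model of the scenario above; hypothetical names, not CockroachDB APIs.
REPLICAS = {
    "range1": {"N1_1", "N1_2", "N1_3"},
    "range2": {"N2_1", "N2_2", "N2_3"},
}

# Nodes lost in the last step of the scenario (gateway plus two range 1 replicas).
FAILED = {"N1_1", "N1_2", "N_TX"}


def has_quorum(replicas, failed):
    """Return True if a strict majority of the replicas is still alive."""
    live = replicas - failed
    return len(live) > len(replicas) // 2


for name, replicas in REPLICAS.items():
    state = "available" if has_quorum(replicas, FAILED) else "unavailable"
    print(name, state)
```

In this model range 1 is left with a single live replica (N1_3) and cannot reach quorum, while range 2 keeps all three replicas and stays available with its intents still in place.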