I’m looking at concurrent uses of a transaction with @nathan, and one thing isn’t clear to us: how does the storage layer support writes in a txn whose txn record has not yet been written? This happens, for example, because of the DistSender splitting batches and waiting for the first write batch (the one with the BeginTransaction request) to finish before sending the second one.
In https://github.com/cockroachdb/cockroach/issues/13684#issuecomment-281427763, @bdarnell seems to say that the current code is fine:
> The storage layer does (mostly) support transactional writes that occur before BeginTransaction has finished. These writes can occur naturally with the way that DistSender splits up batches (BeginTransaction is not guaranteed to be in the first batch). However, when this happens the transaction is vulnerable to being aborted since it automatically loses any pushes that occur before its transaction record has been written (see #9287), so we should try to ensure that BeginTransaction is at least concurrent with the first write (as long as it’s been proposed to raft before any subsequent writes complete, we know it will be safe from pushes that could be a consequence of those writes)
If I read this correctly, he’s saying that sending two batches in parallel is OK as long as you can guarantee that the first one (the one with BeginTransaction) is proposed to its Raft group before the second one is proposed to its group. But how do we, or could we, guarantee that?
The hazard I have in mind is:
- batches A and B are sent concurrently. A has BeginTxn.
- B is evaluated, proposed and applied. A is stalled.
- a read comes along, runs into B’s intents, attempts to push
- the push doesn’t find the txn record (A hasn’t even been evaluated yet, so it isn’t even in the command queue of its range), so the push succeeds and the intent is dropped
- A applies
- the client eventually commits, never having learned of the push, but B’s intent is gone, so not all of the transaction’s data is committed
The question is: what prevents this? Perhaps the AbortCache has something to do with it?