I am looking at the CockroachDB code to understand the end-to-end flow of a write request, especially at the Raft replica level. It seems that tryExecuteWriteBatch handles the core logic: it proposes the request to Raft and waits until the request is applied to the state machine. It also returns whether the write request is retryable or non-retryable. If that is correct, my questions are:
How do you handle the case where the write is committed in Raft (with quorum consensus) but fails to apply to the state machine? In that case, the write request may be abandoned, but the log entry is actually committed and will eventually be applied by Raft. If this is a transactional write, does that mean the transaction is aborted even though it is actually committed in Raft? I am wondering whether this is an issue for you, or whether you simply rely on the idempotency of the state machine.
Related to that, I wonder whether tryExecuteWriteBatch could return as soon as the write request is committed in Raft, without waiting for the apply to happen. For a transaction, we would mark it as committed in the same way.
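
To make the two questions concrete, here is a toy model of the flow as I understand it. This is only a sketch and not CockroachDB code: toyReplica, entry, writeWaitForApply, writeAckOnCommit, and the failApply flag are all hypothetical names I made up for illustration, and "commit" here just appends to an in-memory slice instead of going through a real quorum.

```go
package main

import (
	"errors"
	"fmt"
)

// entry is a hypothetical Raft log entry carrying a single write.
type entry struct {
	key, value string
}

// toyReplica is a hypothetical stand-in for a replica (not CockroachDB code).
type toyReplica struct {
	log       []entry           // entries treated as committed by a quorum
	applied   int               // number of log entries applied so far
	state     map[string]string // the state machine
	failApply bool              // simulates a transient local apply failure
}

var errApply = errors.New("apply failed")

// commit appends the entry to the committed log and returns its 1-based index.
func (r *toyReplica) commit(e entry) int {
	r.log = append(r.log, e)
	return len(r.log)
}

// applyUpTo applies committed entries in log order until index is reached.
func (r *toyReplica) applyUpTo(index int) error {
	for r.applied < index {
		if r.failApply {
			return errApply
		}
		e := r.log[r.applied]
		r.state[e.key] = e.value
		r.applied++
	}
	return nil
}

// writeWaitForApply models my reading of the current flow:
// the caller is answered only after the entry has been applied.
func (r *toyReplica) writeWaitForApply(e entry) error {
	idx := r.commit(e)
	return r.applyUpTo(idx)
}

// writeAckOnCommit models the alternative from my second question:
// the caller is answered as soon as the entry is committed.
func (r *toyReplica) writeAckOnCommit(e entry) int {
	return r.commit(e)
}

func main() {
	r := &toyReplica{state: map[string]string{}, failApply: true}

	// Question 1: the caller sees an error, yet the entry stays in the log.
	err := r.writeWaitForApply(entry{key: "k", value: "v"})
	fmt.Println("caller saw:", err, "committed:", len(r.log), "applied:", r.applied)

	// Later the apply succeeds, so the "failed" write takes effect anyway.
	r.failApply = false
	_ = r.applyUpTo(len(r.log))
	fmt.Println("state after late apply:", r.state)

	// Question 2: acknowledge at commit time; the state machine lags behind.
	idx := r.writeAckOnCommit(entry{key: "k2", value: "v2"})
	_, visible := r.state["k2"]
	fmt.Println("acked at index:", idx, "visible before apply:", visible)
}
```

The first part of main is the situation I am asking about in the first question: the caller is told the write failed, but the committed entry is still applied later. The last part is the acknowledge-on-commit variant from the second question, where the caller gets an answer before the write is visible in the state machine.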