Thank you very much for your report, and I apologize for the delayed response. It looks like you’ve uncovered an interesting bug in our handling of deadlocking transactions. This code changed significantly in the beta-20170309 release, but I think the crucial part of the bug was the same before and after the change.
The issue occurs when transactions perform writes before reads. Each transaction is able to lay down intents for its initial write, which means that it must either commit or be aborted before any other transaction will be able to read that value. The problem is that if the only source of conflicting queries are reads (the writes in this workload never conflict with each other), we never attempt to abort the transaction. The faulty code is here, where we decide whether or not to abort the conflicting transaction based on whether the current operation is a read or a write. Instead, we should abort for reads as well, if they are part of a writing transaction.
Prior to beta-20170309, we aborted transactions based on their priority, so a few threads would get through while others would get stuck in a backoff loop. beta-20170309 made the process deterministic rather than priority-based, so all of the threads would get stuck (and it also removed the backoff from the loop, so CPU usage on the server would go to 100%)
We’ll work on getting this fixed for the next beta release.