Transaction rollback to savepoint fails "ROLLBACK TO SAVEPOINT cockroach_restart"

Hi,

I’ve faced some concurrency issues during application migration from postgresql to cockroachdb. Migration is done from functional perspective as far as our tests work fine when run in one thread. Strange things start to happen when I run tests in 2+ threads.

I run our app and cockroach:v2.0.2 in docker swarm. Run it in single node mode.
Configuration:
./cockroach zone get .default --insecure
.default
range_min_bytes: 1048576
range_max_bytes: 67108864
gc:
ttlseconds: 300
num_replicas: 1
constraints: []
Service under the test is written in java and uses Hibernate 5.2.8.Final as ORM. In this service logic for transaction retrying is implemented. Actually problem in case of transaction retrying.

One of such exception description. During transaction commit conflict detected and exception thrown
2018-07-03 14:49:07,0006 b960d6e994d20bdd6a196e2ef3c5889a WARN org.hibernate.engine.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 40001
2018-07-03 14:49:07,0006 b960d6e994d20bdd6a196e2ef3c5889a ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - ERROR: restart transaction: HandledRetryableTxnError: TransactionAbortedError: txn aborted “sql txn” id=1355127f key=/Table/112/1/362279557243502593/“FILTER_SET”/0 rw=true pri=0.19076937 iso=SERIALIZABLE stat=PENDING epo=0 ts=1530629322.377481940,1 orig=1530629321.812578598,0 max=1530629321.812578598,0 wto=false rop=false seq=7080

Transaction handler (attached) is resolving conflict by reverting to created earlier savepoint. Revert to savepoint fails

2018-07-03 14:49:07,0041 b960d6e994d20bdd6a196e2ef3c5889a WARN org.hibernate.engine.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 25P02 
2018-07-03 14:49:07,0041 b960d6e994d20bdd6a196e2ef3c5889a ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - ERROR: Expected "ROLLBACK TO SAVEPOINT COCKROACH_RESTART": current transaction is aborted, commands ignored until end of transaction block 

2018-07-03 14:49:07,0045 bb28b0b47d2d4435dde872c19cbbaecf ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - ERROR: Expected "ROLLBACK TO SAVEPOINT COCKROACH_RESTART": current transaction is aborted, commands ignored until end of transaction block 
2018-07-03 14:49:07,0047 b960d6e994d20bdd6a196e2ef3c5889a ERROR com.aginity.amp.common.port.adapter.rest.exception.LoggingExceptionMapper - Error handling a request: 1265e2deb95af833 Exception: com.aginity.amp.catalogservice.application.transaction.TxnRestartException: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: could not execute statement
Caused by: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: could not execute statement
Caused by: org.hibernate.exception.GenericJDBCException: could not execute statement
Caused by: org.postgresql.util.PSQLException: ERROR: Expected "ROLLBACK TO SAVEPOINT COCKROACH_RESTART": current transaction is aborted, commands ignored until end of transaction block
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2455)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2155)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:288)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:430)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:356)
	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:168)
	at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:135)
		37 lines skipped for [org.hibernate]
	at com.aginity.amp.catalogservice.application.transaction.Txn.execInTx(Txn.java:103)
	at com.aginity.amp.catalogservice.application.transaction.Txn.execTx(Txn.java:41)
	at com.aginity.amp.catalogservice.application.transaction.TransactionWrapper.invoke(TransactionWrapper.java:26)
	at com.aginity.amp.catalogservice.port.adapter.rest.resources.asset.FrameResource.createFrame(FrameResource.java:149)

I thought it could be an connected to hibernate transaction manager so I added custom one as described in https://github.com/cockroachdb/hibernate-savepoint-fix. It changed nothing.

Almost same retrying logic worked fine for postgresql. Please help to make it work for cockroach?

Resources can be found here:


Txn.java - logic for transaction revert
cockroach_4_thread.log.zip - file containing logs for tests run in one thread + in four threads
cat_4_thread.log.zip - file containing logs for tests run in one thread + in four threads

There is an update here…
I’ve localized error and know that createFrame method leads to “rollback to savepoint” error.
createFrame checks if user has sufficient permissions and all required entities do exist (sql select to project, workspace, permission, tables). After preconditions check passed frame inserted into asset table.
If createFrame method executed 5 times consequentially everything goes fine. 5 more records added to asset table as a result.
If 5 createFrame calls executed in 4 threads concurrently from 2 to 5 of them succeed.
Each rest call has its own correlation id so I can see what happens during the call.
I put file create_frame.logs with last run results when 3 of 5 calls succeeded to git repo.
From logs I see that for request with correlation id ‘54dfed58b811d5cb772336b4b6dba33b’ there were two retries and request succeeded.
For request ‘d5edba0319b6a6d93c4d53d85a46ac9c’ there was one retry attempt after what error

2018-07-04 14:28:09,0885 d5edba0319b6a6d93c4d53d85a46ac9c ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - ERROR: restart transaction: HandledRetryableTxnError: TransactionAbortedError: txn aborted “sql txn” id=dd7720bf key=/Table/112/1/362558550268739585/“FILTER_SET”/0 rw=true pri=0.17730189 iso=SERIALIZABLE stat=PENDING epo=1 ts=1530714466.495402411,1 orig=1530714466.495402411,1 max=1530714466.994169487,0 wto=false rop=false seq=8007

was thrown by cockroachdb. My transaction handler gives up after this error.

Why could this error be thrown for some of calls? Any suggestions how to deal with this situation?

Paging @Bram for hibernate expertise. At the moment I believe it’s not fully supported and non-trivial. Unsure whether this specific issue is known or not.

Hi @vladimir_shulga,

We’re actually working on just this problem right now. What happens is this is a combination of the fact that our transactions work slightly differently than postgres’ and Hibernate doesn’t know how to handle this just yet and we return some different errors.

This example should show you how to approach it without Hibernate, and we’re actively working on getting it working properly with Hibernate.

We have a number of issues still to get Hibernate working, here’s the meta issue for it: https://github.com/cockroachdb/cockroach/issues/16491

Please feel free to post your issue directly as a bug in our github. External requests help us prioritize fixes and updates.

Hi @Bram,

Thank you for clarifying this question out!
I’ve created the issue https://github.com/cockroachdb/cockroach/issues/27274. Do you know how much could it take to have it fixed?