What's the probable cause of "Attempting to use transaction with wrong status or finalized: ABORTED"

Hi,

I encounter this issue during stress test, What’s the probable cause of “attempting to use transaction with wrong status or finalized: ABORTED”.

Here are some logs

2016-11-04 03:20:15	WARNING	[n1,s1,r123/1:/Table/14{2-3}] cmdQMu: mutex held by github.com/cockroachdb/cockroach/pkg/storage.(*Replica).beginCmds for 799.750511ms (>500ms):	storage/replica.go:546
2016-11-04 03:20:15	WARNING	[n1,s1,r123/1:/Table/14{2-3}] context deadline exceeded while in command queue: ResolveIntent [/Table/142/1/"1021209K30017874255"/"快运包干费"/0,/Min), ResolveIntent [/Table/142/1/"1021209K30017874255"/"快运包干费"/1/1,/Min), ResolveIntent [/Table/142/1/"1021209K30017874255"/"快运包干费"/3/1,/Min), ResolveIntent [/Table/142/1/"1021209K30017874255"/"快运包干费"/4/1,/Min), ResolveIntent [/Table/142/1/"1021209K30017874255"/"快运包干费"/5/1,/Min), ResolveIntent [/Table/142/1/"1021209K30017874255"/"快运包干费"/6/1,/Min), ResolveIntent [/Table/142/1/"1021209K30017874255"/"班列运费"/0,/Min), ResolveIntent [/Table/142/1/"1021209K30017874255"/"班列运费"/1/1,/Min), ResolveIntent [/Table/142/1/"1021209K30017874255"/"班列运费"/6/1,/Min), ResolveIntent [/Table/142/2/190575720799993857/0,/Min), ResolveIntent [/Table/142/2/190575831014572033/0,/Min), ResolveIntent [/Table/142/3/"20161104111719"/"1021209K30017874255"/"快运包干费"/0,/Min), ResolveIntent [/Table/142/3/"20161104111719"/"1021209K30017874255"/"班列运费"/0,/Min), ResolveIntent [/Table/142/4/"20161104111719"/"1021209K30017874255"/"快运包干费"/0,/Min), ResolveIntent [/Table/142/4/"20161104111719"/"1021209K30017874255"/"班列运费"/0,/Min), Noop [/Min,/Min), Noop [/Min,/Min), Noop [/Min,/Min), Noop [/Min,/Min), Noop [/Min,/Min), ... 70 skipped ..., Noop [/Min,/Min), Noop [/Min,/Min), Noop [/Min,/Min), Noop [/Min,/Min), Noop [/Min,/Min)	storage/replica.go:1233
2016-11-04 03:20:15	WARNING	[n1,s1,r128/1:/Table/14{7-8}]: failed to resolve intents: context deadline exceeded

W161104 11:20:07.159678 2432799 storage/intent_resolver.go:335  [n1,s1,r128/1:/Table/14{7-8}]: failed to resolve intent
s: context deadline exceeded
E161104 11:20:07.217426 2428166 internal/client/txn.go:326  [n1,client=192.168.100.240:37523] failure aborting transaction: internal/client/txn.go:628: attempting to use transaction with wrong status or finalized: ABORTED; abort caused by: retry txn "sql/executor.go:542 sql txn" id=cac4c82d key=/Table/147/1/"1021209K30017796872"/0 rw=true pri=0.03668810 iso=SERIALIZABLE stat=PENDING epo=1 ts=1478229517.998461813,1 orig=1478229517.998461813,1 max=1478229494.818106800,0 wto=false rop=false

W161104 11:20:07.924596 2432929 storage/replica.go:546  [n1,s1,r123/1:/Table/14{2-3}] cmdQMu: mutex held by github.com/cockroachdb/cockroach/pkg/storage.(*Replica).beginCmds for 754.456367ms (>500ms):

See the log file at : http://www.files.com/shared/581c3eaec4924/cockroach.hadoop241.wen.log.INFO.2016-11-04T11_20_04%2B08_00.zip

Cockroach version
Build Tag: beta-20161027
Build Time: 2016/10/27 13:34:56
Platform: linux amd64
Go Version: go1.7.3
C Compiler: gcc 4.9.2

Run with three nodes
OS: CentOS release 6.4 (Final)
uname -a
Linux hadoop241 2.6.32-358.el6.x86_64 #1 SMP Fri Feb 22 00:31:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Also I found out that

CRITICAL SECTION TIME
11:42 raftmu 1006.6 peak

Cockroach Console.pdf (186.9 KB)

Thanks.

I think this message is harmless - it occurs when we try to rollback a transaction that has already aborted. I’ve filed https://github.com/cockroachdb/cockroach/issues/10490 to get rid of the log message in this case.

1 Like

Thank you.

To clarify what I understand:

  1. Is this mean client side’s transaction is success ( just a internal aborted, another retry exist, the final result is success ), Or
  2. Is this mean the transaction is failed and need client side retry ?

The statements are explicit put inside the transaction

I understand the log show on the database’s log file may harmless, Then why the error log appeared on the application’s log ( so I got little confused )

[2016-11-04 11:33:32,073] [ERROR] [FSController] [zPSaveTables][174] - Could not commit JDBC transaction; nested exception is org.postgresql.util.PSQLException: ERROR: restart transaction: retry txn "sql/executor.go:542 sql txn" id=d1adfab7 key=/Table/147/1/"1021209K30017964561"/0 rw=true pri=0.01602755 iso=SERIALIZABLE stat=PENDING epo=1 ts=1478230156.417237302,1 orig=1478230156.417237302,1 max=1478230137.620273321,0 wto=false rop=false

The application’s code is from our existing mysql related but changed to psql driver

The test is run by my colleagues ( using loadrunner against our java application which connect to the CockroachDB database )
Both of them (loadrunner and app) seems hard to tell if a transaction is success or not. ( I’ll try to figure it out later )

Still want to know, what do you think?

If the “restart transaction” error comes back to the application, then the transaction was definitely not committed and needs to be retried. There is more information about transaction retries in our docs: https://www.cockroachlabs.com/docs/transactions.html#transaction-retries

Okay.

Is this “not committed” related to the one above that aborted failure?

You said it’s harmless, Now I got transaction failure, I’m confused.

And do you know the reason, why the transaction is not committed ( cause by contention? )

It’s probably the same thing, but not necessarily. Sometimes the transaction that caused the server log message can be retried automatically, and you won’t get an error on the client. Other times you can get a “transaction restart” error on the client without this log message on the server. That’s why I say the log message is “harmless”: what matters is the error that the client sees, and not the server log message.

Thank you, It’s more clear to me.

Hi I got another question here

You said it’s at least involved rollback
This time, I do not see any rollbacks, Why still have this issue

[wen@hadoop241 logs]$ grep ABORT cockroach.hadoop241.wen.log.ERROR.2016-11-16T10_23_22+08_00.16104|wc -l
72
[wen@hadoop241 logs]$ grep ABORT cockroach.hadoop241.wen.log.ERROR.2016-11-16T10_23_22+08_00.16104|tail -1
E161116 16:20:58.530387 649178 internal/client/txn.go:331  [n1,client=192.168.100.240:50997,user=root] failure aborting transaction: internal/client/txn.go:641: attempting to use transaction with wrong status or finalized: ABORTED; abort caused by: retry txn "sql/executor.go:548 sql txn" id=a22fd420 key=/Table/59/1/"1021201K30016365627"/0 rw=true pri=0.08964003 iso=SERIALIZABLE stat=PENDING epo=1 ts=1479284445.383573831,1 orig=1479284445.383573831,1 max=1479284444.719211766,0 wto=false rop=false
[wen@hadoop241 logs]$ 

The previous app log applied here.

[2016-11-16 16:20:57,261] [ERROR] [FSController] [zPSaveTables][175] - Could not commit JDBC transaction; nested exception is org.postgresql.util.PSQLException: ERROR: restart transaction: retry txn "sql/executor.go:548 sql txn" id=6f81ef31 key=/Table/59/1/"1021201K30016363766"/0 rw=true pri=0.08964003 iso=SERIALIZABLE stat=PENDING epo=1 ts=1479284445.383573831,1 orig=1479284445.383573831,1 max=1479284442.725425544,0 wto=false rop=false
org.springframework.transaction.TransactionSystemException: Could not commit JDBC transaction; nested exception is org.postgresql.util.PSQLException: ERROR: restart transaction: retry txn "sql/executor.go:548 sql txn" id=8c6ea7da key=/Table/59/1/"1021201K30016347112"/0 rw=true pri=0.08964003 iso=SERIALIZABLE stat=PENDING epo=1 ts=1479284445.313288768,1 orig=1479284445.313288768,1 max=1479284442.731519525,0 wto=false rop=false
        at org.springframework.jdbc.datasource.DataSourceTransactionManager.doCommit(DataSourceTransactionManager.java:275)
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:757)
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:726)
        at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:521)
        at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:291)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:653)
        at com.bjjdsy.incomemuster.fs.service.BagIndexServiceImpl$$EnhancerBySpringCGLIB$$f1cd4c7c.insertAllTables(<generated>)
        at com.bjjdsy.incomemuster.fs.controller.FSController.zPSaveTables(FSController.java:130)
        at sun.reflect.GeneratedMethodAccessor129.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:221)
        at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136)
        at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:110)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:817)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:731)
        at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:959)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:893)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:968)
        at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:870)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:643)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:844)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.bjjdsy.commons.util.EncodingFilter.doFilter(EncodingFilter.java:41)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:646)
        at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:436)
        at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:374)
        at org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:302)
        at com.bjjdsy.incomemuster.mainAll.controller.MainController.mainController(MainController.java:95)
        at sun.reflect.GeneratedMethodAccessor123.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:221)
        at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136)
        at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:110)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:817)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:731)
        at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:959)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:893)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:968)
        at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:870)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:643)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:844)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.bjjdsy.commons.util.EncodingFilter.doFilter(EncodingFilter.java:41)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:121)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:612)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:503)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.postgresql.util.PSQLException: ERROR: restart transaction: retry txn "sql/executor.go:548 sql txn" id=8c6ea7da key=/Table/59/1/"1021201K30016347112"/0 rw=true pri=0.08964003 iso=SERIALIZABLE stat=PENDING epo=1 ts=1479284445.313288768,1 orig=1479284445.313288768,1 max=1479284442.731519525,0 wto=false rop=false
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2458)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2158)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:291)
        at org.postgresql.jdbc.PgConnection.executeTransactionCommand(PgConnection.java:779)
        at org.postgresql.jdbc.PgConnection.commit(PgConnection.java:800)
        at com.alibaba.druid.filter.FilterChainImpl.connection_commit(FilterChainImpl.java:180)
        at com.alibaba.druid.filter.stat.StatFilter.connection_commit(StatFilter.java:260)
        at com.alibaba.druid.filter.FilterChainImpl.connection_commit(FilterChainImpl.java:176)
        at com.alibaba.druid.proxy.jdbc.ConnectionProxyImpl.commit(ConnectionProxyImpl.java:123)
        at com.alibaba.druid.pool.DruidPooledConnection.commit(DruidPooledConnection.java:731)
        at org.springframework.jdbc.datasource.DataSourceTransactionManager.doCommit(DataSourceTransactionManager.java:272)

Is there a resource contention, that cause restart?

Console graph
Cockroach Console_2016-11-16.pdf (169.3 KB)

I realized it may cause by concurrent transaction, so restart is normal case here

Is this second restart failed ( a internal restart is done, but failed ), I thought the high priority should make the restart success, Am I wrong?

The doc suggest client side retry (if conditional logic exists ), So is the above restart fail, means I should have client side retry in any situation?

Yes, you should always be prepared to retry on the client side. Concurrent transactions can cause each other to restart at any time. High priority transactions are less likely to restart than low priority transactions, but this is not guaranteed. It’s even possible to get a transaction retry error when there are no other transactions (if a cockroachdb node restarts, transactions that were being managed by that node will have to restart)

1 Like

Thank you, It’s good to know.