Cockroach Transaction throwing exceptions

I am trying to use cockroach transaction as explained in this article.

I am seeing the below error quite frequently in my log

ERROR [2017-11-02 11:49:00,855] [U:480,F:167,T:647,M:1,878] storage.config.CockroachConfigStore:[CockroachConfigStore:retryTransaction:189] - [config-store-exec-3] - Got Exception while trying to restart
org.postgresql.util.PSQLException: ERROR: restart transaction: HandledRetryableTxnError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE): "sql txn" id=b0b42214 key=/Table/55/1/10000/115/319817494/"_canonical"/1509623337641/0 rw=true pri=0.05295663 iso=SERIALIZABLE stat=PENDING epo=0 ts=1509623340.835374472,1 orig=1509623340.832201525,0 max=1509623340.832201525,0 wto=false rop=false seq=2
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2284)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2003)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:200)
        at org.postgresql.jdbc.PgConnection.executeTransactionCommand(PgConnection.java:768)
        at org.postgresql.jdbc.PgConnection.commit(PgConnection.java:780)
        at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:334)
        at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.commit(PoolingDataSource.java:211)
        at com.vnera.storage.config.CockroachConfigStore.retryTransaction(CockroachConfigStore.java:181)
        at com.vnera.storage.config.CockroachConfigStore.addProperty(CockroachConfigStore.java:36)

This is what I am trying to do -

I have two tables:

CREATE TABLE config_blob (
    customerid integer NOT NULL,
    objecttype integer NOT NULL,
    objectid bigint NOT NULL,
    propertyname text NOT NULL,
    timeinstant bigint NOT NULL,
    updatetimestamp timestamp with time zone DEFAULT now(),
    value text,
    PRIMARY KEY (customerid, objecttype, objectid, propertyname, timeinstant)
);

CREATE TABLE config_versions (
    customerid integer NOT NULL,
    objecttype integer NOT NULL,
    objectid bigint NOT NULL,
    propertyname text NOT NULL,
    timeinstant bigint NOT NULL,
    updatetimestamp timestamp with time zone DEFAULT now() NOT NULL,
    PRIMARY KEY (customerid, objecttype, objectid, propertyname, timeinstant, updatetimestamp),
    INDEX config_versions_by_version (updatetimestamp)
);

At a high level in a single transaction I am doing the following:-

  1. Insert a value in config_blob
  2. If the insert in #1 is successful make an entry in config_version and return

If the above transaction fail then

  1. Update a value in config_blob
  2. If update in #1 is successful update the value in config_version

The relevant code is like below-

@Override
protected void addProperty(Connection conn,
    ConfigKey key,
    String propertyName,
    long timeStamp,
    String escapedValue,
    String tableName) throws SQLException {

    try {
        conn.setAutoCommit(false);
        boolean insertSuccess = retryTransaction(
            conn,
            addPropertyInDBAsTransaction(key, propertyName, timeStamp, escapedValue, tableName)
        );
        if (insertSuccess) {
            logger.info("Insertion of property transaction is successful");
            return;
        }

        logger.info("Insertion of property failed. Going for update");

        conn.setAutoCommit(false);
        boolean updateSuccess = retryTransaction(
            conn,
            updatePropertyinDBAsTransaction(key, propertyName, timeStamp, escapedValue, tableName)
        );
        if (!updateSuccess) {
            throw new SQLException("insert and update both failed.");
        }

        logger.info("Updated existing property - transaction successful");
    } finally {
        conn.close();
    }
}

private RetryableTransaction addPropertyInDBAsTransaction(ConfigKey key,
    String propertyName,
    long timeStamp,
    String escapedValue,
    String tableName) throws SQLException {
    return conn1 - > {
        try (Statement stmt = conn1.createStatement()) {
            try {
                String sql = getSqlQueryForAddProperty(tableName, key, propertyName, timeStamp, escapedValue, false);
                logger.info("Property Insert sql {} {}", tableName, sql);
                int nrows = stmt.executeUpdate(sql);
                if (nrows > 0) {
                    sql = String.format("insert into %s " +
                        " (customerId, objecttype, objectid, propertyname, timeinstant) " +
                        " values (%d, %d, '%d'::int8, '%s', '%d'::int8)",
                        config.getVersionsTableName(), key.customerId, key.objectType, key.objectId,
                        propertyName, timeStamp);
                    logger.info("Property Version Insert sql in {} {}", config.getVersionsTableName(), sql);
                    stmt.executeUpdate(sql);
                    return true;
                }
            } catch (SQLException e) {
                logger.warn("overwriting property value!");
                return false;
            }
        }
        return false;
    };
}


private int updateConfig(Statement stmt, String tableName, ConfigKey key, String propertyName, long timeStamp,
    String escapedValue) throws SQLException {
    String sql = getSqlQueryForAddProperty(tableName, key, propertyName, timeStamp, escapedValue, true);
    logger.info("Property Update in {} {}", tableName, sql);
    int nrows = stmt.executeUpdate(sql);
    if (nrows > 0) {
        sql = String.format("update %s set updatetimestamp = now() where " +
            "customerid = %d and " +
            "objectType = %d and " +
            "objectid = '%d'::int8 and " +
            "propertyName = '%s' and " +
            "timeinstant = '%d'::int8 ;",
            config.getVersionsTableName(), key.customerId, key.objectType, key.objectId, propertyName, timeStamp
        );
        logger.info("Property Update in {} {}", config.getVersionsTableName(), sql);
        stmt.executeUpdate(sql);
    }
    return nrows;
}

private boolean retryTransaction(Connection conn, RetryableTransaction transaction) throws SQLException {
    Savepoint sp = conn.setSavepoint("cockroach_restart");
    while (true) {
        try {
            boolean isSucess = transaction.run(conn);
            conn.commit();
            logger.info("Transaction successful");
            return isSucess;
        } catch (SQLException e) {
            if (e.getErrorCode() == 40001) {
                // Signal the database that we will attempt a retry.
                conn.rollback(sp);
            } else {
                logger.error(String.format("Got exception with error code %d", e.getErrorCode()), e);
                Throwables.propagate(e);
            }
        }
    }
}

protected String getSqlQueryForAddProperty(
    String tableName,
    ConfigKey key,
    String propertyName,
    long timeStamp,
    String escapedValue,
    boolean isUpdate) {
    if (!isUpdate) {
        return String.format("insert into %s (customerId, objecttype, objectid, propertyname, timeinstant, value) " +
            " values (%d, %d, '%d'::int8, '%s', '%d'::int8, %s)",
            tableName, key.customerId, key.objectType, key.objectId, propertyName, timeStamp, escapedValue
        );
    }
    return String.format("update %s set value = %s where " +
        "customerid = %d and " +
        "objectType = %d and " +
        "objectid = '%d'::int8 and " +
        "propertyName = '%s' and " +
        "timeinstant = '%d'::int8 ;",
        tableName, escapedValue, key.customerId, key.objectType, key.objectId, propertyName, timeStamp
    );
}


// A simple interface that provides a retryable lambda expression.
interface RetryableTransaction {
    boolean run(Connection conn) throws SQLException;
}

The detailed relevant logs -

INFO [2017-11-02 11:49:00,832] [U:480,F:167,T:647,M:1,878] storage.config.CockroachConfigStore:[CockroachConfigStore:lambda$addPropertyInDBAsTransaction$2:134] - [config-store-exec-3] - Property Insert sql config_blob insert into config_blob (customerId, objecttype, objectid, propertyname, timeinstant, value)  values (10000, 115, '319817494'::int8, '_canonical', '1509623337641'::int8, '{
  "@class" : "com.van.model.core.InventoryData",
  "objectGuid" : "[VM]-[inventory]-[10.197.17.204]",
  "vendorId" : null,
  "assetIdentifier" : null,
  "name" : null,
  "otherNames" : null,
  "manager" : {
    "@class" : "com.van.model.core.Reference",
    "modelKey" : {
      "key" : "10000:8:935316259"
    },
    "objectGuid" : "X.X.X.X",
    "propertyName" : "_canonical",
    "name" : null
  },
  "modelKey" : {
    "key" : "10000:115:319817494"
  },
  "groups" : [ "java.util.HashSet", [ ] ],
  "subTypes" : [ "java.util.HashSet", [ ] ],
  "description" : null,
  "propertyBag" : {
    "@class" : "com.van.model.core.PropertyBag",
    "propertyValues" : [ "java.util.HashSet", [ ] ]
  },
  "associationEntities" : {
    "@class" : "java.util.HashMap"
  },
  "payload" : "ChMxMDAwMDoxMTU6MzE5ODE3NDk0EhIxMDAwMDoxOjE5ODMwMTY5NTASEjEwMDAwOjE6MTk4MzAxNjk",
  "basePayload" : ""
}')
INFO [2017-11-02 11:49:00,837] [U:480,F:167,T:647,M:1,878] storage.config.CockroachConfigStore:[CockroachConfigStore:lambda$addPropertyInDBAsTransaction$2:142] - [config-store-exec-3] - Property Version Insert sql in config_versions insert into config_versions  (customerId, objecttype, objectid, propertyname, timeinstant)  values (10000, 115, '319817494'::int8, '_canonical', '1509623337641'::int8)

ERROR [2017-11-02 11:49:00,855] [U:480,F:167,T:647,M:1,878] storage.config.CockroachConfigStore:[CockroachConfigStore:retryTransaction:189] - [config-store-exec-3] - Got Exception while trying to restart
org.postgresql.util.PSQLException: ERROR: restart transaction: HandledRetryableTxnError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE): "sql txn" id=b0b42214 key=/Table/55/1/10000/115/319817494/"_canonical"/1509623337641/0 rw=true pri=0.05295663 iso=SERIALIZABLE stat=PENDING epo=0 ts=1509623340.835374472,1 orig=1509623340.832201525,0 max=1509623340.832201525,0 wto=false rop=false seq=2
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2284)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2003)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:200)
        at org.postgresql.jdbc.PgConnection.executeTransactionCommand(PgConnection.java:768)
        at org.postgresql.jdbc.PgConnection.commit(PgConnection.java:780)
        at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:334)
        at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.commit(PoolingDataSource.java:211)
        at com.van.storage.config.CockroachConfigStore.retryTransaction(CockroachConfigStore.java:181)
        at com.van.storage.config.CockroachConfigStore.addProperty(CockroachConfigStore.java:36)
        at com.van.storage.config.PSqlConfigStore$5.call(PSqlConfigStore.java:381)
        at com.van.storage.config.PSqlConfigStore$5.call(PSqlConfigStore.java:376)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at com.van.common.VneraExecutors$Queue$TimedRunnable.run(VneraExecutors.java:212)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Cockroach Version-

ubuntu@platform:~/logs$ cockroach version
Build Tag:    v1.1.0
Build Time:   2017/10/12 14:50:18
Distribution: CCL
Platform:     linux amd64
Go Version:   go1.8.3
C Compiler:   gcc 6.3.0
Build SHA-1:  8b865035e21aa4fa526ee017ba9dc685d7af649c
Build Type:   release-gnu

Can some one let me know what I am doing wrong or there is some limitation on the cockroach side?

Let me know if you need any more information from my side.

Interesting, this looks like the exception caught here

        } catch (SQLException e) {
            if (e.getErrorCode() == 40001) {
                // Signal the database that we will attempt a retry.
                conn.rollback(sp);
            } else {
                logger.error("Got Exception while trying to restart ", e);
                Throwables.propagate(e);
            }
        }

doesn’t have the right error code. Could you augment the “Got exception while…” message to print e.getErrorCode() as well?

@tschottdorf - Will make the change and share the updated logs soon.

@tschottdorf - Modified by log line as follows-

logger.error(String.format("Got exception with error code %d while trying to restart ", e.getErrorCode()), e);

Now the exception trace -

ERROR [2017-11-02 16:53:03,133] [U:152,F:546,T:698,M:1,878] storage.config.CockroachConfigStore:[CockroachConfigStore:retryTransaction:189] - [config-store-exec-1] - Got exception with error code 0 while trying to restart
org.postgresql.util.PSQLException: ERROR: restart transaction: HandledRetryableTxnError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE): "sql txn" id=7f15524d key=/Table/54/1/10000/9/1266132426/"_canonical"/1509641582298/0 rw=true pri=0.00573327 iso=SERIALIZABLE stat=PENDING epo=0 ts=1509641583.122960157,1 orig=1509641583.115187496,0 max=1509641583.115187496,0 wto=false rop=false seq=2
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2284)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2003)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:200)
        at org.postgresql.jdbc.PgConnection.executeTransactionCommand(PgConnection.java:768)
        at org.postgresql.jdbc.PgConnection.commit(PgConnection.java:780)
        at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:334)
        at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.commit(PoolingDataSource.java:211)
        at com.van.storage.config.CockroachConfigStore.retryTransaction(CockroachConfigStore.java:181)
        at com.van.storage.config.CockroachConfigStore.addProperty(CockroachConfigStore.java:36)

Error Code is coming as zero.

Thanks @tuk, that looks suspicious. I suspect that it might be related to https://github.com/cockroachdb/docs/issues/2121

Try adding releaseSavepoint as is done in that link and report back with the results. My colleague @arjun will look into this if anything’s left at that point.

Hi @tuk, I’ll take a look at this, and I’m going to modify the example code to releaseSavepoint in any case. Let me know if that’s sufficient to resolve the problem.

@arjun @tschottdorf - I have added the conn.releaseSavepoint() and also modified the check for rollback as e.getErrorCode() == 40001 || e.getSQLState().equals("40001").

I will share my findings soon.

@arjun @tschottdorf Now getting the below exception -

INFO [2017-11-03 05:35:58,905] [U:445,F:165,T:610,M:1,878] storage.config.CockroachConfigStore:[CockroachConfigStore:lambda$addPropertyInDBAsTransaction$2:142] - [config-store-exec-4] - Property Version Insert sql in config_versions insert into config_versions  (customerId, objecttype, objectid, propertyname, timeinstant)  values (10000, 86, '526370774'::int8, '_denorm', '1509687277668'::int8)
ERROR [2017-11-03 05:35:58,906] [U:445,F:165,T:610,M:1,878] storage.config.CockroachConfigStore:[CockroachConfigStore:retryTransaction:190] - [config-store-exec-6] - Got exception with error code 0 while trying to restart
org.postgresql.util.PSQLException: ERROR: current transaction is aborted, commands ignored until end of transaction block
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2284)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2003)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:200)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:424)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:321)
        at org.postgresql.jdbc.PgConnection.execSQLUpdate(PgConnection.java:472)
        at org.postgresql.jdbc.PgConnection.releaseSavepoint(PgConnection.java:1592)
        at org.apache.commons.dbcp.DelegatingConnection.releaseSavepoint(DelegatingConnection.java:461)
        at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.releaseSavepoint(PoolingDataSource.java:380)
        at com.van.storage.config.CockroachConfigStore.retryTransaction(CockroachConfigStore.java:181)
        at com.van.storage.config.CockroachConfigStore.addProperty(CockroachConfigStore.java:36)
        at com.van.storage.config.PSqlConfigStore$5.call(PSqlConfigStore.java:381)
        at com.van.storage.config.PSqlConfigStore$5.call(PSqlConfigStore.java:376)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at com.van.common.VneraExecutors$Queue$TimedRunnable.run(VneraExecutors.java:212)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Saw this commit https://github.com/arjunravinarayan/docs/blob/2cc92f9895da93b24afbb0f4e8b1eee67db1a604/_includes/app/TxnSample.java

How should we handle the AmbigiousException for now?

@arjun - Any thoughts?

@tuk it seems to me that, somehow, your code is trying to run a statement after the transaction has already been aborted. That’s what this error says:

Would you mind running crdb with the --vmodule=executor=2,session=2 flag, and then sharing its log? That flag will make it log all the statements that it’s running so we can see who’s getting the error exactly.

FWIW, I think your code is a bit confused. You’re printing “Got Exception while trying to restart” from a place where you haven’t actually tried to restart (the restart is the conn.rollback(sp) call. On the else branch where you do Throwables.propagate(e); you’re handling non-retriable errors.

As a meta comment, we should never return errors with code 0. If that indeed is happening, that’s clearly a bug.

@andrei - Please find attached crdb logs as you suggested. db_logs.ttf (2.4 MB)

I have modified my code as mentioned in this link

private boolean retryTransaction(Connection conn, RetryableTransaction transaction) throws SQLException {
        Savepoint sp = conn.setSavepoint("cockroach_restart");
        boolean isSucess = false;
        while (true) {
            try {
                isSucess = transaction.run(conn);
                conn.releaseSavepoint(sp);
                logger.info("Transaction successful");
                break;
            } catch(SQLException e) {
                if(e.getErrorCode() == 40001) {
                    // Signal the database that we will attempt a retry.
                    conn.rollback(sp);
                } else if(e.getErrorCode() == 40007) {
                    logger.warn("Ambigious Result Encountered", e);
                    return false;
                } else {
                    // This is a not a serialization failure, pass it up the chain.
                    throw e;
                }
            }
        }
        conn.commit();
        return isSucess;
    }

I was wrong. I am still seeing the error like below-

Caused by: org.postgresql.util.PSQLException: ERROR: restart transaction: HandledRetryableTxnError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE): "sql txn" id=28548ae6 key=/Table/54/1/10000/9/1266132426/"_canonical"/1509949684699/0 rw=true pri=0.01738160 iso=SERIALIZABLE stat=PENDING epo=0 ts=1509949685.607127411,1 orig=1509949685.606605709,0 max=1509949685.606605709,0 wto=false rop=false seq=2
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2284)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2003)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:200)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:424)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:321)
        at org.postgresql.jdbc.PgConnection.execSQLUpdate(PgConnection.java:472)
        at org.postgresql.jdbc.PgConnection.releaseSavepoint(PgConnection.java:1592)
        at org.apache.commons.dbcp.DelegatingConnection.releaseSavepoint(DelegatingConnection.java:461)
        at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.releaseSavepoint(PoolingDataSource.java:380)
        at com.vnera.storage.config.CockroachConfigStore.retryTransaction(CockroachConfigStore.java:184)
        at com.vnera.storage.config.CockroachConfigStore.addProperty(CockroachConfigStore.java:35)
        at com.vnera.storage.config.PSqlConfigStore$5.call(PSqlConfigStore.java:381)
        at com.vnera.storage.config.PSqlConfigStore$5.call(PSqlConfigStore.java:376)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at com.vnera.common.VneraExecutors$Queue$TimedRunnable.run(VneraExecutors.java:212)```

Cockroach logs are attached. Please rename `*.ttf` to `*.gz`. <a class="attachment" href="//discourse-cloud-file-uploads.s3.dualstack.us-west-2.amazonaws.com/business7/uploads/cockroachlabs/original/1X/f7c32966d18b166e0ac84c2f5996dd0c6185b0ee.ttf">db_logs.ttf</a> (2.4 MB)

For our use case I have started to use SNAPSHOT_ISOLATION -

conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);

I am not getting the TransactionRetryError: retry txn (RETRY_SERIALIZABLE) error but I am still getting the 0 error code on transaction abortion.

Thanks @tuk.
So, after modifying your code you’re no longer seeing the ERROR: current transaction is aborted error? That’s good.

I am not getting the TransactionRetryError: retry txn (RETRY_SERIALIZABLE) error but I am still getting the 0 error code on transaction abortion.

I’m a bit confused. Which statement exactly is returning this 0 error code?

@andrei - Filed an issue in github with a code to reproduce the issue-