Use cockroachdb cluster timestamp in time-travel queries


(Geet Kumar) #1

I am looking to execute a query like this, but this sort of subquery is not currently allowed.
SELECT * FROM db.table AS OF SYSTEM TIME (SELECT statement_timestamp());

So my other option is to split the statement into two client side:
timestamp = select statement_timestamp();
SELECT * FROM db.table AS OF SYSTEM TIME timestamp;

But this leads to the possibility that the node which executed the first statement might be ahead of the node which executes the second.

What I need is a monotonically increasing cluster timestamp which could be used (such as cluster_logical_timestamp(), though this might come with a cost) or cockroachdb could support subqueries/variables?

Any suggestions?


(Ben Darnell) #2

Why do you want to do this? This should be equivalent to SELECT * FROM db.table. All statements explicitly execute at their statement timestamp. (There are actually some very subtle differences, but hopefully you’re not relying on those)

cluster_logical_timestamp() is only monotonically increasing in certain situations. Specifically, between two transactions that commit and touch overlapping rows, the cluster_logical_timestamp() will reflect the order in which they occurred. No guarantees are provided for the cluster_logical_timestamps of transactions which fail to commit or which touch disjoint sets of rows. Doing anything useful with this is tricky and subtle. What are you trying to do?


(Geet Kumar) #3

Thanks for the reply @bdarnell. The main reason I am trying to do this is because of this statement: https://www.cockroachlabs.com/docs/stable/performance-best-practices-overview.html#use-as-of-system-time-to-decrease-conflicts-with-long-running-queries

The select statement I am running will be a long running query and we have noticed that if we do not use AS OF SYSTEM TIME, the query does not complete at times.

When we use AS OF SYSTEM TIME, the query consistently completes successfully. Note that we are running these queries while also inserting large amounts of data.


(Taylor Wrobel) #4

If you’re running a long-running query, immediacy of data shouldn’t matter much. Also, you know the maximum clock skew of the cluster, so you should be able to work around your concerns of the second node being ahead of the first pretty easily:

Transaction 1:

SELECT now() - INTERVAL '10s';
+----------------------------------+
|          now() - '10s'           |
+----------------------------------+
| 2018-10-22 23:23:27.547356+00:00 |
+----------------------------------+
(1 row)

Time: 15.215889ms

Transaction 2:

SELECT * FROM db.table AS OF SYSTEM TIME '2018-10-22 23:23:27.547356+00:00';

I used a 10 second offset to be safe, but any value longer than your configured max clock skew should work. Note that this won’t definitely stop conflicts tho, but will reduce them. If you have any writes that were started before your offset timestamp that have yet to complete, the query may be forced to restart anyways.

Tho I would say that, since this seems to be a fairly common query pattern, it’d be nice if the syntax supported it directly, i.e.:

SELECT * FROM db.table AS OF SYSTEM TIME (now() - INTERVAL '10s');

(Geet Kumar) #5

@twrobel precisely what we are doing now. The main reason I opened this thread is to see if something like the following could be supported so that we shouldn’t need to worry about clock skew coming in the way (since I would assume the subquery would also occur on the same node…not sure if this is a correct assumption).


(Ben Darnell) #6

Oh, that’s interesting. The docs should be recommending “Use the AS OF SYSTEM TIME clause with a time at least a few seconds in the past”. AS OF SYSTEM TIME at the current time is not supposed to make a difference, except for the “subtle differences” I mentioned. Those aren’t really supposed to affect things much, but it appears that they are. We’ll do some more research about this.

Also, I have to warn you that using AS OF SYSTEM TIME without going at least 500ms in the past (the default --max-offset) can return some odd results. Specifically, if you write a value on one connection and then read with AS OF SYSTEM TIME now(), you may not see the value you just wrote.

In CockroachDB 2.1, there will be shorthand for this: AS OF SYSTEM TIME '-10s' will use the current time minus 10 seconds.


(Geet Kumar) #7

My mistake, it seems I wasn’t clear. We were using a time 2 seconds in the past when running these tests.

Thanks for this and the overall clarification!