How to have a correct log table that contains all the INSERTs, UPDATEs, and DELETEs

(Marc) #1

I have a database with some tables. One table is called “log” and contains one row per edit made to the database. So every time I INSERT, UPDATE, or DELETE, an entry is added to “log”.

This entry is added within a transaction that also contains that the main query.

What I want is a way to maintain the order of the log, so that if another program reads the logs and replays the updates, it would end up with the same database (the exact same values).

I have done some tests and I have a problem. Let’s say I send two transactions to my CRDB cluster at the same time.

SET key = ‘a’
ADD ‘a’ to log with timestamp

SET key = ‘b’
ADD ‘b’ to log with timestamp

I understand that if those transactions are committed at the same time, key might be set to a or b. Both values are fine.

But if we look at the log, is the most recent entry to the log going to have the same value as key?

I did some tests and it turned out that it is not the case.

Here are some results I obtained:

root@:26257/tests> SELECT * FROM "log" ORDER BY "id"; SELECT * FROM "store";
|         id         |                at                |       update        |
| 344292566887596033 | 2018-05-01 02:02:33.076175+00:00 | 5577006791947779410 |
| 344292566893297665 | 2018-05-01 02:02:33.079184+00:00 | 3916589616287113937 |
| 344292566894804993 | 2018-05-01 02:02:33.079656+00:00 | 3510942875414458836 |
| 344292566895296513 | 2018-05-01 02:02:33.080314+00:00 | 2015796113853353331 |
| 344292567201415169 | 2018-05-01 02:02:33.078381+00:00 | 1874068156324778273 |
| 344292567440588801 | 2018-05-01 02:02:33.079099+00:00 | 3328451335138149956 |
| 344292567599939585 | 2018-05-01 02:02:33.07775+00:00  | 5263531936693774911 |
| 344292567750901761 | 2018-05-01 02:02:33.079097+00:00 | 7955079406183515637 |
| 344292568001249281 | 2018-05-01 02:02:33.079326+00:00 | 2703501726821866378 |
| 344292568245469185 | 2018-05-01 02:02:33.079329+00:00 | 2740103009342231109 |
| 344292568373788673 | 2018-05-01 02:02:33.079078+00:00 | 6941261091797652072 |
| 344292568481693697 | 2018-05-01 02:02:33.079401+00:00 | 1905388747193831650 |
| 344292568588353537 | 2018-05-01 02:02:33.079695+00:00 | 7981306761429961588 |
(13 rows)

| key |        value        |
| key | 7981306761429961588 |
(1 row)

The log says that the last transaction was done with the value ‘2015796113853353331’, yet key is equal to ‘7981306761429961588’ (from an earlier entry to log).

As you can see, if a user wants to replay the updates and does it by applying each update in the order defined by the timestamp, they wouldn’t end up with the same key. I think it makes sense for a distributed database like Cockroach.

But it seems to work when using the id column (which is a SERIAL), but I’m not sure it’s reliable. This test was one on a single node cluster. It might be different when many nodes are involved.

So I’m not sure. I hope my explanation makes sense.

What can I do to have a list of events ordered in such a way that I can reproduce the state of the database from replaying the log?

(Rebecca Taft) #2

Hi @mfcl,

As you mention above, the sorts of guarantees you are looking for are not generally possible in a distributed database like CockroachDB. However, we are currently working on a feature called Change Data Capture (CDC), which I think will fit your use case perfectly: it will allow clients to track every change made to the database, in the order they happened. There is no official documentation yet, but this issue contains some of the discussion around the proposed feature: We are aiming to release CDC to production in our next release (v2.1), but it will likely be available as an alpha release in early June.

If you simply want a consistent copy of the database, you might also consider our existing backup and restore features:

But if you still want to continue with your current approach, neither now() to get the current timestamp nor SERIAL can guarantee a correct ordering. cluster_logical_timestamp() will guarantee a proper ordering for transactions that touch the same key, although it will cause some performance overhead.

I think your best bet is to wait for the CDC feature in our next release.

– Becca

(Marc) #3

Thanks a lot @becca, this looks exactly what I was trying to achieve. I’ll keep an eye on that feature.

(Rebecca Taft) #4

Great! Glad to hear it.