I have a database with some tables. One table, called "log", contains one row per edit made to the database: every time I INSERT, UPDATE, or DELETE, an entry is added to "log". This entry is added within the same transaction as the main query.
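For concreteness, here is roughly what the two tables look like (the exact types are assumptions on my part; the column names match the output shown further down):

CREATE TABLE store (
  key STRING PRIMARY KEY,
  value STRING
);
CREATE TABLE log (
  id SERIAL PRIMARY KEY,          -- a CockroachDB SERIAL, discussed below
  at TIMESTAMPTZ DEFAULT now(),   -- when the edit happened
  "update" STRING                 -- the value written by the main query
);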
What I want is a way to maintain the order of the log, so that if another program reads the log and replays the updates, it ends up with the same database (the exact same values).
I have done some tests and I have a problem. Let's say I send two transactions to my CRDB cluster at the same time:

Transaction 1:
SET key = 'a'
ADD 'a' to log with timestamp

Transaction 2:
SET key = 'b'
ADD 'b' to log with timestamp
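In actual SQL, each transaction looks roughly like this (table and column names taken from the assumed schema above):

-- transaction 1; transaction 2 is identical but writes 'b'
BEGIN;
UPSERT INTO store (key, value) VALUES ('key', 'a');
INSERT INTO log (at, "update") VALUES (now(), 'a');
COMMIT;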
I understand that if those transactions commit concurrently, key might end up as either 'a' or 'b'. Both outcomes are fine.
But if we look at the log, will the most recent entry contain the same value as key?
I ran some tests, and it turns out it does not.
Here are some results I obtained:
root@:26257/tests> SELECT * FROM "log" ORDER BY "id"; SELECT * FROM "store";
+--------------------+----------------------------------+---------------------+
| id | at | update |
+--------------------+----------------------------------+---------------------+
| 344292566887596033 | 2018-05-01 02:02:33.076175+00:00 | 5577006791947779410 |
| 344292566893297665 | 2018-05-01 02:02:33.079184+00:00 | 3916589616287113937 |
| 344292566894804993 | 2018-05-01 02:02:33.079656+00:00 | 3510942875414458836 |
| 344292566895296513 | 2018-05-01 02:02:33.080314+00:00 | 2015796113853353331 |
| 344292567201415169 | 2018-05-01 02:02:33.078381+00:00 | 1874068156324778273 |
| 344292567440588801 | 2018-05-01 02:02:33.079099+00:00 | 3328451335138149956 |
| 344292567599939585 | 2018-05-01 02:02:33.07775+00:00 | 5263531936693774911 |
| 344292567750901761 | 2018-05-01 02:02:33.079097+00:00 | 7955079406183515637 |
| 344292568001249281 | 2018-05-01 02:02:33.079326+00:00 | 2703501726821866378 |
| 344292568245469185 | 2018-05-01 02:02:33.079329+00:00 | 2740103009342231109 |
| 344292568373788673 | 2018-05-01 02:02:33.079078+00:00 | 6941261091797652072 |
| 344292568481693697 | 2018-05-01 02:02:33.079401+00:00 | 1905388747193831650 |
| 344292568588353537 | 2018-05-01 02:02:33.079695+00:00 | 7981306761429961588 |
+--------------------+----------------------------------+---------------------+
(13 rows)
+-----+---------------------+
| key | value |
+-----+---------------------+
| key | 7981306761429961588 |
+-----+---------------------+
(1 row)
The log entry with the latest timestamp (02:02:33.080314) says the last value written was '2015796113853353331', yet key is equal to '7981306761429961588', which comes from an entry with an earlier timestamp (02:02:33.079695).
As you can see, if a user replays the updates in the order defined by the timestamps, they won't end up with the same value for key. I suppose this makes sense for a distributed database like CockroachDB.
Ordering by the id column (which is a SERIAL) does seem to work, but I'm not sure it's reliable: this test was run on a single-node cluster, and the behavior might be different when many nodes are involved.
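To make the two replay orders concrete, these are the queries I have in mind; the final state after a replay is determined by the last row returned:

-- replay by timestamp: the last value applied is 2015796113853353331,
-- which does NOT match "store"
SELECT "update" FROM "log" ORDER BY "at";

-- replay by id: the last value applied is 7981306761429961588,
-- which matches "store" in this single-node test
SELECT "update" FROM "log" ORDER BY "id";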
So I'm not sure what to rely on. I hope my explanation makes sense.
What can I do to have a list of events ordered in such a way that I can reproduce the state of the database from replaying the log?