How can the feature "backup and restore" guarantee the completeness of backup data?

sql

(yangliang) #1

I have a problem: if I back up a table and restore it into another cluster, how can I confirm that the two tables are identical in every detail? The new table might be incomplete. Can CockroachDB provide a function like a checksum for a table? Then I could be sure that the backup and restore succeeded.


(Bob Vawter) #2

A RESTORE job is marked as successful only after all data in the snapshot has been loaded into a range on at least one node. Once those ranges have finished up-replicating, you have a fully consistent restore.

We use backup-and-restore to load test fixture data as part of continuous load-tests of our code. These code-paths are well-exercised.

CockroachDB has built-in hash functions, but no aggregate hashing functions. It would be relatively straightforward to write a small program that verifies your restored data by performing a full table scan in deterministic order and comparing hashes of the row data.
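
A minimal sketch of such a verification program, in Python. It is only an illustration: the function name table_digest is hypothetical, and it assumes the rows are fetched from both clusters with the same client driver and the same ORDER BY on the primary key, so that values arrive in the same order and with the same Python types.

```python
import hashlib


def table_digest(rows):
    """Return a SHA-256 hex digest over an iterable of row tuples.

    The digest depends on row order, so the caller must supply rows in a
    deterministic order (for example, sorted by primary key).
    """
    h = hashlib.sha256()
    for row in rows:
        # Prefix each row with its column count so row boundaries are unambiguous.
        h.update(len(row).to_bytes(4, "big"))
        for value in row:
            # repr() keeps None, 1 and '1' distinct; the length prefix keeps
            # adjacent fields from running together.
            encoded = repr(value).encode("utf-8")
            h.update(len(encoded).to_bytes(8, "big"))
            h.update(encoded)
    return h.hexdigest()


# Hypothetical usage with any DB-API cursor (driver and column names are assumptions):
#   cur.execute("SELECT * FROM t ORDER BY id")
#   digest = table_digest(cur)
# Run this against both clusters and compare the two digests.
```

Matching digests mean every row and column compared equal in that scan order. A mismatch only says that something differs, not which row.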


(Bob Vawter) #3

It was pointed out to me that there exists an experimental feature in the 2.1 branch which can fingerprint an entire table.

SHOW EXPERIMENTAL_FINGERPRINTS FROM TABLE t;

https://www.cockroachlabs.com/docs/v2.1/experimental-features.html#show-statement-fingerprints

It’s also possible to hack together an aggregate hashing function, using fnv64 in this case:

select xor_agg(fnv64(col)) from tbl;
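
To show how this XOR-of-hashes trick behaves, here is a small Python sketch. It assumes fnv64 corresponds to the 64-bit FNV-1 hash and hashes each value's string form; the results are not guaranteed to match CockroachDB's output bit for bit, and the names fnv1_64 and xor_fingerprint are made up for this illustration.

```python
FNV64_OFFSET = 0xCBF29CE484222325  # FNV 64-bit offset basis
FNV64_PRIME = 0x100000001B3        # FNV 64-bit prime
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1_64(data: bytes) -> int:
    """64-bit FNV-1: multiply by the prime, then XOR in each byte."""
    h = FNV64_OFFSET
    for byte in data:
        h = (h * FNV64_PRIME) & MASK64
        h ^= byte
    return h


def xor_fingerprint(values) -> int:
    """XOR together the FNV-1 hash of each value's string form."""
    acc = 0
    for v in values:
        acc ^= fnv1_64(str(v).encode("utf-8"))
    return acc


# XOR is commutative, so row order does not affect the result.
print(xor_fingerprint([1, 2, 3]) == xor_fingerprint([3, 1, 2]))
# Caveat: two identical values cancel each other out.
print(xor_fingerprint(["a", "a"]) == xor_fingerprint([]))
```

Because XOR ignores order, no ORDER BY is needed. The trade-offs are that the query covers a single column and that an even number of duplicate values cancels out, so comparing a row count alongside the fingerprint is a reasonable extra check.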


(yangliang) #4

Thank you very much. It really helped me.