Complete example for CREATE SCHEDULE FOR BACKUP and a restore?

Hi. I'd like to use the built-in backup solution, but the documentation is confusing me.

What does a CREATE SCHEDULE FOR BACKUP statement look like that gives me the following:

  • Full cluster backup (no incremental backup)
  • every 6 hours
  • always keep the last 4 backups (24 hours)

And also, in a cluster, how is such a backup then restored? What are the steps in detail? Do I have to shut down or clear any nodes beforehand? Do I have to restore on every node, or can it all be done from one node?

Thanks!

Hm. No answers?

In the meantime another question has come up, about compression: are the backups created by CREATE SCHEDULE FOR BACKUP compressed?

Hey!

Apologies for the delayed response. The following schedule should give you full cluster backups at minute 0 past every 6th hour.

CREATE SCHEDULE <label>
FOR BACKUP INTO '<location>'
RECURRING '0 */6 * * *' FULL BACKUP ALWAYS;
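
For example, a filled-in version; the label and the nodelocal destination are placeholders you'd adapt to your setup:

-- Label and destination are examples; 'nodelocal://1/backups' writes the
-- backup under node 1's external IO directory.
CREATE SCHEDULE six_hourly_full
FOR BACKUP INTO 'nodelocal://1/backups'
RECURRING '0 */6 * * *' FULL BACKUP ALWAYS;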

There are further configurable schedule options you can append to the query above using a WITH SCHEDULE OPTIONS clause as shown in our docs - CREATE SCHEDULE FOR BACKUP | CockroachDB Docs
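
For instance, appending first_run (one of the documented schedule options; 'now' kicks off the first backup immediately instead of waiting for the next cron tick):

CREATE SCHEDULE six_hourly_full
FOR BACKUP INTO 'nodelocal://1/backups'
RECURRING '0 */6 * * *' FULL BACKUP ALWAYS
WITH SCHEDULE OPTIONS first_run = 'now';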

Regarding retention of backups, we do not delete files once they have been written to <location> by the backup. You would have to either manually delete backups every day or set an appropriate (24h) retention period on the bucket being backed up to.

BACKUP stores its data in SST files, which are compressed.

To restore a cluster from a cluster backup, you run a single RESTORE FROM <directory containing the backup> SQL statement on one node; CockroachDB internally distributes the execution across all nodes. You do not need to shut down any nodes, you just need an active SQL connection to run the RESTORE statement. The one requirement is that the cluster must not contain any user-created databases or tables when you run a full cluster restore, otherwise the job will fail. Our RESTORE docs have all of this information for future reference - RESTORE | CockroachDB Docs
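
A minimal sketch, reusing the example collection path from above and assuming a recent CockroachDB version (older versions take the backup path directly after FROM, without LATEST IN):

-- Restore the entire cluster from the most recent backup in the collection.
-- Run from a SQL shell on any one node.
RESTORE FROM LATEST IN 'nodelocal://1/backups';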

Thank you Aditya, this is very helpful.

This is confusing me:

Are you saying that I have to drop all data that is still in the running nodes before restoring? For example, I make a backup and two hours later I want to time-travel my cluster back to the saved backup state. That will not work unless I first drop all databases, schemas, and tables?

I mean, after the cluster was initialized, I created schemas, users, and tables. So all my tables and databases are user-created…

As <location> I can use some local folder like nodelocal:///home/cockroach/backup/, right?

A cluster restore is used to bring back the exact state of your entire cluster as of a particular point in time. The restore brings back:

  • All user tables
  • Relevant system tables (containing cluster configurations, etc.)
  • All databases
  • All views

To be able to perform this wholesale “rewrite” of the cluster, there cannot be any databases/schemas/tables already present in the cluster being restored into. In essence, the cluster needs to be wiped clean before a cluster restore. Note: defaultdb is not considered a user-created database, as it is created by the cluster on boot, and it should not be dropped when wiping the cluster.
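
A sketch of that wipe, with hypothetical database names (defaultdb, postgres, and system are built-in and stay):

-- Find user-created databases.
SHOW DATABASES;
-- Drop each user-created database together with everything in it
-- (foo and bar are example names).
DROP DATABASE foo CASCADE;
DROP DATABASE bar CASCADE;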

Having said that, maybe a database or a table restore is more appropriate for your use case? You can restore individual databases or tables from a full cluster backup, without wiping your current cluster, using RESTORE TABLE... or RESTORE DATABASE... - see the sketch after this example.
For example:

  • Your full cluster backup has databases: foo, bar, and baz
  • The cluster you are restoring into has databases: foo, bar
  • You can restore baz from the full cluster backup into the second cluster, without dropping any objects.
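
A minimal sketch of that last step, reusing the example names and collection path from above (again assuming a recent version; older versions take the path directly after FROM):

-- Restore only the baz database from the full cluster backup.
RESTORE DATABASE baz FROM LATEST IN 'nodelocal://1/backups';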

We support nodelocal as a local store on a particular node, but we do recommend using cloud/remote storage in production. Use Cloud Storage for Bulk Operations | CockroachDB Docs might be helpful.

Thanks again for that clarification. So it's best to drop the database and then restore. I think this works for me.

For the storage, I run my own three servers specifically to AVOID any cloud storage (data privacy). In my case, all three nodes (Nürnberg, Falkenstein, Helsinki) will do a local backup to an external volume. No matter which node goes up in flames, the other two nodes will still have a backup. In particular, I mainly need the backup for human disaster recovery, like DELETE FROM table; accidents - which hopefully never ever happen.

I hope that nodelocal:// will do the job?

Is there any information about the performance impact of such a backup? Does it block all tables for a while?

Yeah, that should work. You'd need to create three schedules in that case, each pointing to nodelocal://<nodeID>..., so that they back up to the external volume on each node.
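
A sketch of what that could look like, with assumed node IDs 1-3 and example labels and paths:

-- One schedule per node; each writes to that node's local external volume.
CREATE SCHEDULE backup_node1 FOR BACKUP INTO 'nodelocal://1/backups'
RECURRING '0 */6 * * *' FULL BACKUP ALWAYS;
CREATE SCHEDULE backup_node2 FOR BACKUP INTO 'nodelocal://2/backups'
RECURRING '0 */6 * * *' FULL BACKUP ALWAYS;
CREATE SCHEDULE backup_node3 FOR BACKUP INTO 'nodelocal://3/backups'
RECURRING '0 */6 * * *' FULL BACKUP ALWAYS;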

This write-up about backup performance might be of interest to you - BACKUP | CockroachDB Docs. Since you are using scheduled backups, we implicitly run the backup with AS OF SYSTEM TIME set to the scheduled time (e.g. the 6th hour), so it should not contend with ongoing transactions. Backups do not block tables; the tables continue to serve online traffic while the backup is running.
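
To illustrate what the schedule does implicitly, a manual one-off equivalent might look like this (the 10-second offset is just an example):

-- Back up cluster state as of a slightly earlier timestamp to avoid
-- contending with in-flight transactions.
BACKUP INTO 'nodelocal://1/backups' AS OF SYSTEM TIME '-10s';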

BACKUP tends to be constrained by disk read throughput and/or IOPS, so I'd monitor those variables. We distribute work across all nodes, so if one node is seeing maximum CPU or disk utilization, then more nodes or more resources per node can increase throughput.

Thank you Aditya, you have been a great help to me.

I think my backup questions are now all answered :slight_smile:

No problem :slight_smile: Thanks for your questions, I’m sure they’ll help others in this forum too!