Backup to Azure

I’m currently testing the backup and restore capabilities of version 20.2.4. When I run the following example command with an Access Key from the Azure Storage Account, I get the following error:

BACKUP DATABASE db_name TO 'azure://backups/db_name?AZURE_ACCOUNT_KEY=access_key&AZURE_ACCOUNT_NAME=storage_account_name';

ERROR: azure credential: illegal base64 data at input byte 0

If I base64 the real Access Key, the above error doesn’t happen, but I then get the following error:

RESPONSE Status: 403 Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.

I’ve confirmed that the Access Key is good by making calls with it using the Azure CLI.
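
For example, something like the following succeeds with the same key (account name here is a placeholder):

az storage container list --account-name storage_account_name --account-key access_key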

Hey, thanks for reaching out!

This error is bubbled up from the Azure SDK. We have some unit tests that perform similar SQL queries without running into this issue on 20.2.4, which makes me believe this might be an environment- or Azure-configuration-related issue.

I found this link with a few suggested solutions:

  • Ensuring correct timezone/time (see the quick check below)
  • Checking whether you are running cockroach behind a VPN
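
For the first point, a quick way to check clock sync on a systemd-based Linux host (a sketch; adjust for your distribution):

timedatectl status

or, if ntpdate is installed, query an NTP server directly:

ntpdate -q time.google.com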

Let me know if any of the suggestions in that link solve the issue!

Hi Aditya,

Thanks for the response. The time is synchronized to Google’s NTP servers and is accurate UTC time. The deployment is a single CockroachDB node, so there is no cross-VPN cluster traffic, although the node itself is running behind a VPN.

Also, I’m not sure I understand the Azure backup URL… there’s no storage container or file share to specify in the documentation, so where does the backup even go? Knowing this might help with the problem.

The documented example is azure://employees?AZURE_ACCOUNT_KEY=123&AZURE_ACCOUNT_NAME=acme-co

So the backup URI is of the form:

scheme: azure
host: bucket-name
path: subdirectory inside the bucket
params: AZURE_ACCOUNT_KEY, AZURE_ACCOUNT_NAME

constructed URI: azure://bucket_name/subdir?paramkey=paramvalue
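
For instance, a full statement following this form could look like the below (the container name and subdirectory here are made up for illustration):

BACKUP DATABASE employees TO 'azure://acme-co-backups/2021/employees?AZURE_ACCOUNT_KEY=123&AZURE_ACCOUNT_NAME=acme-co';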

I assume by storage container you mean bucket? In case you are interested in looking at how our test runs this query, see cockroach/backup_cloud_test.go at master · cockroachdb/cockroach · GitHub.

Thanks Aditya,

So it appears that the backup command is using Azure Blob Storage, which makes sense. Azure Storage uses Containers for blobs, not Buckets like S3 (same concept, different name).

After having a look at the backup_cloud_test.go link you provided, I can see that the logic is asking for three environment variables:

AZURE_ACCOUNT_NAME
AZURE_ACCOUNT_KEY
AZURE_CONTAINER

I’ve tried specifying each of these variables in the SQL shell, to no avail. I’ve also tried exporting them as environment variables in the Linux shell, to no avail. Also, the BACKUP DATABASE command seems to require only the account name and account key parameters, whereas backup_cloud_test.go also appears to require the container name.
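
If I read the test correctly, AZURE_CONTAINER is only consumed by the test harness itself, which uses it as the host when building the URI, roughly like this (my paraphrase, not the actual test code):

azure://$AZURE_CONTAINER/backup-test?AZURE_ACCOUNT_NAME=$AZURE_ACCOUNT_NAME&AZURE_ACCOUNT_KEY=$AZURE_ACCOUNT_KEY

So the container name would just be the URI host, and BACKUP itself would never read environment variables.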

I’ve tried everything under the sun to get this to work, and nothing seems to work at all…same errors every time. Are you certain that this Azure feature actually works in the latest version?

I just tested this and verified that database backups to azure do work. However, in the process of testing it, I hit what I think is the same snag that you’ve hit above: ERROR: make storage: azure credential: illegal base64 data at input byte 59. This is down to the fact that azure’s account keys, like s3’s, are base64-encoded bytes, and thus can include +, / and = (for padding). However, when we pass the key as a URL parameter in a backup URI, it has to be url-encoded to correctly represent these characters. After url-encoding the key by hand (i.e. replacing / with %2F, + with %2B and the trailing = padding with %3D), my little test worked as expected:

# lookup key, copy/paste it
az storage account keys list --account-name=cockroachbackuptest
....
# start cockroach demo
cockroach demo
...
# backup using copied key correctly url-encoded ( + / and = replaced)
demo@127.0.0.1:26257/movr> backup database movr to 'azure://backup-test/movr?AZURE_ACCOUNT_NAME=cockroachbackuptest&AZURE_ACCOUNT_KEY=abcabcabc%2Fabcabcabc%2Babcabcabc%3D%3D'
;
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+---------
  631628035272835073 | succeeded |                  1 | 2565 |          1015 | 458371
(1 row)

Time: 1.255s total (execution 1.255s / network 0.000s)

demo@127.0.0.1:26257/movr> show backup 'azure://backup-test/movr?AZURE_ACCOUNT_NAME=cockroachbackuptest&AZURE_ACCOUNT_KEY=abcabcabc%2Fabcabcabc%2Babcabcabc%3D%3D'
;
 ....
(7 rows)

Time: 181ms total (execution 180ms / network 0ms)

demo@127.0.0.1:26257/movr>

If you try url-encoding the key printed by az storage account keys list before putting it in the BACKUP URI, does it work?
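
If it helps, one quick way to do the encoding from the shell (a sketch; any URL-encoder works, this one just uses Python’s standard library):

# url-encode the raw key ( + / and = get percent-escaped)
python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=""))' 'abcabcabc/abcabcabc+abcabcabc=='
# prints: abcabcabc%2Fabcabcabc%2Babcabcabc%3D%3D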

Hi David,

Yes, this works! After url-encoding the key, the backup worked immediately. Thanks very much for providing a solution to that problem. It would be great if this were documented.

There are many features of the backup and restore commands which I now have to cover. For restore, it seems it should be possible to copy or move backup data from its original location to a new location in order to restore to a different cluster… can you confirm this is the case?

I was able to answer the database copy/move question myself by taking a backup, copying it to another storage directory, and then restoring the database from the new location.
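
In case it helps anyone else reading this thread, the restore step was just a plain RESTORE pointed at the new location, something like (container, path, and key here are placeholders):

RESTORE DATABASE db_name FROM 'azure://backup-test/db_name_copy?AZURE_ACCOUNT_NAME=storage_account_name&AZURE_ACCOUNT_KEY=url_encoded_key';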