What is the fastest way to load 500 million+ rows into CockroachDB?

I tried loading via a SQL script (an export from mysqldump), but the inserts are super slow, roughly 1M rows/day.
At this rate I'd need more than a year to fully load the table. Is the experimental import tool much faster?


Hi @wilfred, yes! The CSV import capability is much faster; it goes through a different pathway to bulk-load data. As of 2.0, CSV import is no longer considered experimental, since you no longer need to set up a temporary file store. Either way, both should work. Let us know if you hit issues.
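For reference, an IMPORT statement looks roughly like the sketch below. The table name, columns, and CSV URL are placeholders, and the exact options can vary by version, so treat it as a starting point rather than the definitive syntax. The CSV file just needs to be reachable over HTTP from the nodes.

-- Placeholder schema and URL; point CSV DATA at a file your nodes can reach.
IMPORT TABLE orders (
    id INT PRIMARY KEY,
    customer_id INT,
    amount DECIMAL
)
CSV DATA ('http://fileserver:3000/orders.csv');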

Tried it using nginx as the temp store, but it errored out with:

413 Request Entity Too Large

Is it an nginx issue or an import tool issue?

That’s an nginx issue. Try using this guide to set up a correctly configured file server: https://www.cockroachlabs.com/docs/stable/create-a-file-server.html

Thanks Matt. Yes, that's the guide I used; I copied the config from the guide. I'll try NFS and see how things go.

Did you try Caddy instead? That may work. Sorry the nginx config didn't work; we haven't tested it with very large files. I've opened https://github.com/cockroachdb/docs/issues/2845 to track fixing the nginx docs for large files.

Got it to work. I needed to add the line below to the nginx config:

client_max_body_size 0;

Successfully imported a file with 350+ million rows. I'll try a larger file and see how that goes.

For v2, can we import multiple smaller CSV files in batches into the same table? And can we also run several imports into different tables in parallel?
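To make the first question concrete, I'm imagining something like the sketch below, assuming a single IMPORT can accept a list of CSV URLs; the table, columns, and file names are just placeholders.

-- Hypothetical: one large table split into smaller CSV parts,
-- all loaded into the same target table.
IMPORT TABLE orders (
    id INT PRIMARY KEY,
    customer_id INT,
    amount DECIMAL
)
CSV DATA (
    'http://fileserver:3000/orders_part1.csv',
    'http://fileserver:3000/orders_part2.csv',
    'http://fileserver:3000/orders_part3.csv'
);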

thanks Matt