Very slow mysql import

mysql
import

(Francois PRUD HOMME) #1

Hello,

I’m currently trying out CockroachDB for a MySQL database that has grown too large: our MySQL servers cannot scale the way CockroachDB can.
I tried importing a dev version of the database in question: a 31 MB .sql file… imported in 41 seconds without any problem :slight_smile:
Now I’m trying with the production database… an 82 GB .sql file… still not finished after ~21h.

I have two questions:

  • How can we speed up a large import like this one? (we have a 3-node dev cluster, started directly from the binary)
  • Can we see the import progress somewhere?

I’m on version 2.1.0.
One node was started with "start --insecure".
The two others with "start --join=firstnode --insecure".
The .sql file was put in "cockroach-data/extern/local/".
The import was launched as in the documentation: "import mysqldump 'nodelocal:///local/file.sql';"
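Put together, the setup looks roughly like this (a sketch only; "firstnode" is a placeholder hostname, not our real one):

    # start the first node, then join the two others to it
    cockroach start --insecure --host=firstnode
    cockroach start --insecure --join=firstnode
    cockroach start --insecure --join=firstnode

    # the .sql dump goes under a node's extern directory so nodelocal:// can see it
    mkdir -p cockroach-data/extern/local
    cp file.sql cockroach-data/extern/local/

    # then launch the import from a SQL shell on that node
    cockroach sql --insecure -e "IMPORT MYSQLDUMP 'nodelocal:///local/file.sql';"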


(Francois PRUD HOMME) #2

For information, it has now finished :slight_smile::

root@:26257/test> import mysqldump 'nodelocal:///local/file.sql';

           job_id        |  status   | fraction_completed |    rows    | index_entries | system_records |    bytes
    +--------------------+-----------+--------------------+------------+---------------+----------------+--------------+
      397995641726631937 | succeeded |                  1 | 1024628573 |    1605325442 |              0 |  98904246724
    (1 row)

    Time: 23h32m11.293565694s

Only 23h32… any idea how to speed this up?


(Ron Arévalo) #3

Hey @fprudhom,

Could you provide us with some more information regarding the import? Can you tell me how many tables you were importing, how many indexes, the schemas?

Thanks,

Ron


(Ron Arévalo) #4

Hey @fprudhom,

Just wanted to reach back out again and provide an alternate solution to importing. We have a CSV import that is at least 4x faster than the mysqldump import. The reason behind this is that the CSV import takes advantage of multiple cores if they are available, while the mysqldump import is restricted to one core for approximately half of the work.

If you wanted to try the CSV import out, the syntax looks like this:

IMPORT TABLE <TableName> CREATE USING 'location://ddl.sql' CSV DATA ('file1.csv', 'file2.csv', etc...)
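For example, a hypothetical sketch (the table name, DDL file, and CSV file names are placeholders, staged under a node's extern directory as before; the DDL file just holds the CREATE TABLE statement for that table):

    -- hypothetical example: table "orders", schema in a separate DDL file,
    -- data split across two CSV files so the work can spread over cores/nodes
    IMPORT TABLE orders
    CREATE USING 'nodelocal:///local/orders_ddl.sql'
    CSV DATA (
        'nodelocal:///local/orders.1.csv',
        'nodelocal:///local/orders.2.csv'
    )
    WITH nullif = '';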

You can also read more on this import statement here.

We also have a GitHub issue regarding the mysqldump import that you can follow and +1 here.

Let me know if you have any questions.

Thanks!


(Francois PRUD HOMME) #5

Hi, there are 84 tables (all have indexes and foreign keys). I can’t give you all the details due to confidentiality constraints.
We have 4 big tables: 2 with 40 GB of data and ~400k rows, and 2 others with 10 GB and ~100k rows.


(Francois PRUD HOMME) #6

Many thanks for the answer about the issue, I will follow it.
I will try a CSV import and give you the results in a few days.

Thanks again :slight_smile:


(Ron Arévalo) #7

Yes, please let us know how it goes!