Question about DUMP and IMPORT

Hi, everyone,

Recently I have been testing cockroach dump and IMPORT, and I ran into some problems:

1. cockroach dump seems to work in a single session, and I cannot choose how many rows go into each INSERT statement (am I missing something here?).

2. When I use cockroach sql to import the .sql file that cockroach dump generated, it also seems to work in a single session, and it can only insert 100 rows at a time, sequentially. So the import speed is really slow: inserting 10,000,000 rows takes about 1 hour.

3. I also tried to use IMPORT … PGDUMP DATA … to import the .sql file (10,000,000 rows) that cockroach dump generated. It got stuck but didn't throw any error (Active SQL Queries in the Admin UI is 0).

Here is my IMPORT command:

IMPORT TABLE sbtest1 (
  id INT PRIMARY KEY,
  k INT NOT NULL,
  c STRING DEFAULT '0',
  pad STRING DEFAULT ''
)
PGDUMP DATA ('http://localhost:2015/sbtest1.sql');

4. When I use IMPORT … MYSQLDUMP DATA … to import the .sql file (400,000 rows) that mysqldump generated, a weird thing happened: the result shows the command executed successfully, but the target table is still empty!

Here is the result (the command reported success).

Here is the number of rows in the table:

12

And here is my IMPORT command:

IMPORT TABLE UC_USER (
  ID INTEGER,
  USER_NAME STRING(400),
  USER_PWD STRING(800),
  BIRTHDAY DATE,
  NAME STRING(800),
  USER_ICON STRING(2000),
  SEX STRING(4),
  NICKNAME STRING(800),
  STAT STRING(40),
  USER_MALL INTEGER,
  LAST_LOGIN_DATE DATE,
  LAST_LOGIN_IP STRING(400),
  SRC_OPEN_USER_ID INTEGER,
  EMAIL STRING(800),
  MOBILE VARCHAR(200),
  IS_DEL CHAR(4),
  IS_EMAIL_CONFIRMED VARCHAR(4),
  IS_PHONE_CONFIRMED VARCHAR(4),
  CREATER INT,
  CREATE_DATE DATE,
  UPDATE_DATE DATE,
  PWD_INTENSITY VARCHAR(4),
  MOBILE_TGC VARCHAR(256),
  MAC VARCHAR(256),
  SOURCE VARCHAR(4),
  ACTIVATE VARCHAR(4),
  ACTIVATE_TYPE VARCHAR(4),
  IS_LIFE VARCHAR(4)
)
MYSQLDUMP DATA ('http://localhost:2015/user.sql');

Many thanks for your help!

Hello,

First of all, thanks for your interest in our bulk I/O features.

1. Yes, it currently works in a single session. You can change the number of rows per INSERT statement by editing the constant insertRows in pkg/cli/dump.go and re-building the cockroach binary.
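For reference, this is a one-line change. The exact declaration may differ between versions, but in pkg/cli/dump.go it looks roughly like this (the default of 100 matches the batch size you observed):

const insertRows = 100 // number of rows batched into each INSERT; raise this and rebuild for larger batches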

2. Yes, it uses a single session. Changing the number of rows per INSERT may improve throughput somewhat, but because it uses a single session the performance will always be limited. This is why we recommend using IMPORT instead.

3. IMPORT … PGDUMP is not compatible with the output of cockroach dump; it was designed for files generated by pg_dump.

If you want to use IMPORT with data generated from a CockroachDB database, do as follows:

  • use cockroach sql --format=csv -e 'select * from ...' to dump one table at a time in CSV format,
  • then use IMPORT … CSV, as in the sketch below.
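For example, using your sbtest1 table from above, the two steps could look like this (the database name, the --insecure flag, and the file server URL are assumptions on my part; adjust them for your setup). The tail -n +2 strips the header row that --format=csv prints:

cockroach sql --insecure --database=sbtest --format=csv -e 'SELECT * FROM sbtest1;' | tail -n +2 > sbtest1.csv

IMPORT TABLE sbtest1 (
  id INT PRIMARY KEY,
  k INT NOT NULL,
  c STRING DEFAULT '0',
  pad STRING DEFAULT ''
)
CSV DATA ('http://localhost:2015/sbtest1.csv');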

You can accelerate IMPORT CSV by splitting the CSV files into multiple files that can be imported in parallel.
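For example, you can split a headerless CSV with the standard split utility and pass all the chunks to a single IMPORT statement, which processes them in parallel (file and chunk names here are placeholders):

split -l 1000000 sbtest1.csv chunk_    # 1,000,000 lines per file: chunk_aa, chunk_ab, ...

IMPORT TABLE sbtest1 (
  id INT PRIMARY KEY,
  k INT NOT NULL,
  c STRING DEFAULT '0',
  pad STRING DEFAULT ''
)
CSV DATA (
  'http://localhost:2015/chunk_aa',
  'http://localhost:2015/chunk_ab'
);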

4. This looks like it may be a bug in IMPORT … MYSQLDUMP. Please file an issue on GitHub and provide as much detail as necessary for us to reproduce it.

I hope this helps!

I forgot to mention: the fastest way to transfer data from one CockroachDB cluster to another is via BACKUP + RESTORE.
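As a sketch (the database name and the nodelocal destination are placeholders, and BACKUP/RESTORE require an enterprise license):

-- on the source cluster
BACKUP DATABASE sbtest TO 'nodelocal:///backups/sbtest';

-- on the destination cluster, reading from the same backup location
RESTORE DATABASE sbtest FROM 'nodelocal:///backups/sbtest';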

Hi @knz ,

Your reply really helped me a lot!

I'll test BACKUP + RESTORE later and discuss with my team to decide whether or not to purchase a license. I have also opened issue #28036 on GitHub for the IMPORT … MYSQLDUMP problem.

Thanks a lot for your help!


Thank you for filing the issue!