Import in 1.2-alpha.20171204

(Alma) #21

Hi @mjibson,
I am running v2.1.3 and I am seeing this error too.

I am trying to load a single-table, 29 GB SQL dump file from nodelocal. I tried twice. Both times the job progressed to about 20% and then failed with the exact same message:
Error: pq: IO error: While open a file for appending: /localdisk/cockroachdb/cockroach-tmp/cockroach-temp024783182/000006.log: No such file or directory
Failed running “sql”

In the log file on the node I was executing the statement on, I see:
E190208 15:23:31.486491 143216679 storage/engine/disk_map.go:273 [n2,import-distsql] unable to clear range with prefix [140]: IO error: While open a file for appending: /localdisk/cockroachdb/cockroach-tmp/cockroach-temp024783182/000006.log: No such file or directory
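
For anyone scanning their own logs for the same failure, here is a minimal sketch that splits a log line like the one above into fields and pulls out the temp path. The field layout is an assumption inferred from this single example, and `parse_log_line` and `find_temp_paths` are hypothetical helper names, not part of any CockroachDB tooling.

```python
import re

# Assumed layout, inferred from the one example line above:
# <severity><yymmdd> <time> <goroutine id> <file>:<line> [<tags>] <message>
LOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{6}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6}) "
    r"(?P<goroutine>\d+) "
    r"(?P<file>[^\s:]+):(?P<line>\d+) "
    r"\[(?P<tags>[^\]]*)\] "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def find_temp_paths(message):
    """Return every path in the message that lies under a cockroach-temp<digits> directory."""
    return re.findall(r"/\S*cockroach-temp\d+/[^\s:]+", message)
```

Running `parse_log_line` over each line of cockroach.log and keeping the entries whose `severity` is `E` gives a quick list of which nodes (from `tags`) reported the missing file.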

The path /localdisk/cockroachdb/cockroach-tmp/cockroach-temp024783182/ does not exist on any of the nodes. The nodes were not restarted, and nobody else was using the cluster during the load (I am the only user). I was able to load smaller tables successfully.
I run on 3 bare-metal hosts. I abused them with TPC-C and they seemed to be stable.

The data is in
/localdisk/cockroachdb/cockroach-data/extern -> /NFS/dbdumps

Let me know what other info would be useful.


(Matt Jibson) #22

What mount type is /export/data? Is it just a normal directory on a local disk, or NFS or some other network storage?


(Alma) #23

I edited the original post to specify storage types.

I tried splitting the dump into smaller files. The error does not seem to depend on the file size or on the data in the file. Sometimes the same file will throw the error on one node and load on a different one. Sometimes a file that previously loaded correctly will not load.
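
To make the splitting step reproducible, here is a rough sketch of a split that cuts only at line boundaries. It is an illustration, not the exact procedure used above: `split_dump`, the chunk file naming, and the default chunk size are all hypothetical, and it assumes every statement in the dump fits on a single line. It also does not repeat any schema statements in the later chunks.

```python
import os


def split_dump(src_path, out_dir, max_bytes=1 << 30):
    """Split src_path into numbered chunk files of roughly max_bytes each.

    Cuts happen only at line boundaries, so a line is never split across
    two chunks. Returns the list of chunk paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Start a new chunk if none is open yet or the current one is full.
            if out is None or written >= max_bytes:
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, "chunk_%05d.sql" % len(chunk_paths))
                out = open(path, "wb")
                chunk_paths.append(path)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return chunk_paths
```

Concatenating the chunks in order reproduces the original file byte for byte, which makes it easy to confirm that a failing chunk was not damaged by the split itself.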

I tried manually creating the path it reports as nonexistent, just to see what happens :) Nothing changed: the path is still reported as nonexistent. It is always the same path.

I also tried doing

SET CLUSTER SETTING kv.bulk_io_write.max_rate = '10MB';

as specified here, but I did not notice any difference in the error behavior.

The cockroach.log contains the same error. I do not see anything else that would indicate what the actual error is. Is there a way to get more information about what is happening?