Import Range Replication Skew

replication
(Ron Arévalo) #21

Hey @Phil,

I think you’ve exhausted all the avenues available to you, so I’ve created an issue here.

You can follow along there for updates, and feel free to add any further input.

Thanks,

Ron

(Ron Arévalo) #22

Hey @Phil,

From the debug zip, we can see that node 9 was running the job. The logs for node 9 show the following error message a few hundred times:

E190423 10:11:07.558237 409 jobs/registry.go:327 error while adopting jobs: unable to acquire lease: job-update: split failed while applying backpressure: could not find valid split key

This usually indicates that a single row is being rewritten over and over, so the MVCC values kept for that row grow so large that the range cannot be split, which blocks any further writes to that range. That may be why the import never finishes.

If you haven’t destroyed the cluster, could you run the following from the built-in SQL client and then paste the results here?

SELECT id, length(payload), length(progress) FROM system.jobs;
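
Since both length columns come back under the same header, a variant that names them can make the output easier to read. This is only a sketch: the aliases payload_bytes and progress_bytes are made up for illustration, and it assumes system.jobs has a status column, as it does on recent versions.

```sql
-- Same check as the query above, with named columns and the largest payload first.
-- payload_bytes and progress_bytes are illustrative aliases, not built-in names.
SELECT id,
       status,
       length(payload)  AS payload_bytes,
       length(progress) AS progress_bytes
FROM system.jobs
ORDER BY payload_bytes DESC;
```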

Thanks,

Ron

(Philippe Laflamme) #23

Hey @ronarev,

Here’s the output of that query:

          id         | length  | length  
+--------------------+---------+--------+
  445377657019105289 | 2907748 |  71821  
(1 row)

Time: 19.972045ms

Thanks for opening the issue, I’ll follow along there and am happy to provide more information.

Cheers,
Philippe

(Ron Arévalo) #24

Hey @Phil,

Thank you for that. Those numbers definitely seem within an acceptable range.

We think that lowering gc.ttlseconds for the jobs table should help. The setting takes a whole number of seconds, so you can run the following to set it to 30 seconds:

ALTER TABLE system.public.jobs CONFIGURE ZONE USING gc.ttlseconds = 30;
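
To confirm that the change took effect, it should be possible to inspect the zone configuration afterwards. A minimal sketch, assuming your version supports SHOW ZONE CONFIGURATION FOR TABLE:

```sql
-- Display the zone configuration currently applied to the jobs table;
-- the GC TTL in the output should reflect the new value.
SHOW ZONE CONFIGURATION FOR TABLE system.jobs;
```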

Once you do, can you try the import again? Hopefully this change to the jobs table’s settings will allow the import to finish.

You should also set the sstsize option to 4gb on the import. Here’s an example of how to use that option, where $1 and %s are placeholders for the schema file and the data files:

IMPORT TABLE t CREATE USING $1 CSV DATA (%s) WITH sstsize = '4gb'
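
With the placeholders filled in, the statement might look like the following. The file locations are hypothetical and only illustrate the shape of the statement; substitute the schema file and CSV files you are actually importing.

```sql
-- Hypothetical file locations, shown only to illustrate where each one goes.
IMPORT TABLE t
    CREATE USING 'nodelocal:///import/t_schema.sql'   -- file holding the CREATE TABLE statement
    CSV DATA ('nodelocal:///import/t.csv')            -- one or more CSV data files
    WITH sstsize = '4gb';
```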

This should hopefully help with the import issues you’ve been having.

Thanks,

Ron

(Ron Arévalo) #25

Hey @Phil,

Also, let’s move the rest of this discussion to the GitHub issue.

Thanks,

Ron