Table data unavailable after UPDATE

Hello,

I’m trying out CockroachDB for a new service. I configured a three-node cluster and started populating a table with about 500K rows. Then I realized I had missed a column, so I added it and ran a series of UPDATEs to fill in the missing data.
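For context, the schema change and backfill looked roughly like the following. This is only a sketch with placeholder table and column names (mydb.events, source, id), not my actual schema:

    -- hypothetical names; the table already held ~500K rows at this point
    ALTER TABLE mydb.events ADD COLUMN source STRING;

    -- then a series of UPDATEs to backfill the new column, along these lines:
    UPDATE mydb.events SET source = 'import' WHERE id BETWEEN 1 AND 100000;
    UPDATE mydb.events SET source = 'import' WHERE id BETWEEN 100001 AND 200000;
    -- ...and so on for the remaining rows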

Now whenever I SELECT from the table (or even when I attempt to dump it using cockroach dump), the client just blocks forever.
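Concretely, both of the following hang indefinitely with no error (again with placeholder names; the --insecure flag just reflects how my test cluster is set up):

    -- from the SQL shell: never returns
    SELECT count(*) FROM mydb.events;

    -- from the command line:
    --   cockroach dump mydb events --insecure > events.sql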

Every 10 seconds I see the following in cockroach.log:

W170808 12:03:51.383160 72881 storage/replica.go:1732  [n1,s1,r24/1:/Table/51/1/2677{3987…-4397…}] context deadline exceeded while in command queue: ResolveIntent [/Table/51/1/267741293247168513/0,/Min), ResolveIntent [/Table/51/1/267741293250150401/0,/Min), ResolveIntent [/Table/51/1/267741293268336641/0,/Min), ResolveIntent [/Table/51/1/267741293302480897/0,/Min), ResolveIntent [/Table/51/1/267741293303529473/0,/Min), ResolveIntent [/Table/51/1/267741293305397249/0,/Min), ResolveIntent [/Table/51/1/267741293307133953/0,/Min), ResolveIntent [/Table/51/1/267741293345538049/0,/Min), ResolveIntent [/Table/51/1/267741293469761537/0,/Min), ResolveIntent [/Table/51/1/267741293470810113/0,/Min), ResolveIntent [/Table/51/1/267741293782827009/0,/Min), ResolveIntent [/Table/51/1/267741293797572609/0,/Min), ResolveIntent [/Table/51/1/267741293850624001/0,/Min), ResolveIntent [/Table/51/1/267741293870776321/0,/Min), ResolveIntent [/Table/51/1/267741293871136769/0,/Min), ResolveIntent [/Table/51/1/267741293876871169/0,/Min), ResolveIntent [/Table/51/1/267741293878509569/0,/Min), ResolveIntent [/Table/51/1/267741293889585153/0,/Min), ResolveIntent [/Table/51/1/267741293910884353/0,/Min), ResolveIntent [/Table/51/1/267741293912621057/0,/Min), ... 75 skipped ..., ResolveIntent [/Table/51/1/267741295738454017/0,/Min), ResolveIntent [/Table/51/1/267741295741304833/0,/Min), ResolveIntent [/Table/51/1/267741295754969089/0,/Min), ResolveIntent [/Table/51/1/267741295763488769/0,/Min), ResolveIntent [/Table/51/1/267741295781543937/0,/Min)
W170808 12:03:51.383408 71980 storage/replica.go:1756  [n1,s1,r24/1:/Table/51/1/2677{3987…-4397…}] have been waiting 1m0s for dependencies: cmds:
[{global:<nil> local:<nil>} {global:0xc42b54a000 local:0xc42b54d000}]
global:
  534090 [/Table/51/1/267741288403075073/0)
  537221 [/Table/51/1/267741288403075073/0)
  548028 [/Table/51/1/267741288403075073/0)
  553078 [/Table/51/1/267741288403075073/0)
  555502 [/Table/51/1/267741288403075073/0)
  557623 [/Table/51/1/267741288403075073/0)
  559744 [/Table/51/1/267741288403075073/0)
  561663 [/Table/51/1/267741288403075073/0)
  563380 [/Table/51/1/267741288403075073/0)
  564895 [/Table/51/1/267741288403075073/0)
  ...remaining 62491 writes omitted
local:  531969 [/Local/RangeID/24/r/AbortCache/"00f048b4-8961-4d0b-b63b-7476f2c49054")
  531970 [/Local/RangeID/24/r/AbortCache/"00f048b4-8961-4d0b-b63b-7476f2c49054")
  531971 [/Local/RangeID/24/r/AbortCache/"00f048b4-8961-4d0b-b63b-7476f2c49054")
  531972 [/Local/RangeID/24/r/AbortCache/"00f048b4-8961-4d0b-b63b-7476f2c49054")
  531973 [/Local/RangeID/24/r/AbortCache/"00f048b4-8961-4d0b-b63b-7476f2c49054")
  531974 [/Local/RangeID/24/r/AbortCache/"00f048b4-8961-4d0b-b63b-7476f2c49054")
  531975 [/Local/RangeID/24/r/AbortCache/"00f048b4-8961-4d0b-b63b-7476f2c49054")
  531976 [/Local/RangeID/24/r/AbortCache/"00f048b4-8961-4d0b-b63b-7476f2c49054")
  531977 [/Local/RangeID/24/r/AbortCache/"00f048b4-8961-4d0b-b63b-7476f2c49054")
  531978 [/Local/RangeID/24/r/AbortCache/"00f048b4-8961-4d0b-b63b-7476f2c49054")
  ...remaining 72490 writes omitted

The counts in the “...remaining N writes omitted” lines vary each time the warning is printed (sometimes lower, sometimes higher).

I could just wipe the systems and start over, but I’m curious as to what could have caused this.

New tables and databases work fine.

Cheers,

Christian

Thanks for reporting this. We’re investigating internally. It would help if you could condense the steps to reproduce this problem into a smaller example and file a GitHub issue.