Sometimes my 700,000-row table seems empty!

I was following the CockroachDB instructions and now have three nodes running on my Debian machine, securely as it happens.

I’d like to investigate using CockroachDB from Node.js, so I knocked up a quick test using pg that basically does insertions into the example bank.accounts table 100 times per second, like so:

// pool is a pg Pool connected to the cluster (the full connection config is
// shown further down the thread) and id starts at 0.
let id = 0

setInterval(function () {
  id = id + 1
  const query = {
    text: 'INSERT INTO accounts(id, balance) VALUES($1, $2)',
    values: [id, 666]
  }
  pool.query(query, (err, res) => {
    if (err) {
      console.log(err.stack)
    } else {
      console.log(res.rows)
    }
  })
}, 10)

Meanwhile I watch the progress in the CockroachDB SQL shell every now and then:

root@:26257/> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+--------+---------+
|   id   | balance |
+--------+---------+
| 788130 |     666 |
+--------+---------+
(1 row)
root@:26257/> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+----+---------+
| id | balance |
+----+---------+
+----+---------+
(0 rows)
root@:26257/> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+--------+---------+
|   id   | balance |
+--------+---------+
| 875072 |     666 |
+--------+---------+
(1 row)

See what happened there? There were 788130 rows, then none, then 875072.

I’m no SQL expert, so maybe there is something obvious I am missing here. As a non-expert I find it rather surprising.

By the way, having watched a lot of CockroachDB videos and read a lot of the excellent documentation, I’m really rooting for CockroachDB to be a success. It’s just that in my naivety I have a habit of unearthing the silliest bugs!

Oops, it just happened again:

root@:26257/> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+---------+---------+
|   id    | balance |
+---------+---------+
| 1101544 |     666 |
+---------+---------+
(1 row)
root@:26257/> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+----+---------+
| id | balance |
+----+---------+
+----+---------+
(0 rows)
root@:26257/> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+---------+---------+
|   id    | balance |
+---------+---------+
| 1102235 |     666 |
+---------+---------+
(1 row)
  1. Which version are you using?

  2. Can you look at the log files and see if any suspicious messages were generated around the time the issue happened? If so, that could help troubleshoot the issue.

  3. Are you able to reliably reproduce this issue? If so, please:

    • file an issue on github, and provide details;
    • try to use “set distsql=off;” once in the session where you observe the changes with SELECT. Does this make the problem go away?
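
Concretely, that last suggestion would look something like this in the interactive shell, reusing your exact query (just a sketch):

SET distsql = off;
SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
-- repeat the SELECT a number of times in this same session and check
-- whether the empty result still shows up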

Side note: are you running two or more nodes on one physical computer? If so, this is rather discouraged. It will impact performance negatively, use more memory (and increase chances of crashes due to memory limitations) and not provide any redundancy benefits.

This is my version info:

v1.0.4
Build Time:   2017/07/27 17:54:36  
Distribution: CCL
Platform:     linux amd64
Go Version:   go1.8.3
C Compiler:   gcc 6.3.0
Build SHA-1:  5b757262d33d814bda1deb2af20161a1f7749df3
Build Type:   release

I have tried looking at the logs…

I170821 12:24:37.689068 26 server/status/runtime.go:225  [n1] runtime stats: 102 MiB RSS, 168 goroutines, 29 MiB/16 MiB/57 MiB GO alloc/idle/total, 16 MiB/37 MiB CGO alloc/total, 2962.98cgo/sec, 0.38/0.10 %(u/s)time, 0.00 %gc (4x)
I170821 12:24:47.689177 26 server/status/runtime.go:225  [n1] runtime stats: 104 MiB RSS, 166 goroutines, 44 MiB/3.8 MiB/57 MiB GO alloc/idle/total, 18 MiB/38 MiB CGO alloc/total, 2923.49cgo/sec, 0.37/0.10 %(u/s)time, 0.00 %gc (3x)
I170821 12:24:57.689063 26 server/status/runtime.go:225  [n1] runtime stats: 105 MiB RSS, 171 goroutines, 22 MiB/21 MiB/57 MiB GO alloc/idle/total, 19 MiB/39 MiB CGO alloc/total, 2972.11cgo/sec, 0.41/0.09 %(u/s)time, 0.00 %gc (5x)
I170821 12:25:07.689872 26 server/status/runtime.go:225  [n1] runtime stats: 97 MiB RSS, 166 goroutines, 29 MiB/17 MiB/57 MiB GO alloc/idle/total, 14 MiB/37 MiB CGO alloc/total, 2948.94cgo/sec, 0.41/0.10 %(u/s)time, 0.00 %gc (4x)
W170821 12:25:11.336183 85 storage/engine/rocksdb.go:1220  batch [3/976/0] commit took 686.083334ms (>500ms):
goroutine 85 [running]:
runtime/debug.Stack(0x196f740, 0x27893a0, 0xc420016508)
        /usr/local/go/src/runtime/debug/stack.go:24 +0x79
github.com/cockroachdb/cockroach/pkg/storage/engine.(*rocksDBBatch).commitInternal(0xc421af9d40, 0x1, 0x0, 0x0)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/engine/rocksdb.go:1220 +0x14d
github.com/cockroachdb/cockroach/pkg/storage/engine.(*rocksDBBatch).Commit(0xc421af9d40, 0x7f4c23b82501, 0x2bf6b40, 0x2794640)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/engine/rocksdb.go:1167 +0x432
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).handleRaftReadyRaftMuLocked(0xc420daa300, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:2831 +0x533
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).handleRaftReady(0xc420daa300, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:2707 +0xe0
github.com/cockroachdb/cockroach/pkg/storage.(*Store).processReady(0xc420450800, 0x16)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/store.go:3409 +0x131
github.com/cockroachdb/cockroach/pkg/storage.(*raftScheduler).worker(0xc42039e2d0, 0xc42056d5e0)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/scheduler.go:217 +0x270
github.com/cockroachdb/cockroach/pkg/storage.(*raftScheduler).Start.func2(0x7f4c1a0f9400, 0xc4202ea510)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/scheduler.go:168 +0x33
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunWorker.func1(0xc4201742f0, 0xc42056d5e0, 0xc4201f3d60)
        /go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:215 +0xf7
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunWorker
        /go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:216 +0xad
W170821 12:25:11.336967 85 storage/store.go:3422  [n1,s1,r22/1:/{Table/51-Max}] handle raft ready: 0.7s [processed=0]
W170821 12:25:11.345754 78 storage/store.go:3422  [n1,s1,r2/1:/System/{-tsd}] handle raft ready: 0.6s [processed=0]
I170821 12:25:17.690146 26 server/status/runtime.go:225  [n1] runtime stats: 92 MiB RSS, 175 goroutines, 33 MiB/13 MiB/57 MiB GO alloc/idle/total, 15 MiB/36 MiB CGO alloc/total, 2820.67cgo/sec, 0.36/0.08 %(u/s)time, 0.00 %gc (4x)
I170821 12:25:27.689377 26 server/status/runtime.go:225  [n1] runtime stats: 94 MiB RSS, 175 goroutines, 34 MiB/13 MiB/57 MiB GO alloc/idle/total, 16 MiB/37 MiB CGO alloc/total, 2818.30cgo/sec, 0.36/0.09 %(u/s)time, 0.00 %gc (4x)
I170821 12:25:37.700921 26 server/status/runtime.go:225  [n1] runtime stats: 93 MiB RSS, 176 goroutines, 33 MiB/13 MiB/57 MiB GO alloc/idle/total, 17 MiB/37 MiB CGO alloc/total, 2796.51cgo/sec, 0.36/0.10 %(u/s)time, 0.00 %gc (4x)
I170821 12:25:37.702387 25 gossip/gossip.go:448  [n1] gossip status (ok, 3 nodes)
gossip client (0/3 cur/max conns)
gossip server (2/3 cur/max conns, infos 4979/261 sent/received, bytes 528299B/38220B sent/received)

That “runtime/debug.Stack” thing looks suspicious. I’m not sure it correlates with my queries, though.

I cannot trigger this behavior at will. It just happens every ten or twenty queries. Sometimes the erroneous result is not zero rows but some other random number.

Setting “set distsql=off;” does not change anything. It still happens.

I am running three nodes on a single Debian machine. I do realize this is a bit silly, but I’m just kicking the tires of CockroachDB for the first time and following the introductory instructions on the web site.

My next step was to spin this up on three Google Cloud instances. Well, I’ll probably continue with that plan anyway.

I will file an issue on github.

The debug.Stack stuff is just some tracing that fires when things are slower than they should be. It’s (probably) not significant here. Are you running on an SSD or a spinning hard disk?

How did you create the table? Is the id column the primary key?

Sometimes the erroneous result is not zero rows but some other random number.

What exactly do you mean here? A value other than 666? A different ID?

As an experiment, if you change the LIMIT of your select query from 1 to 10, what happens? Is it always either 10 rows or 0, or is it sometimes in between?

I just blew away all the data directories and certs and recreated everything from scratch. The problem persists.

The database was created with the SQL shell as per the example in the web site tutorial. The id column is the primary key:

CREATE DATABASE bank;
SET database = 'bank';
CREATE TABLE accounts (id INT PRIMARY KEY, balance DECIMAL);

Sorry, I was not very clear. What I meant was that the error mostly shows up as an empty result, like so:

root@:26257/bank> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+----+---------+
| id | balance |
+----+---------+
+----+---------+
(0 rows)
root@:26257/bank> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+--------+---------+
|   id   | balance |
+--------+---------+
| 332406 |     666 |
+--------+---------+

But on a couple of occasions it has shown up as one row with an id that is much too low.

After hammering on the shell for a minute just now I could not get such a result.

Changing the LIMIT to 10 did not help:

root@:26257/bank> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 10;
+----+---------+
| id | balance |
+----+---------+
+----+---------+
(0 rows)
root@:26257/bank> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 10;
+--------+---------+
|   id   | balance |
+--------+---------+
| 613719 |     666 |
| 613718 |     666 |
| 613717 |     666 |
| 613716 |     666 |
| 613715 |     666 |
| 613714 |     666 |
| 613713 |     666 |
| 613712 |     666 |
| 613711 |     666 |
| 613710 |     666 |
+--------+---------+

If I have a minute I will try to automate this query with pg to make things more easily reproducible.

Oh, this is on an SSD.

@heater What you are experiencing is very odd. I just tried, unsuccessfully, to reproduce it using the kv load generator we’ve written, running as kv --concurrency=1 --sequential. This is similar to the sequential insertion you’re performing. Then I ran the following script:

while :; do
    val=$(./cockroach sql --insecure -e "SELECT k FROM test.kv ORDER BY k DESC LIMIT 1;" --format raw | grep -v '#')
    if [ -z "${val}" ]; then
	echo "empty result"
    fi
done

As expected, I never saw an “empty result” message despite getting up to millions of rows in the table. One difference is that I was testing on 1.1-alpha. I’ll try to test 1.0.4 later, but I’d be very surprised if this is a bug we’ve already fixed.

Do you mind providing the output of EXPLAIN (PLAN) SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1?

The EXPLAIN output looks like this:

root@:26257/> EXPLAIN (PLAN) SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+-------+---------+-------+------------------+
| Level |  Type   | Field |   Description    |
+-------+---------+-------+------------------+
|     0 | limit   |       |                  |
|     1 | revscan |       |                  |
|     1 |         | table | accounts@primary |
|     1 |         | spans | ALL              |
|     1 |         | limit |                1 | 
+-------+---------+-------+------------------+
(5 rows)

After another session of banging on “up arrow” and “return” in the SQL shell I got the following interesting results:

root@:26257/> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+---------+---------+
|   id    | balance |
+---------+---------+
| 3584904 |     666 |
+---------+---------+
(1 row)
root@:26257/> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+---------+---------+
|   id    | balance |
+---------+---------+
| 3584977 |     666 |
+---------+---------+
(1 row)
root@:26257/> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+---------+---------+
|   id    | balance |
+---------+---------+
| 2403012 |     666 |
+---------+---------+
(1 row)
root@:26257/> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+---------+---------+
|   id    | balance |
+---------+---------+
| 2403012 |     666 |
+---------+---------+
(1 row)
root@:26257/> SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;
+---------+---------+
|   id    | balance |
+---------+---------+
| 3585146 |     666 |
+---------+---------+
(1 row)

If any of you guys are near Mountain View you are welcome to pick up the very machine this is running on for analysis.

Note that @heater is using the same client session to run select k from … order by … limit, while your script starts a new session every time. Just in case this makes a difference and helps reproduce it on your end.
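
If it helps, here is a rough sketch of a single-session variant of that script; it assumes cockroach sql executes statements piped on stdin within one session, and that raw format prints one non-# line per returned value, as in the script above:

N=1000
rows=$(for i in $(seq 1 $N); do
    echo "SELECT k FROM test.kv ORDER BY k DESC LIMIT 1;"
done | ./cockroach sql --insecure --format raw | grep -cv '#')
if [ "$rows" -ne "$N" ]; then
    echo "got $rows values for $N queries - some results were empty"
fi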

Yes.

It did occur to me that the issue is not with the DB as such but with the SQL shell.

With the following code I can get empty results pretty often:

const fs = require('fs')

const config = {
  host: 'localhost',
  port: 26257,
  user: 'root',
  database: 'bank',

  ssl: {
    rejectUnauthorized: false,
    ca: fs.readFileSync('/home/rsm/certs/ca.crt').toString(),
    key: fs.readFileSync('/home/rsm/certs/client.root.key').toString(),
    cert: fs.readFileSync('/home/rsm/certs/client.root.crt').toString()
  }
}

const { Pool } = require('pg')

const pool = new Pool(config)

let previousId = 0;

setInterval(function () {
  pool.query('SELECT * FROM bank.accounts ORDER BY id DESC LIMIT 1;', (err, res) => {
    let id = 0;
    if (err) {
      throw err
    }
    if (res.rows.length < 1) {
      console.log('Error: Empty result');
    } else {
      id = res.rows[0].id
      if (id < previousId) {
        console.log('Error: Low result')
      } else {
        previousId = id
      }
      console.log(res.rows[0].id)
    }
  })
}, 100)

Produces:

176183
176193
176203
Error: Empty result
176222
176232
176241
176251
176261
Error: Empty result
176286
Error: Empty result
176301
176310
176320
176330
176339
176349
176358
176368
176378
Error: Empty result
176398
176407
176417
176426
176436
176446
176456
176466
176475
176484
Error: Empty result
176504
176514
176523
176533
176543
176553
176563
176572
176581
176590
176603
Error: Empty result
176603
176603
176603
Error: Empty result
Error: Empty result
176630
176644
176659
176675

I have also seen the expected “Low result” error, though not just now.

After running the above reader code overnight and logging the results I do find the out-of-sequence (low) id error:

624387
624396
624406
Error: Empty result
624424
624434
624444
624453
624463
624473
624483
624492
624502
624512
624522
624532
624541
624551
624561
624571
624580
624590
624599
Error: Low result
624598
Error: Low result
624598
Error: Low result
624598
Error: Low result
624598
Error: Low result
624598
Error: Low result
624598
Error: Low result
624598
Error: Low result
624598
Error: Low result
624598
624607
624607
624609
624609
624610
624623
624639

@heater Thanks for the reproduction details. We’ll try to reproduce shortly.

Ok, I reproduced the “empty result” behavior on 1.0.4 using the script I shared above and using kv --concurrency=1 --sequential. And I verified that this issue doesn’t appear in 1.1-alpha.

@nathan I’m told you might have some insight here about an issue that was fixed where we weren’t returning results for a query. Can you take a look? At worst, we could bisect to find out when this was fixed. It takes less than a minute for the “empty result” problem to appear.
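
For reference, the bisect could look roughly like this; the ref names are illustrative, with master standing in for wherever the empty results no longer reproduce:

git bisect start --term-old=broken --term-new=fixed
git bisect broken v1.0.4
git bisect fixed master
# at each step: rebuild, run kv --concurrency=1 --sequential plus the
# empty-result check script for a minute or two, then mark the commit
# with "git bisect broken" or "git bisect fixed"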

@peter I’m not completely sure which issue you’re talking about, but yeah, I’ll take a look.

@knz was the one who suggested you might have some insight here.

@peter Using your script, I’m still able to reproduce this on master.

@heater Did you open an issue for this on GitHub? If so, could you link to it here?

@nathan That’s surprising. I left my script running for 20 minutes and didn’t see a reproduction on master. To be precise, I was running on a branch off master, but I don’t think the changes there would have fixed anything.