Cockroach Speed

Hi All,

Just testing CockroachDB labs. Product looks amazing but I’m probably doing something way off because the numbers I’m getting look a bit disappointing. I’m testing locally on a Windows10 machine.

MySQL: 0.002 s
CockroachDB: 38.605s

I know that databases are competing in a different universe but wasn’t expecting such a drop. The table is pretty simple:

CREATE TABLE IF NOT EXISTS `data_bucket` (
  `time` decimal(17,3) NOT NULL DEFAULT '0.000',
  `provider` enum('A', 'B', 'C') NOT NULL,
  `instrument` varchar(24) NOT NULL,
  `bucket` enum('1S','1M','1H','1D') NOT NULL,
  `open` double unsigned NOT NULL,
  `high` double unsigned NOT NULL,
  `low` double unsigned NOT NULL,
  `close` double unsigned NOT NULL,
  PRIMARY KEY (`time`,`provider`,`instrument`,`bucket`),
  KEY `instrument_bucket` (`instrument`,`bucket`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY KEY (instrument)
PARTITIONS 10 */;

I’ve populated both instances with 120 million rows.
Actually kudos on the excellent IMPORT command.

The query I’ve used for above numbers looks like this:

select *
from data_bucket
where
	1 = 1
	and provider = 'A'
	and instrument = 'ES_1'
	and time >= 20061127000000.000
	and time <= 20081129000000.000
	and bucket = '1D'
limit 2000

Any idea if I’m doing something wrong or, if with this kind of data/query such is to be expected?

Many thanks,
Frankie

I was able to massively speed things up by dropping all the keys and sticking with just this revamped primary key:

PRIMARY KEY (`provider`,`instrument`,`bucket`,`time`)

Drops from 30+s to 50ms which is one order of magnitude more that MySQL but well in line with tolerable values.

With this new index it also changes distributed from true to false.
It just puzzles me why distributed, with all nodes on the same machine, has such a massive performance drop.

Any ideas?

Hey @frankielc

The best way to be able to answer that question would be to create two separate tables as tests. Having the original table, with all the additional keys set, import the 120 million rows of data. Go ahead and run an EXPLAIN (OPT,ENV) and EXPLAIN ANALYZE on the select query against that table. Then, perform the same steps on the 2nd table without the additional keys. Take the two outputs and compare the plans, which should show you significant differences.

Let me know if you have any questions regarding the suggested steps to get more information.

Cheers,
Ricardo

1 Like

Hi @rickrock, this is actually helpful. It will take me a while to benchmark and get more data (cockroachdb is a side project) but if I do find something more useful than the index I’ve posted above and have extra insights I’ll share them.

Many thanks,
Francisco