Hi, I’ve been evaluating the performance of CockroachDB for a project at my workplace. Our requirement is to store up to 10^12 records in a single table, queryable by N properties. The number of properties is only known at application runtime, so we needed a flexible way of storing these properties while still being able to quickly retrieve sets of records based on a subset of their values.
After trying various options we settled on using a JSONB-typed column to store the properties, so the table schema looks like this:
CREATE TABLE cell_events (
    ckm_partition INT8 NOT NULL,
    revision_partition INT8 NOT NULL,
    revision INT8 NOT NULL,
    value DECIMAL(18,8) NOT NULL,
    composite_key JSONB NOT NULL,
    event_sequence INT8 NOT NULL,
    CONSTRAINT cellevents_pk PRIMARY KEY (revision_partition DESC, revision DESC, event_sequence ASC)
);
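For context, reads filter on a subset of the JSONB properties. A sketch of the kind of index and query involved, assuming an inverted index on the JSONB column (the index name and the property keys/values here are made up for illustration, not our real ones):

-- Inverted index so lookups on composite_key properties avoid a full table scan
CREATE INVERTED INDEX composite_key_idx ON cell_events (composite_key);

-- Example read: fetch records whose composite_key contains the given properties
-- (property keys and values are illustrative only)
SELECT revision, event_sequence, value
FROM cell_events
WHERE composite_key @> '{"region": "eu-west", "sensor": "42"}';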
I’m running the CockroachDB cluster on GKE with 5 nodes. Each node is an n1-highcpu-32 with local SSDs, and the cluster is running CockroachDB 2.1.5.
Having run load tests for both read-biased and write-biased loads, I’ve found that reads are fine, with an average read time of ~6ms, which is great. Writes, however, are not as good. The rows will only ever be inserted one at a time, so I’m unable to do any batching. In the load test I performed up to 1 million individual row writes into a non-empty table, and the average write latency was ~48ms, which works out to only ~21 writes/sec. This just isn’t fast enough at all.
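For reference, each write in the load test is a single-row statement of this shape (the values shown are made up):

-- One event per statement; batching isn’t possible because events arrive individually
INSERT INTO cell_events (ckm_partition, revision_partition, revision, value, composite_key, event_sequence)
VALUES (3, 7, 1042, 12.50000000, '{"region": "eu-west", "sensor": "42"}', 900017);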
Another experiment I tried was to create a very simple table:
CREATE TABLE test (
    col1 INT8 NOT NULL,
    col2 INT8 NOT NULL,
    col3 INT8 NOT NULL,
    CONSTRAINT test_pk PRIMARY KEY (col1, col2, col3)
);
Just inserting a single row into this empty table takes ~13ms, which is curiously slow.
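The statement being timed is nothing more complicated than a plain single-row insert, e.g.:

-- Single row into the empty test table; still ~13ms
INSERT INTO test (col1, col2, col3) VALUES (1, 2, 3);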
So the question is: am I doing something wrong, and is there some configuration that would explain the slow insert times?
Thanks