Time series data ingest perf


(ZILIANG CHEN) #1

Hi,

I was trying to evaluate the ingest speed of CockroachDB for time series data. Here is my test table schema, which has no primary key or partition key.

CREATE TABLE cpuutils (region STRING, instanceid STRING, os STRING, util FLOAT, timestamp INT)

I set up a 3-node CockroachDB cluster in AWS; the nodes are all m4.2xlarge instances (8 vCPUs, 32 GB memory). I wrote a Go data gen tool that commits metric data to CockroachDB concurrently (10 goroutines) with a batch size of 1000 (1,000 events per bulk-load transaction; a sketch of one batch follows the run results below). Here are the results of 2 runs:

  1. Run 3 instances of the client, each speaking to a different CockroachDB node. What I get is around 35K events/second in total.

./db-metrics-gen --db-type=postgres --db-user=root --db-host=dbhost1 --db-port=26257 --db-name=metrics gen --total-events=10000000
./db-metrics-gen --db-type=postgres --db-user=root --db-host=dbhost2 --db-port=26257 --db-name=metrics gen --total-events=10000000
./db-metrics-gen --db-type=postgres --db-user=root --db-host=dbhost3 --db-port=26257 --db-name=metrics gen --total-events=10000000

  2. Run 1 instance of the client. I get around 35K events/second as well.
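
For reference, each 1,000-event batch is roughly equivalent to a single multi-row INSERT committed as one transaction. A minimal sketch of the shape of one batch (with made-up values) looks like:

INSERT INTO cpuutils (region, instanceid, os, util, timestamp)
VALUES
  ('us-east-1', 'i-0a1b2c3d', 'linux', 42.5, 1520000000),
  ('us-east-1', 'i-0a1b2c3e', 'linux', 17.3, 1520000000),
  -- ... 998 more rows ...
  ('us-west-2', 'i-0f9e8d7c', 'linux', 63.1, 1520000001);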

Questions:

  1. So it appears to me the cluster is not scaling out horizontally, since 3 clients give the same ingest performance as 1 in the testing above?
  2. Is 35K events/second the expected performance for a 3-node cluster like the one above?

The data gen tool code is located here, in case somebody would like to use it for experiments.

Thanks very much!


(Raphael 'kena' Poss) #2
  1. As long as there is less than 64 MB of data, a single range will be used. So all inserts go to the leaseholder for that range, and the TPS capacity of 1 node limits the performance.

    You should see multi-node scalability when you start inserting data into multiple ranges simultaneously.

    You can force multiple ranges by a) inserting into multiple tables, b) inserting sufficient data upfront so that your benchmark runs after the table already contains 3+ ranges, or c) using the ALTER … SPLIT AT statement to split ranges upfront (see the sketch after this list).

  2. You will get better throughput overall by connecting more clients per node. I am not sure how many (it will depend on your workload and machine config), so you need to test it out. I wouldn't be surprised if 10-20 clients are better than 1, but you need to do your own testing.
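
To illustrate option (c): assuming the table were given an explicit primary key starting with region (the original schema relies on the hidden rowid key, in which case the split points would have to be rowid values instead), the upfront splits might look something like:

ALTER TABLE cpuutils SPLIT AT VALUES ('ap-southeast-1'), ('eu-west-1'), ('us-east-1');
-- Each split point starts a new range, so the ranges' leaseholders can be
-- rebalanced across the 3 nodes instead of all writes going through one node.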


(ZILIANG CHEN) #3

Thanks for the reply, @knz

  1. I believe there was way more than 64 MB of data.
  2. There are 30 goroutines doing the data load (bulk transactions) across the 3 data gen instances.

One observation about the CockroachDB CPU/memory usage on the different nodes: they were quite different (one was around 250% CPU, another around 120%, and the other around 80%). It looks like the load was not evenly distributed? But how come, given that the 3 data gen instances were each speaking to a different node?
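
To check how the table's ranges and leaseholders are actually distributed, would something like the following be the right way to look (I believe older CockroachDB versions call it SHOW EXPERIMENTAL_RANGES)?

SHOW RANGES FROM TABLE cpuutils;
-- Lists each range with its start/end keys, replicas, and current leaseholder,
-- which should show whether the writes can really spread across the 3 nodes.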

Thanks again!