Why does the inserts/sec graph look like a sawtooth?

Hi,

I'm trying a simple task: inserting as many records as possible into a simple table:
(id INT PRIMARY KEY, value BYTES)
using an increasing number of threads (up to 64; the data below is with 64), with:
INSERT INTO kv (id, value) VALUES (%s, %s)
where id and value are simple counters (counting from 0).
I commit on every 1000th insert, and I see the following performance:
1000 4065.76878561
2000 3838.30153283
3000 3262.86654023
4000 2392.80863083
5000 2422.46199944
6000 1886.1197226
7000 1631.53480744
8000 1587.28928502
9000 1491.61781504
10000 1312.82708748
11000 1232.98456177
12000 1078.27270867
13000 985.64502901
14000 868.198800217
15000 952.938172109
16000 832.320095759
17000 718.000371813
18000 529.787860224
19000 606.342513297
20000 568.895524769
21000 501.776007586
22000 500.101527321
23000 509.101193218
24000 458.602230543
25000 422.490668911
26000 412.301217193
27000 386.184793801
28000 421.447489831
29000 412.32865737
30000 377.77449128
31000 351.885812689
32000 347.137375328
33000 315.063148531
34000 318.270844641
35000 313.281836042
36000 316.328263699
37000 276.20326402
38000 270.628525651
39000 249.896241592
40000 252.490377587
41000 242.895193514
42000 229.714802414
43000 207.375004184
44000 211.859108246
45000 213.223353169
46000 202.752859423
47000 182.487483502
48000 180.49548142
49000 180.925166294
50000 182.450063244
51000 172.776208208
52000 165.743093274
53000 169.5592672
54000 160.755368223
55000 133.693661681
56000 138.23220807
57000 126.658466976
58000 141.88518525
59000 138.411238627
60000 131.966983587
61000 124.257670518
62000 119.894546183
63000 111.391109047
64000 102.613092521
65000 625.01288977
66000 3409.89513341
67000 3297.89363688
68000 1977.28875563
69000 2155.6642732
70000 1912.72251345
71000 1669.35479377
72000 1537.13915672
73000 1390.98689901
74000 1254.41218121
75000 1071.88494198
76000 1047.7666003
77000 965.505366522
78000 927.850372518
79000 868.638419947
80000 793.45126663
81000 741.552107763
82000 568.907176514
83000 624.20142535
84000 591.323707796
85000 587.01914
86000 524.862724684
87000 447.453328048
88000 476.614893193
89000 400.352788903
90000 413.851801866
91000 370.333345606
92000 359.39510151
93000 359.547849487
94000 340.649677596
95000 300.587981724
96000 318.06371627
97000 292.999769054
98000 261.682424913
99000 274.774256291
100000 241.019901146
101000 246.347953657
102000 245.960404887
103000 224.24743146
104000 209.60770269
105000 194.371017369
106000 200.516093119
107000 199.447370158
108000 190.195864415
109000 168.014901209
110000 165.200920017
111000 173.348776265
112000 151.186390116
113000 148.136050947
114000 151.362208051
115000 152.419642713
116000 148.424130599
117000 148.134099464
118000 136.959190817
119000 138.444964457
120000 123.875916398
121000 130.37369405
122000 129.210281731
123000 119.89950554
124000 115.495100061
125000 112.844988321
126000 117.432626164
127000 111.111196471
128000 116.256400568
[…]
(number of inserted rows, rows/second)
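The loop described above is roughly the following (a sketch, assuming Python with psycopg2, which the %s placeholders suggest; the helper name, connection parameters, and the 128000 total are mine, not from the post):

```python
import time

def should_commit(n_inserted, batch=1000):
    """True at each batch boundary, i.e. on every batch-th insert."""
    return n_inserted % batch == 0

if __name__ == "__main__":
    import psycopg2  # assumed driver; connection details are placeholders
    conn = psycopg2.connect(host="localhost", port=26257,
                            user="root", dbname="test")
    cur = conn.cursor()
    start = time.time()
    for i in range(128000):
        # id and value are both derived from the loop counter
        cur.execute("INSERT INTO kv (id, value) VALUES (%s, %s)",
                    (i, str(i).encode()))
        if should_commit(i + 1):
            conn.commit()
            now = time.time()
            # print: rows inserted so far, rows/sec over the last chunk
            print(i + 1, 1000 / (now - start))
            start = now
```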

Running 1.0.4 on Linux. Graphed, this looks like a sawtooth.
What's happening here? Disk I/O is quite low, and there is only one range and only one node.

@bra This is unexpected and not in line with similar workloads we run. Can you share your code? Alternatively, can you try to reproduce with the kv load generator? Running kv --sequential --batch 1000 --concurrency 1 --min-block-bytes 64 --max-block-bytes 65 gives me output like:

_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
     21s        0        59108.9        58135.4     16.3     22.0     24.1     25.2
     22s        0        58112.0        58134.4     16.8     25.2     26.2     28.3
     23s        0        55032.9        57999.7     17.8     27.3     28.3     28.3
     24s        0        48964.8        57623.0     17.8     23.1    176.2    176.2
     25s        0        55808.7        57550.1     15.2     26.2     27.3    167.8
     26s        0        55214.0        57460.6     15.7     23.1     28.3     83.9
     27s        0        55010.2        57369.9     17.8     26.2     29.4     33.6

This is running on a single node on my laptop on current master, but I’d be very surprised if there is any significant difference with this workload on 1.0.4.

One difference from your description is that kv --batch 1000 performs a single INSERT with 1000 rows. That likely explains the order of magnitude difference, but doesn’t explain the sawtooth performance you’re seeing and I’m not.

There are two major differences:

  • I do single INSERTs, not a single INSERT with 1000 rows
  • therefore I collect up to 1000 rows in one transaction, then commit them, and I do this in 64 threads just to see how it scales.

So that’s why I see the sawtooth: as the transaction grows, the speed degrades badly.
Is this expected?
I’ve tried the same with MySQL (I know, it’s a very different beast, but at least cockroach here also runs on a single node), and it delivers constant insert rates with transactions of this size.

BTW, I’ve tried it without threads: one thread, collecting up to 10000 rows into a transaction, then committing it:
1000 1661.6040408
2000 1430.84820224
3000 1096.63693621
4000 800.927129125
5000 834.052653264
6000 711.35545623
7000 632.523814816
8000 556.484303676
9000 485.300925511
10000 440.742827966
COMMIT
11000 1310.63480874
12000 1358.82142934
13000 1150.37592954
14000 945.2054918
15000 819.876398028
16000 709.450381568
17000 548.43683947
18000 560.560091789
19000 508.662214679
20000 444.688907841

So it seems the problem here is that transactions shouldn’t be too big. #define big :slight_smile:
According to the docs, the default limit is 100k entries; the above is a tenth of that.
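One way to act on that conclusion is to keep each cockroach transaction bounded; a minimal sketch (the chunks helper and the 1000-row size are my choices, not from the thread):

```python
def chunks(rows, size=1000):
    """Yield successive bounded slices of rows, one slice per transaction."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

# usage sketch, with cur/conn being an open psycopg2 cursor/connection:
# for batch in chunks(all_rows, 1000):
#     for id_, value in batch:
#         cur.execute("INSERT INTO kv (id, value) VALUES (%s, %s)",
#                     (id_, value))
#     conn.commit()
```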

Trying a 100k-row transaction makes cockroach eat up a whole CPU core, and insert performance quickly degrades:
1000 1393.76718199
2000 1349.41836691
3000 1159.11948677
4000 981.353299017
5000 804.241269466
6000 710.3443125
7000 579.778370416
8000 512.436559418
9000 499.71340642
10000 461.740819466
11000 424.658713144
12000 387.006064409
13000 361.192078256
14000 311.995823695
15000 313.135679456
16000 299.323238817
17000 277.441995575
18000 273.413274954
19000 256.724843416
20000 246.530818086
21000 231.399151708
22000 227.696865475
23000 216.46019715
24000 210.818444543
25000 199.831856111
26000 193.051532278
27000 174.483979401
28000 175.775750203
29000 183.14524754
30000 178.25785761
31000 162.13272969
32000 157.884670408
33000 149.328575672
34000 150.152631581
35000 144.296374969
36000 141.503663326
37000 138.346300352
38000 133.885225649
39000 129.480726382
40000 124.958078471
41000 128.003160234
42000 123.129479083
43000 122.699988094
44000 119.972706328
45000 114.478208398
46000 111.245904404
47000 109.607045931
48000 109.032217361
49000 105.681245279
50000 104.096919568
51000 102.402170046
52000 100.844420647
53000 98.6720722685
54000 96.8050933791
55000 95.8138460481
56000 95.7724964797
57000 94.0055394252
58000 91.7771108677
59000 90.2048950093
60000 89.0427542592
61000 88.9357297846
62000 85.6784894208
63000 86.8299091416
64000 84.0049537299
65000 84.8844982593
66000 84.0060036105
67000 83.3066125323
68000 81.7231955413
69000 80.7481410262
70000 79.0686903668
71000 79.400047205
72000 79.2252026986
73000 78.6149742762
74000 77.0556116743
75000 75.1300296572
76000 73.4404778041
77000 73.2484340615
78000 72.8686745228
79000 71.3043649151
80000 69.8773730991
81000 70.7651426725
82000 69.4116665604
83000 70.5014572943
84000 68.3038558038
85000 67.5778553977
86000 66.5433044701
87000 66.2935351908
88000 63.8010393948
89000 61.8415272529
90000 62.564332064
91000 62.9933284327
92000 64.2180863301
93000 63.4999014566
94000 62.5018394162
95000 61.6610101617
96000 62.9094621317
97000 61.4246765227
98000 60.5763518888
99000 59.3918929032
100000 57.4923550922
COMMIT

Exactly the same workload looks like this with MySQL:
1000 13772.5881658
2000 14763.8414181
3000 14673.9669667
4000 15115.8073793
5000 16629.8093697
6000 15494.4605962
7000 15629.6263173
8000 15576.1109337
9000 14953.4885379
10000 15484.9075551
11000 14134.4665249
12000 14931.6625133
13000 14643.2289576
14000 14080.7519933
15000 16256.7402056
16000 15719.0710155
17000 13259.4348868
18000 13232.578683
19000 15839.5764334
20000 16908.1527348
21000 12999.184898
22000 17858.7413778
23000 14532.5729175
24000 15019.3511423
25000 14225.9365408
26000 14950.610243
27000 14622.4005634
28000 14470.302495
29000 15923.3426725
30000 16500.0157356
31000 15497.3803418
32000 15217.2086391
33000 15461.6748565
34000 14276.3432882
35000 15852.1480485
36000 16868.5762087
37000 14963.0908637
38000 15664.4158948
39000 16136.344925
40000 14925.8175866
41000 15787.8246075
42000 14582.1376546
43000 14779.8129577
44000 15124.0376884
45000 16032.352885
46000 14826.887251
47000 15090.2470966
48000 14512.3089645
49000 15779.5685575
50000 15548.9716328
51000 15335.2711265
52000 12265.1226716
53000 15165.5433745
54000 15199.6173192
55000 15704.2395378
56000 14965.8137651
57000 14733.6059239
58000 15782.3592051
59000 15098.4495873
60000 16046.9513385
61000 15762.9046139
62000 15213.7864493
63000 15763.2008178
64000 14349.0178101
65000 14387.2122938
66000 13782.0918082
67000 12492.3425168
68000 15885.1083169
69000 15605.0286667
70000 15816.2819725
71000 15853.9456227
72000 15605.9576653
73000 15479.1356817
74000 15537.1063218
75000 15591.3387729
76000 16665.2256834
77000 14989.7752411
78000 15108.6744306
79000 13680.0521853
80000 15845.7998829
81000 15708.1800954
82000 13719.5192956
83000 15865.6998143
84000 15772.6259105
85000 15491.1414378
86000 15170.8093405
87000 16078.6932504
88000 15572.6988865
89000 14918.9158426
90000 16058.3172532
91000 15788.3000388
92000 15848.8535542
93000 15935.4422012
94000 15295.1747477
95000 15516.1272423
96000 16029.7181817
97000 15511.0777458
98000 15053.2568164
99000 16345.7534909
100000 15358.631372
COMMIT

So you’re running one INSERT statement per row? If so, grouping rows into fewer INSERTs will help. crdb currently executes separate insert statements sequentially (we don’t even “pipeline” multiple inserts in any way).
You can append RETURNING NOTHING to each insert statement (if you don’t care about its result, which by default is the number of rows inserted), and then we’ll process them in parallel if they operate on different tables (which is not your case).
We have some work to do here to “fuse” together such batches of inserts and speed them up.
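The grouping suggestion above can be sketched like this (assuming psycopg2; the helper name is mine, and RETURNING NOTHING is the suffix mentioned in the reply):

```python
def build_batch_insert(rows):
    """Build one multi-row INSERT covering all rows, plus its flat parameter list."""
    placeholders = ", ".join(["(%s, %s)"] * len(rows))
    sql = ("INSERT INTO kv (id, value) VALUES "
           + placeholders + " RETURNING NOTHING")
    # flatten [(id, value), ...] into [id, value, id, value, ...]
    params = [p for row in rows for p in row]
    return sql, params

# usage sketch, with cur/conn being an open psycopg2 cursor/connection:
# sql, params = build_batch_insert([(i, str(i).encode()) for i in range(1000)])
# cur.execute(sql, params)
# conn.commit()
```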