We are running CockroachDB 2.12 on a cluster whose nodes each have both an SSD (300GB) and an HDD (1TB).
We configured zone settings so that our stg database lives on the SSD for daily use (the OS is also on the SSD), and created a test database on the HDD for temporary tests.
When the issue happened, we were running our daily import test: the previous day’s tables in the test database are dropped, then new data is imported into the test database (on the HDD) using CockroachDB’s IMPORT feature.
However, just after the IMPORT finished, SSD usage started to climb rapidly and eventually crashed the node because there was no free space left.
The issue happened just before a long vacation, so the only metrics that remain are the ones in our Grafana.
The blue line shows the usage of the SSD with stg database on it, the orange line shows the usage of the HDD with the test database on it.
The table in the test db was dropped around 15:00, then the IMPORT ran from 15:00 to 15:50.
However, SSD usage started behaving strangely from 16:00: it would consume around 20% (60GB) in 20 minutes and then release the space.
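To give a sense of scale, here is a back-of-the-envelope estimate of the write rate implied by those numbers. It is only a sketch: it assumes the growth was roughly linear over the 20 minutes and uses decimal units (1 GB = 1000 MB), neither of which we have verified against the raw metrics.

```python
# Rough write rate implied by the Grafana readings (assumes linear growth).
SSD_CAPACITY_GB = 300   # SSD size from the cluster description
usage_percent = 20      # approximate share of the SSD consumed per spike
window_minutes = 20     # approximate duration of each spike

consumed_gb = SSD_CAPACITY_GB * usage_percent / 100
rate_gb_per_min = consumed_gb / window_minutes
rate_mb_per_s = consumed_gb * 1000 / (window_minutes * 60)  # decimal MB, an assumption

print(f"consumed: {consumed_gb:.0f} GB")
print(f"rate: {rate_gb_per_min:.0f} GB/min (~{rate_mb_per_s:.0f} MB/s)")
```

So each spike corresponds to a sustained rate of about 3 GB per minute being written to the SSD.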
As far as we know, the only activity on the cluster was that IMPORT, and possibly the GC of the data deleted in the previous day’s test (given the default TTL of 25h, I think this is worth mentioning). Still, we don’t understand why SSD space was being consumed like this.
Could anyone please explain what could be causing this behavior? Thanks.
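To make the GC suspicion concrete, here is a small timing sketch in Python. It assumes the previous day’s tables were also dropped around 15:00 (the test runs daily, but that exact time is an assumption) and uses a placeholder date. Only the 25h TTL comes from the default setting.

```python
from datetime import datetime, timedelta

GC_TTL = timedelta(hours=25)  # default TTL mentioned above

# Hypothetical timestamp: placeholder date, and we ASSUME the previous
# day's drop also happened around 15:00.
previous_drop = datetime(2019, 1, 1, 15, 0)

# Earliest time the data dropped the day before becomes eligible for GC.
gc_eligible = previous_drop + GC_TTL
print(gc_eligible.strftime("%Y-%m-%d %H:%M"))
```

If that assumption holds, the previous day’s data would become eligible for GC at about 16:00 on the day of the incident, which is when the SSD usage started to move. That could be a coincidence, and it would not by itself explain why the SSD rather than the HDD was affected.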