Need clarification of byte numbers in node stats

I need help understanding the byte stats of a node in the web UI.

I am running a 2-node CockroachDB cluster in Kubernetes with the official Docker image, and one of the nodes shows stats like this (the screenshot shows Live Bytes at about 2.2 GiB):

Why does Live Bytes show such a large number? The strange thing is that Live Bytes differs a lot from the size of CockroachDB’s data directory. I tried to confirm this by running du -sh in /cockroach/cockroach-data, and it shows this:

root@cockroachdb-0:/cockroach/cockroach-data# du -sh
310M	.

I wonder where the 2.2 GiB comes from. Thanks.


Hi @hadi,

It’s hard to say precisely without being able to investigate and without knowing more about the cluster.

  1. Have you inserted any data into it? If so, how much?
  2. How long has the cluster been running?

If the cluster has been running for quite a while, the 2.2GiB could be due to the cluster writing its timeseries data to disk.
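
If you want to check whether timeseries data is what’s growing, one thing you could try (just a sketch; it assumes your version already has this cluster setting) is turning timeseries storage off and watching whether the number levels out:

    -- Assumption: the timeseries.storage.enabled cluster setting exists in
    -- your version. Setting it to false stops new timeseries writes, so the
    -- Admin UI graphs stop being populated from that point on.
    SET CLUSTER SETTING timeseries.storage.enabled = false;

As far as I know this only stops new writes; it doesn’t immediately reclaim the timeseries data that’s already on disk.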

The difference between 2.2GiB and 310MiB is more than I’d expect, but there is some compression done on data written to disk which could make the “Live Bytes” larger than the actual space consumed on disk.


Thanks for your reply!

I have inserted test data consisting of only 1 table with 1 entry, with a total database size of 1.2 KiB according to the web UI. The cluster has been running for about 2 weeks, so maybe the database is mostly filled with timeseries data and it is highly compressible?

Thanks for the clarification!

Hi,
I have the same question about this.
I’m using CockroachDB v1.0.6 on Windows 10 with the binary release, and I have 3 nodes running on one computer.
I’ve been running the nodes for about 2 days.

Live Bytes for each node is 1.1 GiB in the Admin UI, and the data directory size of each node on the filesystem is about 100 MiB, though not identical (99, 111, and 122 MiB). I dumped my database to get the approximate size of what I store, ignoring replicas and timeseries data, and its size is 332 KiB.

Also, my current value of ‘Replicas per Node’ is 33. The default maximum range size is 64 MiB, so the current maximum size with the same ‘Replicas per Node’ value is 2112 MiB per node. In that case, is the average range size currently about 33 MiB (1.1 GiB / 33 replicas per node)? And the default replication factor is 3 replicas per range, so there are about 11 ranges now.
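
Laying that arithmetic out (just restating the numbers above, nothing newly measured):

    33 replicas * 64 MiB (default maximum range size) = 2112 MiB upper bound per node
    1.1 GiB / 33 replicas ≈ 33–34 MiB average per replica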

I assume that the current size of each node is 1.1 GiB, and that it is compressed down to about 100 MiB. Am I correct, even though that is quite a big difference? I don’t think I’ve quite got the point. Can you explain this a bit more?

Thanks 🙂

  • version
    Build Tag: v1.0.6
    Build Time: 2017/09/14 15:19:52
    Distribution: OSS
    Platform: windows amd64
    Go Version: go1.8.3
    C Compiler: gcc 6.3.0
    Build SHA-1: 8edb4c9235a7bb8aa80761375db12bb4a7e5afeb
    Build Type: release

Hi @jykeith ,

The numbers on the Nodes page (/#/cluster/nodes) are supposed to be the actual disk usage for each node. At least, that’s the case as of v1.1; I’m not sure if it was different or broken in v1.0. Are you sure that the actual filesystem space taken up by each store is different from what is shown on the Nodes page?

The relationship between the “Live Bytes” graph and the actual storage on disk is a bit tougher to track. It can be less than or greater than the actual storage being used on disk depending on your workload. The actual disk usage could be less than the live bytes if the data is highly compressible, or it could be greater than the live bytes due to what’s known as space amplification in the underlying storage engine.
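
If it helps, here is a rough way to put the two figures side by side on a test cluster (just a sketch: the path is a placeholder for your store directory, the --insecure flag assumes a cluster without certificates, and the exact columns that node status prints have varied between versions, with the per-node stats sometimes behind a --stats or --all flag):

    # Physical size of one store directory on disk:
    du -sh /path/to/cockroach-data

    # Logical MVCC statistics as tracked by the cluster; the live-bytes
    # figure reported here is what the "Live Bytes" graph is based on:
    cockroach node status --insecure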

Sorry for the late reply.
I upgraded to v1.1.2 and the problem shown above has disappeared. 🙂 BYTES now shows the real size of the data directory (although ‘size on disk’ is a little larger than the directory’s ‘size’), and all the directory sizes are balanced.
Thank you!
