Bool array storage size

Hi,

How much storage space do bool arrays take? If I have a bool array of size 30, does it take 5 bytes + some metadata overhead, or is it drastically different?

Thank you!

At this time CockroachDB does not “compress” bool arrays into bitmaps.
The smallest unit of storage for an array element is thus always 1 byte.

So bool array values are encoded as follows:

  • one byte indicating the number of dimensions and some flags (in particular a flag indicating whether the array contains NULLs)
  • one byte indicating the element type BOOL (for other array element types this could be more than 1 byte)
  • one or more bytes (variable-length encoding) indicating the size of the array. With 30 elements, this would fit in 1 byte.
  • if the array contains NULLs, then one byte or more containing a bitmap of which array elements are NULL
  • one byte per boolean value in the array, so with 30 elements that would be 30 bytes.

(Total for a 30-element bool array: 33 bytes without NULLs, or 37 bytes with NULLs, since the NULL bitmap for 30 elements takes 4 bytes.)
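
The byte counts in the list above can be added up with a short script. This is only a sketch of the layout as described in this thread, not code taken from CockroachDB; the function names are made up for illustration, and the 7-bits-per-byte variable-length encoding of the array length is an assumption.

```python
def varint_len(n: int) -> int:
    """Bytes needed to store n, ASSUMING a varint with 7 payload bits per byte."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length


def bool_array_size(num_elements: int, has_nulls: bool) -> int:
    """Encoded size of a bool array, following the layout listed in this thread."""
    header = 1                        # dimensions + flags
    elem_type = 1                     # element type tag (1 byte for BOOL)
    length = varint_len(num_elements) # array length, variable-length encoded
    # NULL bitmap: one bit per element, rounded up to whole bytes.
    null_bitmap = (num_elements + 7) // 8 if has_nulls else 0
    payload = num_elements            # one byte per boolean value
    return header + elem_type + length + null_bitmap + payload


def packed_bitmap_size(num_elements: int) -> int:
    """Size of the same flags if packed one bit per element (for comparison)."""
    return (num_elements + 7) // 8


print(bool_array_size(30, has_nulls=False))  # 33
print(packed_bitmap_size(30))                # 4
```

The comparison shows the gap the original question was asking about: roughly one byte per element plus a few bytes of header, versus one bit per element for a hand-packed bitmap.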

Understood, thank you!

Another question - are there plans to implement compression of bool arrays?

🙂

Unfortunately, not at this time.
Feel free to file an issue on GitHub explaining why you think this would be a good idea, e.g. by describing an application where it would be a clear benefit.
Meanwhile, note that you can store your own custom bitmaps in a BYTES column.
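
As a rough illustration of the custom-bitmap approach, the application could pack the flags into a byte string before writing it to the BYTES column and unpack it after reading. The helper names and the bit order (least significant bit first) are assumptions chosen for this sketch; any consistent convention works.

```python
def pack_bools(flags):
    """Pack a sequence of booleans into bytes, 8 flags per byte, LSB first."""
    out = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def unpack_bools(data, count):
    """Recover the first `count` flags from a byte string made by pack_bools."""
    return [bool((data[i // 8] >> (i % 8)) & 1) for i in range(count)]


flags = [True, False, True] + [False] * 26 + [True]  # 30 flags
packed = pack_bools(flags)
print(len(packed))                        # 4
print(unpack_bools(packed, 30) == flags)  # True
```

Because the bitmap is padded to a whole number of bytes, the element count is not recoverable from the bytes alone. It has to be fixed by convention or stored separately, for example in another column.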

Hi Raphael,

Understood. BYTES will work for me, at the cost of a bit more processing at the application level. I was planning to use bool arrays for archival records. These archival records would be monthly, would accumulate indefinitely, and would contain bitmaps that indicate whether there was any data from a given set of devices on a given day. I’m grouping my archive records into monthly ones to save space. Eventually I might introduce yearly archives too, since they would save further space over monthly records.

I’m concerned about space usage because these archives can potentially live on the devices themselves (as well as on a centralized server that would run crdb).

Thanks 🙂
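
One possible shape for such a monthly record, sketched under assumptions the thread does not confirm: one bitmap per device per month, with bit (day - 1) set when the device reported data on that day, stored as 4 little-endian bytes. The function names are hypothetical.

```python
import calendar


def month_bitmap(year, month, days_with_data):
    """Pack one device's month into 4 bytes; bit (day - 1) marks a day with data."""
    days_in_month = calendar.monthrange(year, month)[1]
    bits = 0
    for day in days_with_data:
        if not 1 <= day <= days_in_month:
            raise ValueError(f"day {day} is not valid for {year}-{month:02d}")
        bits |= 1 << (day - 1)
    # 31 bits at most, so a month always fits in 4 bytes.
    return bits.to_bytes(4, "little")


def had_data(bitmap, day):
    """Return True if the given day of the month is marked in the bitmap."""
    return bool((int.from_bytes(bitmap, "little") >> (day - 1)) & 1)


record = month_bitmap(2024, 2, [1, 29])
print(len(record))           # 4
print(had_data(record, 29))  # True
print(had_data(record, 2))   # False
```

With this layout, twelve monthly records hold 48 bytes of bitmap per device per year, while a single yearly bitmap of 366 bits would need 46 bytes, so most of the savings from yearly archives would come from fewer rows and less per-row overhead rather than from the bitmap itself.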