Qualifying Filesystems

Hi,

I was wondering whether any tests have been done to check if CRDB would benefit from running on XFS, for example, rather than EXT4, assuming SSDs are used.

Thanks

Hi Christian,

this is a brilliant question, as indeed the filesystem behavior will likely impact the fundamentals of disk I/O performance.

Unfortunately the answer is no (not yet): we haven’t started running those tests.

I do not personally know about XFS, but before you look at XFS to improve performance over ext4, I’d expect you could first check that the ext4 block size is properly aligned with the SSD page size, as this is the single factor that makes the largest difference in I/O latency.
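As a rough illustration of that alignment check, here is a minimal Python sketch for Linux. It makes several assumptions that are not from this thread: it treats the kernel-reported `physical_block_size` as a stand-in for the SSD page size (the real NAND page size is often not exposed by the drive), and the function names are hypothetical.

```python
import os

SECTOR_BYTES = 512  # sysfs reports partition start offsets in 512-byte sectors


def fs_block_size(path):
    """Return the block size (bytes) of the filesystem that holds `path`."""
    return os.statvfs(path).f_bsize


def read_sysfs_int(sysfs_path):
    """Read a single integer value from a sysfs attribute file."""
    with open(sysfs_path) as f:
        return int(f.read().strip())


def is_aligned(fs_block_bytes, partition_start_bytes, page_bytes):
    """True if each FS block covers whole pages and the partition
    itself starts on a page boundary."""
    return (fs_block_bytes % page_bytes == 0
            and partition_start_bytes % page_bytes == 0)


def check_alignment(data_dir, device, partition):
    """Combine the checks for one data directory.

    `device` and `partition` are kernel names such as "sda" and "sda1";
    the caller has to know which partition backs `data_dir`.
    """
    page = read_sysfs_int(f"/sys/block/{device}/queue/physical_block_size")
    start_sectors = read_sysfs_int(f"/sys/block/{device}/{partition}/start")
    return is_aligned(fs_block_size(data_dir), start_sectors * SECTOR_BYTES, page)
```

Calling `check_alignment("/var/lib/cockroach", "sda", "sda1")` (path and device names are placeholders) returns `False` if either the block size or the partition offset is off. A `True` result only says the layout is consistent with what the kernel reports, not that it matches the drive's internal geometry.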

Beyond that, there are also other tuning aspects that are common across filesystems (including both xfs and ext4), in particular journaling options and placing the metadata/journals on separate disks.

In general I share your idea that db performance tuning via FS tuning is very rich in opportunities, but I’d like to point out that historically this type of work has been carried out orthogonally, and usually by different teams / researchers than the db tech provider.

We (at Cockroach Labs) will be glad to assist in this research if someone starts to carry it out before we do, but for now our own efforts are focused on improving performance across the board in a fs-agnostic manner.

Does this help, and do you have plans yourself to start comparing filesystems? We’d be curious to learn from your findings.

Thanks for the reply Raphael,

I just happened to read an article on how ScyllaDB leverages XFS, and it made me wonder whether CRDB would benefit from using it as well, and to what extent.

Given that the storage layer is mostly built on RocksDB, I would think RocksDB is the main actor when it comes to disk I/O. Interestingly enough, Facebook has a benchmark which happens to use XFS:

https://github.com/facebook/rocksdb/wiki/Performance-Benchmarks

There is no comparison to EXT4 however.
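In the absence of such a comparison, a very crude first data point could come from timing small synchronous writes on each filesystem. The Python sketch below is only that: it is not the RocksDB benchmark, it does not model CRDB's actual I/O pattern, and the function names and default sizes are my own assumptions.

```python
import os
import statistics
import tempfile
import time


def fsync_write_latencies(directory, writes=200, size=4096):
    """Append `size` bytes and fsync, `writes` times, to a temp file in
    `directory`; return the per-write latencies in seconds."""
    payload = b"\0" * size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage before timing stops
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies):
    """Return median and (approximate) 99th-percentile latency in milliseconds."""
    ordered = sorted(latencies)
    p99_index = int(0.99 * (len(ordered) - 1))
    return {
        "median_ms": statistics.median(ordered) * 1000,
        "p99_ms": ordered[p99_index] * 1000,
    }
```

Running `summarize(fsync_write_latencies("/mnt/xfs"))` and the same for `"/mnt/ext4"` (both mount points are placeholders) on otherwise identical volumes would give two sets of numbers to compare. Results will depend heavily on mount options, the device, and caching, so they should be read as a hint rather than a verdict.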

There is another angle to this that is worth exploring, since CRDB does not provide encryption at rest at the moment. @knz, is this on the roadmap somewhere?

In my use case I am looking at running CRDB on DigitalOcean droplets with the actual data being stored on encrypted XFS volumes via DigitalOcean Volumes (block storage).

I guess https://github.com/cockroachdb/cockroach/issues/19783 answers my encryption-at-rest roadmap question.

Charl,

as far as CockroachDB v1.x and 2.0 are concerned, our position for now is that any encryption requirement should be addressed by looking at OS-backed solutions.

I am not knowledgeable about FS-level encryption myself, but I have obtained good results (with minimal performance overhead) using block-level encryption (under the filesystem), for example using encrypted LVM volumes (on Linux) or GEOM (on FreeBSD).