CRDB and RocksDB Column Family

Hi CRDB developers,

Since CRDB uses RocksDB as its lower-level storage engine, I am wondering whether different CRDB tables are managed by a single RocksDB column family or by different ones. How would this storage strategy affect the multi-tenancy experience?

Thanks
Aaron

Hi Aaron,

CockroachDB uses a single RocksDB column family for all data storage. Using a separate column family per table would have been problematic. In RocksDB, column families share the WAL (write-ahead log) but have separate memtables and sstables, which would lead to a much larger number of memtables and sstables in scenarios with relatively small table sizes. Using a single column family avoids those problems, but it does mean that data from different tables is intermingled within sstables, so we can’t easily ship sstables around to move table data and instead need to iterate over the table data at a logical level.
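
To make the trade-off concrete, here is a rough Python sketch of the idea. It is a toy model, not CockroachDB or RocksDB code: the class `SingleKeyspaceStore`, its methods, and the `(table_id, row_key)` key layout are all hypothetical names and simplifications chosen for illustration. It shows one sorted keyspace cut into fixed-size "files", where a single file can hold rows from more than one table, and where reading one table is a logical range scan rather than a matter of picking up whole files.

```python
import bisect


class SingleKeyspaceStore:
    """Toy stand-in for one column family: a single sorted run of keys.

    Keys are (table_id, row_key) tuples. This layout is an assumption made
    for the sketch, not the real on-disk key encoding.
    """

    def __init__(self):
        self._keys = []    # sorted list of (table_id, row_key)
        self._values = []  # values, parallel to self._keys

    def put(self, table_id, row_key, value):
        """Insert or overwrite a row while keeping the keys sorted."""
        key = (table_id, row_key)
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def scan_table(self, table_id):
        """Logical iteration: range-scan every key that belongs to table_id."""
        # (table_id,) sorts before every (table_id, row_key) tuple.
        start = bisect.bisect_left(self._keys, (table_id,))
        end = bisect.bisect_left(self._keys, (table_id + 1,))
        return [(k[1], v) for k, v in zip(self._keys[start:end],
                                          self._values[start:end])]

    def files(self, rows_per_file):
        """Cut the sorted run into fixed-size chunks, loosely like sstables."""
        return [self._keys[i:i + rows_per_file]
                for i in range(0, len(self._keys), rows_per_file)]


store = SingleKeyspaceStore()
store.put(1, "a", "alice")
store.put(1, "b", "bob")
store.put(1, "c", "carol")
store.put(2, "a", "order-1")

# The second "file" holds rows from table 1 and table 2, so it cannot be
# shipped as "table 2's data" without dragging table 1 rows along.
for chunk in store.files(rows_per_file=2):
    print(sorted({table_id for table_id, _ in chunk}))

# Moving table 2 therefore means scanning its key range logically.
print(store.scan_table(2))
```

With one column family per table, each table would instead get its own memtable and its own set of files, which is the overhead described above for clusters with many small tables.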

As for multi-tenancy, I imagine it only increases the number of tables you’d see in a cluster. Table data from different tenants could wind up in the same sstables, but is that problematic? Your Gmail user data might be in the same files as mine within Google’s systems, but we’ll never know, as the various layers above the files prevent us from accessing each other’s data.