Scale and K-V store API

Hello all,

I had two somewhat unrelated questions about the current state of crdb.

  1. What is the scalability limit in terms of
    a. Number of nodes in a single crdb cluster?
    b. Expected throughput, say from a single node, assuming just simple single row updates (or a batch of rows within the same range, i.e. no 2PC distributed transactions)?

  2. Is it possible to use crdb as a pure K-V store, without any SQL layer on top? Is the distributed K-V layer exposed in the API to clients? If not, is it easy to add that support?

Cheers!
Ashwin

Thanks for your questions, @ashwinmurthy.

The answer to #2 is that it’s not possible to access the kv layer directly at this time, and I don’t think it’s on our roadmap to expose it. However, you can get essentially the same thing by creating a SQL table with two columns, k and v, with k as the primary key.
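If it helps, here’s a minimal sketch of that approach in Go, going through the normal SQL layer with the lib/pq driver. The table name kv, the BYTES column types, and the connection string (an insecure local node on the default port) are illustrative assumptions; adapt them to your own cluster.

```go
// Minimal sketch: using CockroachDB as a K-V store through the SQL layer.
// The kv table, Put/Get queries, and connection string below are illustrative,
// not an official client API.
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // Postgres wire protocol driver; works with CockroachDB
)

func main() {
	// Assumes an insecure local node listening on the default port 26257.
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/test?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Two-column table with k as the primary key, as described above.
	if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS kv (k BYTES PRIMARY KEY, v BYTES)`); err != nil {
		log.Fatal(err)
	}

	// "Put": write or overwrite a key.
	if _, err := db.Exec(`UPSERT INTO kv (k, v) VALUES ($1, $2)`, []byte("hello"), []byte("world")); err != nil {
		log.Fatal(err)
	}

	// "Get": read the value back.
	var v []byte
	if err := db.QueryRow(`SELECT v FROM kv WHERE k = $1`, []byte("hello")).Scan(&v); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("kv[hello] = %s\n", v)
}
```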

@dianasaur323 should be able to help answer #1.

@ashwinmurthy In terms of scalability, we have spun up a 100 node cluster before. For context though, our 1.0 plan will support running a stable 64 node cluster.

Regarding your question on throughput, we are still doing work this quarter on running benchmarks against CockroachDB, and will figure out how to share these metrics soon. What are your throughput requirements? Based on your needs, I can give you a ballpark on whether or not it should work.

@dianasaur323, thanks for the info. We currently use Cassandra but are exploring crdb because we want the strongly consistent semantics it offers. Our Cassandra clusters are currently 100 nodes, but they could get bigger in the future. A single node has around 4x2TB of SSD.

Is the 64 node cluster limit a function of the kind of workload one has, i.e. lots of 2PC distributed transactions vs. purely single row updates (in a K-V sense)?

What is the main scalability bottleneck today in crdb?

@ashwinmurthy 64 nodes is not a limit on the cluster size, simply the size of clusters we’re committing to test and have running smoothly. Larger clusters should work, and as @dianasaur323 mentioned, we’ve tested a 100 node cluster before. It is possible there will be a few implementation hiccups as the cluster size increases, but I don’t foresee any significant problems until we get to cluster sizes closer to 1000 nodes.

Performance of small KV operations (values smaller than 100 bytes) is dominated by the CPU. We need to do more extensive testing and characterization here, but some recent testing showed 10k inserts/sec on a single 32 CPU machine.

The choice between distributed transactions and single row updates does not affect scalability. Single row updates will be faster than distributed transactions, but in either case, assuming your data is sufficiently sharded at the application level, CockroachDB will scale.

Performance and scalability are ongoing efforts. I’m not sure what I would characterize as the main bottleneck today. Perhaps the scalability of the application workload. Specifically, CockroachDB currently works best if the application workload has low contention and some natural partitioning of the data (e.g. by user or customer or account). Improving high contention scenarios is on our roadmap.