Good use case for CockroachDB?

I’m currently trying to scale up a mysql based application and am considering Galera which seems to offer pretty nice redundancy but not real horizon scaling in terms of distributing data. My initial idea had been to manually shard mysql (at the application level) but this will required complex map-reduce operations to distribute queries over all shards. So I’m interested in seeing if cockroachdb is a good alternative.

The application is relational but mostly in a hierarchical way – there are “documents” and “objects” and each of these is represented as a series of hierarchical tables (e.g documents have tables for documents, paragraphs, sentences, etc…). In some sense I could have used a document store to squash this hierarchy but there are also inter-doc and object-doc joins (which would be hard with nosql).

I mention this because while the application does a lot of joining its mostly local. There are some document-object joins and some object-object joins which are non-local but in general most joins are local. By local here I mean that if I were to manually shard most joins would be in the same shard (e.g. specified by object or document id).

So my questions are:

  • Is cockroachdb likely to be able to handle this case which is relatively join heavy but where joins can be forced to be mostly local by appropriate sharding?

  • Following up on that can i encourage or show cockroachdb how I want it to shard? We have app-generated doc and object ids that would make good sharding keys.

  • Can I shard on a hieararchy of keys like in cassanda? i.e. if documents are grouped can I encourage the shards to be based on the group_id and then the doc_id?

  • Does cockroachdb use foreign keys as a hint when sharding? E.g. some documents will have FKs to some objects so it would be useful if they share the same shard (to speed up joins).

  • How will single server performance compare to mysql or percona? I understand this is a hard question but I mean in vague terms.

  • I know cockroachdb support postgres’s SQL variant and I don’t think I do too much that’s mysql specific but are there any big mysql gotchas?

Sorry for the long question and appreciate any help with this.

Hi @masq, as you’ve pointed out it’s hard to give a general answer without having all the details, but here’s an attempt:

  1. It depends. We encourage the use of interleaved tables to establish locality for joins, but this is not always possible. We’d have to see a sample schema and some desired join operations.
    2.+3: the key is essentially an encoded version of the columns in the primary key, in the same order. The longer the shared prefix of two rows, the closer they will be together in the keyspace, and the more likely they will be close. So yes, you would specify PRIMARY KEY(group_id, doc_id).
  2. no, the sharding happens based on the primary key
  3. it’s a hard question. We aim to be on par and this doesn’t seem outlandish, though we’re not there for many workloads. And definitely, once you scale out, writes will pay approximately the internode latency. Also note that a lot of performance work is going into the next major version of CockroachDB (i.e. the successor to 1.1). On the other hand, once 3+ nodes are running in the cluster, survivability goes up a whole lot without requiring a lot more operational overhead.
  4. We support a sane subset of Postgres and are pushing for more but not everything. For example, we have ARRAY and (soon) JSON support, but no CTEs (common table expressions) or stored procedures.

Hope that helped! As I said, generally the more concrete information you can supply, the better we can respond.

Hi Tobias, thanks for your detailed response!

I can be more detailed but I just didn’t want to overload the original question. Internally we represent documents using a few tables. There’s a document table with an id and then there’s a paragraph table (one record per paragraph in the document) and a sentence table (one record per sentence). We also have sub-sentence info (info about words or other things in the sentence) in additional tables. So this is really a 1-to-N hierarchy that’s about 5 layers deep.

The tables all link upwards in the hierarchy (e.g. sents has an FK to its parent paragraph and paragraphs have FKs to their parent doc) but we also denormalize keys heavily to reduce joins so e.g. all document-related tables have the doc_id in them making it a natural sharding key. This reduces joins at the expense of larger indices and ultimately this might be a bad idea because the indices will use a lot of RAM (we need to test it). But in all cases the joins I’m talking about will be local. E.g. we might join the doc to para to sent table on the hierarchical keys. If it will be helpful I can send you some sample queries by email.

At its core our data model has two “things” (documents and objects) that we represent hierarchically and then there are a few relations between these things. It sounds like interleaved tables would be a really good match for how we represent these things because really they are “documents” in the nosql sense.

I see from the interleaving page that accessing ranges of interleaved descendant keys can be slow but if I’m doing pure ID lookups does interleaving introduce an additional penalty over just using a normal table?

My question regarding performance was really about single-server performance. I certainly expect per node performance to take a dip when I scale out but i also have no alternative then :slight_smile:

We don’t use anything complicated in mysql, not even the JSON features. We use non-instantiated views (which I see you support). MySQL has a few oddities in but hopefully the API is not that different than postgreses.

Hi @masq, I’ve been working closely with interleave tables and improving join performance on them lately so perhaps I can chime in here.

Interleave tables are indeed the best option for storing “document-style” data and permitting efficient joins. Interleaved table join performance will be greatly improved in the next major release following changes outlined in this RFC

Regarding scans: A full table scan on any interleaved table (document, paragraph, or sentences) is almost always worse. Pure ID lookups or scans with a filter follow the conventional wisdom of SQL indexes. For example, suppose we have the following schemas:

CREATE TABLE documents (doc_id INT PRIMARY, ...);
CREATE TABLE paragraphs (doc_id INT, para_id INT, ..., PRIMARY KEY(doc_id, para_id)
  INTERLEAVE IN PARENT documents (doc_id, para_id);
CREATE TABLE sentences (doc_id INT, para_id INT, sent_id INT, ..., PRIMARY KEY(doc_id, para_id, sent_id))
  INTERLEAVE IN PARENT paragraphs (doc_id, para_id, sent_id);

(Note how the primary keys of tables further down in the hierarchy include their ancestor’s and ancestor’s ancestor’s primary keys as prefixes: this is a requisite of interleave tables in CockroachDB. It seems like you’ve already done so with some of the tables and doc_id).

A full table scan on sentences (i.e. SELECT ... FROM sentences) will be relatively less efficient than if sentences was not interleaved. An ID lookup on sentences on just sent_id

SELECT ... FROM sentences WHERE sent_id = $1

will be a full table scan just like any other SQL database since none of the indexes have sent_id as a prefix. For example, here’s the output from EXPLAIN SELECT * FROM sentences WHERE sent_id = 1

| Level | Type | Field |     Description     |
|     0 | scan |       |                     |
|     0 |      | table | sentences@primary   |
|     0 |      | spans | ALL                 |
(3 rows)

NB the ALL description under spans which denotes a full table scan. On the contrary, if you created an index

CREATE INDEX sentidx ON sentences (sent_id)

this would create the following plan (again using EXPLAIN)

| Level |    Type    | Field |    Description    |
|     0 | index-join |       |                   |
|     1 | scan       |       |                   |
|     1 |            | table | sentences@sentidx |
|     1 |            | spans | /1-/2             |
|     1 | scan       |       |                   |
|     1 |            | table | sentences@primary |
(6 rows)

it indeed does an ordinary index lookup which is just as efficient as if the table was not interleaved. Of course, this will incur a small overhead on writes as with all indexes.

In some cases, you probably have doc_id and para_id (especially in a 1-N hierarchy) which you can specify to perform an efficient ID lookup that re-uses the primary index

SELECT * FROM sentences WHERE doc_id = 1 AND para_id = 2 AND sent_id = 3

which produces the following plan

| Level | Type | Field |                 Description                 |
|     0 | scan |       |                                             |
|     0 |      | table | sentences@primary                           |
|     0 |      | spans | /1/#/52/1/2/#/226/1/3-/1/#/52/1/2/#/226/1/4 |

It’s a little hard to grok, but the # characters separate each ancestor and the value(s) before it is the primary key value (i.e. the FK) for that ancestor. This query would be an efficient point lookup.

It would be great to see some sample queries you’re running to help guide our roadmap on interleave tables!