I’m currently trying to scale up a MySQL-based application and am considering Galera, which seems to offer pretty nice redundancy but not real horizontal scaling in terms of distributing data. My initial idea had been to manually shard MySQL (at the application level), but this will require complex map-reduce operations to distribute queries over all shards. So I’m interested in seeing whether CockroachDB is a good alternative.
The application is relational, but mostly in a hierarchical way: there are “documents” and “objects”, and each of these is represented as a series of hierarchical tables (e.g. documents have tables for documents, paragraphs, sentences, etc.). In some sense I could have used a document store to squash this hierarchy, but there are also inter-doc and object-doc joins (which would be hard with NoSQL).
I mention this because, while the application does a lot of joining, it’s mostly local. There are some document-object joins and some object-object joins which are non-local, but in general most joins are local. By local here I mean that if I were to manually shard, most joins would stay within the same shard (e.g. as specified by object or document id).
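To make “local” concrete, here is a minimal Python sketch of the manual sharding I have in mind. The shard count, the modulo placement and the sample rows are all made up for illustration; the point is only that every table in the document hierarchy is placed by doc id, so a paragraph-sentence join never has to leave its shard.

```python
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, for illustration only


def shard_for(doc_id: int) -> int:
    """Pick a shard from the document id alone (simple modulo placement)."""
    return doc_id % NUM_SHARDS


# Hypothetical rows: (doc_id, paragraph_id) and (doc_id, paragraph_id, sentence_id)
paragraphs = [(1, 10), (1, 11), (2, 20), (5, 50)]
sentences = [(1, 10, 100), (1, 11, 101), (2, 20, 200), (5, 50, 500)]

# Place every row of the hierarchy on the shard chosen by its doc_id.
shards = defaultdict(lambda: {"paragraphs": [], "sentences": []})
for row in paragraphs:
    shards[shard_for(row[0])]["paragraphs"].append(row)
for row in sentences:
    shards[shard_for(row[0])]["sentences"].append(row)


def local_join(shard):
    """Join paragraphs to sentences using only rows stored on one shard."""
    return [
        (p, s)
        for p in shard["paragraphs"]
        for s in shard["sentences"]
        if p[0] == s[0] and p[1] == s[1]
    ]


# Every sentence finds its paragraph without any cross-shard lookup.
joined = [pair for shard in shards.values() for pair in local_join(shard)]
```

A document-object join, by contrast, would need rows from two different shards unless the object happens to hash to the same place, which is the non-local case I mentioned.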
So my questions are:
Is CockroachDB likely to be able to handle this case, which is relatively join-heavy but where joins can be forced to be mostly local by appropriate sharding?
Following up on that, can I encourage or show CockroachDB how I want it to shard? We have app-generated doc and object ids that would make good sharding keys.
Can I shard on a hierarchy of keys, like in Cassandra? I.e. if documents are grouped, can I encourage the shards to be based on the group_id and then the doc_id?
Does CockroachDB use foreign keys as a hint when sharding? E.g. some documents will have FKs to some objects, so it would be useful if they shared the same shard (to speed up joins).
How will single-server performance compare to MySQL or Percona? I understand this is a hard question to answer precisely, so a rough comparison is fine.
I know CockroachDB supports Postgres’s SQL variant, and I don’t think I do too much that’s MySQL-specific, but are there any big MySQL gotchas?
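To make the hierarchy-of-keys question above concrete, here is a minimal Python sketch of the layout I am hoping for. It assumes a sorted key space that is cut into contiguous ranges; whether CockroachDB actually places data this way is exactly what I am asking. The sample ids, the range size and the `split_into_ranges` helper are hypothetical.

```python
# Hypothetical (group_id, doc_id) keys, in arbitrary insertion order.
docs = [(2, 7), (1, 9), (2, 3), (1, 4), (3, 1)]

# Tuples sort by group_id first, then doc_id, so documents of the
# same group end up next to each other in the ordered key space.
ordered = sorted(docs)


def split_into_ranges(keys, size):
    """Cut an ordered key list into contiguous chunks of at most `size` keys."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]


# With a range size of 2, each group lands in a single range here.
ranges = split_into_ranges(ordered, 2)
for r in ranges:
    print(r)
```

With these made-up keys each group fits in one range, so a query restricted to one group_id would touch a single shard. Real groups would of course vary in size and could straddle a range boundary.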
Sorry for the long question; I appreciate any help with this.