Distributed Join Algorithms in CRDB


(Tim O'Brien) #1

Gitter user @kahole asks:

04:49 Hi everyone, I’m trying to gain some insight into distributed joins in CockroachDB, but I find the documentation/write-ups a bit lacking in this area.
Is it true that for distributed join queries in CockroachDB, the only supported algorithm is “merge join”?

The short answer is no, we also have implemented hash joins though we prefer merge joins since they are more performant. Does that answer your question, or were you interested in our behavior in a specific scenario?


#2

Thanks, thats great :slight_smile:

I have some questions about how the merge-joins are executed:

In the distributed merge-join, are rows streamed into the left-hand side of the join-operator and continuously joined in a “pipeline”-fashion?

Does CockroachDB perform a n-way merge between streams from multiple nodes when data is partitioned sparsely?


(Radu Berinde) #3

Say we join two tables, one has ranges on nodes 1 and 2, and the other has ranges on nodes 2 and 3. We have three MergeJoiner processors, one on each node. Each processor does a merge join - it reads from both streams and matches up groups of rows that match on the equality column(s). The data is distributed between these MergeJoiners by hash (a hash value between 1 and 3 in this case is calculated based on the value on the equality column(s).

Here’s an example (this is the output of EXPLAIN (DISTSQL) ...): https://goo.gl/avjAvj

We have a table numtosquare which is small and is all on node 1, and a table numtostr which is large and has pieces on all 5 nodes. Each piece of data is routed by hash to the correct merge joiner.

Note that hash joins are similarly planned.

In future releases, we plan to improve the planning process to make better decisions about how many joiners to use and on which nodes.


(Radu Berinde) #4

I didn’t answer this question:

Does CockroachDB perform a n-way merge between streams from multiple nodes when data is partitioned sparsely?

Yes, in the example above the right MergeJoiner inputs indicate that the streams are merged to maintain the ordering. The blue thingy is doing the n-way merge.