How does CockroachDB distinguish between equi-join and non-equi-join?

I noticed that CockroachDB use HashJoin as its default implementation in execbuilder.Builder.buildRelationalcockroach/relational.go at ab0c462672cff1e6928aba2dfc3a5429466aea39 · cockroachdb/cockroach · GitHub

And there is no such a structure like memo.HashJoinExpr or memo.CrossJoinExpr.

But when I try to run a query with non-equi-join, it will display cross join operator:

defaultdb=# create table t(a int);
CREATE TABLE
defaultdb=# create table t1(b int);
CREATE TABLE
defaultdb=# explain select * from t inner join t1 on t.a < t1.b;
          info           
-------------------------
 distribution: full
 vectorized: true
 
 • cross join
 │ pred: a < b
 │
 ├── • scan
 │     missing stats
 │     table: t@primary
 │     spans: FULL SCAN
 │
 └── • scan
       missing stats
       table: t1@primary
       spans: FULL SCAN
(15 rows)

I found the name cross join is produced by joinNode, which is computed with number of equality join conditions in the joinNode.

Does that mean the join type can only be determined as soon as the RelExpr has been built to execPlan? If so, what’s the benefit of this design?

The optimizer will explore other join types before building the chosen query plan into an execPlan. For example, merge joins and lookup joins (for the latter I’m using a lookup join hint to force the optimizer to pick a plan with a lookup join):

defaultdb> CREATE TABLE t1 (a INT, INDEX (a));
CREATE TABLE

defaultdb> CREATE TABLE t2 (b INT, INDEX (b));
CREATE TABLE

defaultdb> EXPLAIN SELECT * FROM t1 INNER JOIN t2 ON t1.a = t2.b;
            info
----------------------------
  distribution: full
  vectorized: true

  • merge join
  │ equality: (a) = (b)
  │
  ├── • scan
  │     missing stats
  │     table: t1@t1_a_idx
  │     spans: FULL SCAN
  │
  └── • scan
        missing stats
        table: t2@t2_b_idx
        spans: FULL SCAN
(15 rows)

defaultdb> EXPLAIN SELECT * FROM t1 INNER LOOKUP JOIN t2 ON t1.a = t2.b;
           info
---------------------------
  distribution: full
  vectorized: true

  • lookup join
  │ table: t2@t2_b_idx
  │ equality: (a) = (b)
  │
  └── • scan
        missing stats
        table: t1@primary
        spans: FULL SCAN
(11 rows)

I got that.

Correct me if I’m wrong. In CockroachDB, a InnerJoin operator can be one of InnerNonApplyJoin(hash join or cross join), MergeJoin, LookupJoin, since you don’t distinguish between logical operators and physical operators. In this case, hash join and cross join are the same variant of RelExpr, so they will have the same cost estimation scheme. The concrete implementation of a InnerNonApplyJoin can only be determined with predicate type in execPlan.