Slow Join Query Performance Problems

sql
(Jeremy Bush ) #1

Hi there-

I’m currently using Postgres and evaluating switching my project to cockroachdb. I have existing queries in my app that are really fast in postgres, but much slower in cockroachdb. I’m using v19.1.0 released just recently.

Here’s my query:

SELECT COUNT(*) FROM "user_card_conditions" INNER JOIN "card_conditions" ON "card_conditions"."id" = "user_card_conditions"."card_condition_id" INNER JOIN "cards" ON "cards"."id" = "card_conditions"."card_id" INNER JOIN "card_sets" ON "card_sets"."id" = "cards"."card_set_id" WHERE "user_card_conditions"."user_id" = '10cd581e-18bc-4cc3-896c-e1552b8e8420' AND (cards.name ~* 'some name') AND (card_conditions.condition = 'near-mint');

This query takes about 700ms to execute on my cluster. I was running v2.1.0 and this took over 1.5s to run on that version.

Here’s my table schemas. They were imported from my existing postgres database using pg_dump:

CREATE TABLE user_card_conditions (
id UUID NOT NULL DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
card_condition_id UUID NOT NULL,
quantity INT8 NOT NULL DEFAULT 0:::INT8,
price INT8 NOT NULL DEFAULT 0:::INT8,
created_at TIMESTAMP NOT NULL,
updated_at TIMESTAMP NOT NULL,
CONSTRAINT user_card_conditions_pkey PRIMARY KEY (id ASC),
CONSTRAINT fk_rails_814558ae8b FOREIGN KEY (card_condition_id) REFERENCES card_conditions (id) ON DELETE CASCADE,
INDEX index_user_card_conditions_on_card_condition_id (card_condition_id ASC),
CONSTRAINT fk_rails_1c477dc301 FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE CASCADE,
INDEX index_user_card_conditions_on_user_id (user_id ASC),
INDEX user_card_conditions_user_id_quantity_idx (user_id ASC, quantity ASC),
FAMILY "primary" (id, user_id, card_condition_id, quantity, price, created_at, updated_at)
)

CREATE TABLE card_conditions (
id UUID NOT NULL DEFAULT gen_random_uuid(),
card_id UUID NOT NULL,
condition STRING NOT NULL,
created_at TIMESTAMP NOT NULL,
updated_at TIMESTAMP NOT NULL,
CONSTRAINT card_conditions_pkey PRIMARY KEY (id ASC),
CONSTRAINT fk_rails_5fee70a3a8 FOREIGN KEY (card_id) REFERENCES cards (id) ON DELETE CASCADE,
INDEX index_card_conditions_on_card_id (card_id ASC),
FAMILY "primary" (id, card_id, condition, created_at, updated_at)
)

CREATE TABLE cards (
id UUID NOT NULL DEFAULT gen_random_uuid(),
card_set_id UUID NOT NULL,
scryfall_id STRING NOT NULL,
name STRING NOT NULL,
type_line STRING NOT NULL,
rarity STRING NOT NULL,
multiverseid INT8 NULL,
number STRING NULL,
names STRING[] NULL,
mana_cost STRING NULL,
cmc FLOAT8 NULL,
colors STRING[] NULL,
color_identity STRING[] NULL,
supertypes STRING[] NULL,
types STRING[] NULL,
subtypes STRING[] NULL,
text STRING NULL,
flavor STRING NULL,
artist STRING NULL,
power STRING NULL,
toughness STRING NULL,
loyalty INT8 NULL,
variations STRING[] NULL,
watermark STRING NULL,
border STRING NULL,
reserved BOOL NULL,
layout STRING NULL,
created_at TIMESTAMP NOT NULL,
updated_at TIMESTAMP NOT NULL,
photo STRING NULL,
collector_number STRING NULL,
image STRING NULL,
CONSTRAINT cards_pkey PRIMARY KEY (id ASC),
CONSTRAINT fk_rails_2b6dc2726c FOREIGN KEY (card_set_id) REFERENCES card_sets (id) ON DELETE CASCADE,
INDEX index_cards_on_card_set_id (card_set_id ASC),
FAMILY "primary" (id, card_set_id, scryfall_id, name, type_line, rarity, multiverseid, number, names, mana_cost, cmc, colors, color_identity, supertypes, types, subtypes, text, flavor, artist, power, toughness, loyalty, variations, watermark, border, reserved, layout, created_at, updated_at, photo, collector_number, image)
)

CREATE TABLE card_sets (
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    name VARCHAR NOT NULL,
    code VARCHAR NOT NULL,
    set_type VARCHAR NOT NULL,
    released_at DATE NOT NULL,
    created_at TIMESTAMP NOT NULL,
    updated_at TIMESTAMP NOT NULL,
    CONSTRAINT card_sets_pkey PRIMARY KEY (id ASC),
    FAMILY "primary" (id, name, code, set_type, released_at, created_at, updated_at)
)

Here’s the EXPLAIN for this query:

              tree              |       field        |                   description
+-------------------------------+--------------------+--------------------------------------------------+
  group                         |                    |
   │                            | aggregate 0        | count_rows()
   │                            | scalar             |
   └── render                   |                    |
        └── hash-join           |                    |
             │                  | type               | inner
             │                  | equality           | (card_condition_id) = (id)
             │                  | right cols are key |
             ├── scan           |                    |
             │                  | table              | user_card_conditions@user_card_conditions_pkey
             │                  | spans              | ALL
             │                  | filter             | user_id = '10cd581e-18bc-4cc3-896c-e1552b8e8420'
             └── hash-join      |                    |
                  │             | type               | inner
                  │             | equality           | (card_id) = (id)
                  │             | right cols are key |
                  ├── scan      |                    |
                  │             | table              | card_conditions@card_conditions_pkey
                  │             | spans              | ALL
                  │             | filter             | condition = 'near-mint'
                  └── hash-join |                    |
                       │        | type               | inner
                       │        | equality           | (card_set_id) = (id)
                       │        | right cols are key |
                       ├── scan |                    |
                       │        | table              | cards@cards_pkey
                       │        | spans              | ALL
                       │        | filter             | name ~* 'some name'
                       └── scan |                    |
                                | table              | card_sets@card_sets_pkey
                                | spans              | ALL
(31 rows)

Time: 34.813499ms

I’ve been reading the join docs and i believe this is slow because of all the hash joins. However it says if the indexes all match up it should use a merge join. Unless I’m misunderstanding I believe that all my indexes match across the join tables.

How can I make this perform better?

(Ron Arévalo) #2

Hey @jeremybush,

Thanks for all the details. I spoke with our optimizer team, and it looks like the reason for the hash-join over the merge-join has to do with the fact that your indexes are not covering. meaning that we can merge the indexes, however some of the columns that we will need higher up the tree aren’t part of the index, so our optimizer decides to use the primary key and hash-joins. If you could provide us with the EXPLAIN(opt, env) of the query, our optimizer team may be able to provide some better feedback.

Thanks,

Ron

(Jeremy Bush ) #3

The output is too big to post here, so here’s a gist:

(Ron Arévalo) #4

Hey @jeremybush,

Thanks for this, someone from our Optimizer team will be looking into this for you.

Thanks,

Ron