Top N entries from every group – Lateral join with a very large right table

Tables: Room, Item, and UserRoom. Every item has creation time, some content and belongs to a room. Every room has one or more participants. UserRoom keeps track of which rooms user is part of. Simplified looks like this:

CREATE TABLE Room (
    Identifier UUID PRIMARY KEY
);

CREATE TABLE UserRoom (
    UserId INT8 NOT NULL,
    RoomIdentifier UUID NOT NULL REFERENCES Room (Identifier),
    PRIMARY KEY (UserId, RoomIdentifier)
);

CREATE TABLE Item (
    RoomIdentifier UUID NOT NULL REFERENCES Room (Identifier),
    TimeOfCreation INT8 NOT NULL,
    Content STRING NOT NULL,
    PRIMARY KEY (RoomIdentifier, TimeOfCreation)
);

Dummy content:

INSERT INTO Room
VALUES
    ('4294824e-a063-46c9-9220-99c6250720db'::UUID),
    ('650f28d3-53fa-471a-99ed-63a8f09bc503'::UUID),
    ('b6dd6203-c527-45dd-8c2c-f5ee36575f6a'::UUID);

INSERT INTO UserRoom (UserId, RoomIdentifier)
VALUES
    (1, '4294824e-a063-46c9-9220-99c6250720db'),
    (1, '650f28d3-53fa-471a-99ed-63a8f09bc503'),
    (1, 'b6dd6203-c527-45dd-8c2c-f5ee36575f6a');

INSERT INTO Item (RoomIdentifier, TimeOfCreation, Content)
VALUES
    ('4294824e-a063-46c9-9220-99c6250720db', 1, 'Hello 1'),
    ('4294824e-a063-46c9-9220-99c6250720db', 2, 'Hello 2'),
    ('4294824e-a063-46c9-9220-99c6250720db', 3, 'Hello 3'),
    ('4294824e-a063-46c9-9220-99c6250720db', 4, 'Hello 4'),
    ('4294824e-a063-46c9-9220-99c6250720db', 5, 'Hello 5'),
    
    ('650f28d3-53fa-471a-99ed-63a8f09bc503', 1, 'Bye 1'),
    ('650f28d3-53fa-471a-99ed-63a8f09bc503', 2, 'Bye 2'),
    ('650f28d3-53fa-471a-99ed-63a8f09bc503', 3, 'Bye 3');

I want to retrieve items (and rooms) that user hasn’t seen yet. User can be added to a new room by someone else while he is offline (there will be a new entry in the UserRoom table for the user). So when he appears online there might be new empty rooms, new rooms that already have items in them, and old rooms with new content.

To determine which content is new, user, when appears online, sends a Join message to the server in which he specifies which rooms he has already seen and also every room’s last received item’s creation time.

RoomIdentifier1: LastReceivedItemTimeOfCreation
RoomIdentifier2: LastReceivedItemTimeOfCreation
...

Say, user currently participates in three rooms and he was added to the first and last rooms while he was offline (so they are new to him). When he appears online the first room already has 5 items in it, while the last one is still empty. This is what the insert code snippet above does.

User has already seen the second room, so he sends to the server this:

"650f28d3-53fa-471a-99ed-63a8f09bc503": 3

meaning, for the room “650f28d3-53fa-471a-99ed-63a8f09bc503” the last item he received has creation time == 3.

To retrieve new items from every room and to limit retrieval to only 3 latest items per room I run this:

WITH a (RoomIdentifier, TimeOfCreation) AS ( -- cte containing values that the user supplied
    VALUES
        ('650f28d3-53fa-471a-99ed-63a8f09bc503'::UUID, 3)
)

SELECT
    c.RoomIdentifier,
    d.TimeOfCreation,
    d.Content
FROM (
    SELECT -- (3) new rooms won't have a corresponding entry in "a", so replace NULL with 0
           -- the result will be a table of type "RoomIdentifier" — "LatestReceivedItemTimeOfCreation"
           -- with values for every room (seen and not seen). Not seen rooms will have 0 in the second column
        b.RoomIdentifier RoomIdentifier,
        CASE
            WHEN a.TimeOfCreation IS NULL THEN 0
            ELSE a.TimeOfCreation
        END TimeOfCreation
    FROM (
        SELECT RoomIdentifier -- (1) retrieve all rooms that the user is part of
        FROM UserRoom
        WHERE UserId = 1
    ) AS b LEFT JOIN a -- (2) there might be new rooms, so left join with "a"
        ON (b.RoomIdentifier = a.RoomIdentifier)
) AS c LEFT JOIN LATERAL -- (4) now can lateral join with Item table, filtering out already seen entries
    (
        SELECT TimeOfCreation, Content
        FROM Item
        WHERE RoomIdentifier = c.RoomIdentifier AND TimeOfCreation > c.TimeOfCreation
        ORDER BY TimeOfCreation DESC
        LIMIT 3
    ) AS d
    ON true
WHERE c.TimeOfCreation = 0 OR d.Content IS NOT NULL
ORDER BY c.RoomIdentifier ASC;

My question: is there a better, more performant way to do this? Item table can potentially be very big, while the number of rooms in which user participates in is rarely more than 20 or so.

This is what EXPLAIN ANALYZE says: DistSQL

I’m not sure what is local apply join. Does lateral join use indexes? Will it have to iterate over the entire Item table for every entry in “c”?

Will using row_number over partition on the Item table be better for limiting output to only 3 items?

Hi Arctur! Welcome to the forum.

In our docs, we do have some guidance around JOIN expressions | CockroachDB Docs performance which may answer some of your questions. You should be able to verify that indexes are used by looking at the EXPLAIN output.

1 Like

Thanks for the response, but the EXPLAIN output doesn’t actually say anything it this case. There is a link to the EXPLAIN ANALYZE output in the first post. For the first join it says that Hash join and index are used, but for the second join it just says “apply local join” and that’s it. And this apply join has only one arrow coming in instead of two like other joins.

I don’t have a concrete suggestion for what you should change, but I do have some more information to share.

Apply joins are very inefficient. We plan apply joins when it’s not possible to decorrelate a lateral join, so we need to basically plan and run a subquery for each row in the left side of the join.

You can try to rewrite the query so that we can effectively decorrelate it and plan a regular join, using EXPLAIN to verify. Eliminating the LATERAL JOIN altogether should do it, but that may be difficult.

We think replacing the limit may solve the problem. In order to decorrelate the query, we need to pull the correlated filter above the lateral join, but it isn’t valid to pull the filter above the limit. Your idea to use row_number to replace the limit is worth a try, so long as it takes place above the lateral join.