Tables: Room, Item, and UserRoom. Every item has creation time, some content and belongs to a room. Every room has one or more participants. UserRoom keeps track of which rooms user is part of. Simplified looks like this:
CREATE TABLE Room ( Identifier UUID PRIMARY KEY ); CREATE TABLE UserRoom ( UserId INT8 NOT NULL, RoomIdentifier UUID NOT NULL REFERENCES Room (Identifier), PRIMARY KEY (UserId, RoomIdentifier) ); CREATE TABLE Item ( RoomIdentifier UUID NOT NULL REFERENCES Room (Identifier), TimeOfCreation INT8 NOT NULL, Content STRING NOT NULL, PRIMARY KEY (RoomIdentifier, TimeOfCreation) );
INSERT INTO Room VALUES ('4294824e-a063-46c9-9220-99c6250720db'::UUID), ('650f28d3-53fa-471a-99ed-63a8f09bc503'::UUID), ('b6dd6203-c527-45dd-8c2c-f5ee36575f6a'::UUID); INSERT INTO UserRoom (UserId, RoomIdentifier) VALUES (1, '4294824e-a063-46c9-9220-99c6250720db'), (1, '650f28d3-53fa-471a-99ed-63a8f09bc503'), (1, 'b6dd6203-c527-45dd-8c2c-f5ee36575f6a'); INSERT INTO Item (RoomIdentifier, TimeOfCreation, Content) VALUES ('4294824e-a063-46c9-9220-99c6250720db', 1, 'Hello 1'), ('4294824e-a063-46c9-9220-99c6250720db', 2, 'Hello 2'), ('4294824e-a063-46c9-9220-99c6250720db', 3, 'Hello 3'), ('4294824e-a063-46c9-9220-99c6250720db', 4, 'Hello 4'), ('4294824e-a063-46c9-9220-99c6250720db', 5, 'Hello 5'), ('650f28d3-53fa-471a-99ed-63a8f09bc503', 1, 'Bye 1'), ('650f28d3-53fa-471a-99ed-63a8f09bc503', 2, 'Bye 2'), ('650f28d3-53fa-471a-99ed-63a8f09bc503', 3, 'Bye 3');
I want to retrieve items (and rooms) that user hasn’t seen yet. User can be added to a new room by someone else while he is offline (there will be a new entry in the UserRoom table for the user). So when he appears online there might be new empty rooms, new rooms that already have items in them, and old rooms with new content.
To determine which content is new, user, when appears online, sends a Join message to the server in which he specifies which rooms he has already seen and also every room’s last received item’s creation time.
RoomIdentifier1: LastReceivedItemTimeOfCreation RoomIdentifier2: LastReceivedItemTimeOfCreation ...
Say, user currently participates in three rooms and he was added to the first and last rooms while he was offline (so they are new to him). When he appears online the first room already has 5 items in it, while the last one is still empty. This is what the insert code snippet above does.
User has already seen the second room, so he sends to the server this:
meaning, for the room “650f28d3-53fa-471a-99ed-63a8f09bc503” the last item he received has creation time == 3.
To retrieve new items from every room and to limit retrieval to only 3 latest items per room I run this:
WITH a (RoomIdentifier, TimeOfCreation) AS ( -- cte containing values that the user supplied VALUES ('650f28d3-53fa-471a-99ed-63a8f09bc503'::UUID, 3) ) SELECT c.RoomIdentifier, d.TimeOfCreation, d.Content FROM ( SELECT -- (3) new rooms won't have a corresponding entry in "a", so replace NULL with 0 -- the result will be a table of type "RoomIdentifier" — "LatestReceivedItemTimeOfCreation" -- with values for every room (seen and not seen). Not seen rooms will have 0 in the second column b.RoomIdentifier RoomIdentifier, CASE WHEN a.TimeOfCreation IS NULL THEN 0 ELSE a.TimeOfCreation END TimeOfCreation FROM ( SELECT RoomIdentifier -- (1) retrieve all rooms that the user is part of FROM UserRoom WHERE UserId = 1 ) AS b LEFT JOIN a -- (2) there might be new rooms, so left join with "a" ON (b.RoomIdentifier = a.RoomIdentifier) ) AS c LEFT JOIN LATERAL -- (4) now can lateral join with Item table, filtering out already seen entries ( SELECT TimeOfCreation, Content FROM Item WHERE RoomIdentifier = c.RoomIdentifier AND TimeOfCreation > c.TimeOfCreation ORDER BY TimeOfCreation DESC LIMIT 3 ) AS d ON true WHERE c.TimeOfCreation = 0 OR d.Content IS NOT NULL ORDER BY c.RoomIdentifier ASC;
My question: is there a better, more performant way to do this? Item table can potentially be very big, while the number of rooms in which user participates in is rarely more than 20 or so.
This is what EXPLAIN ANALYZE says: DistSQL
I’m not sure what is local apply join. Does lateral join use indexes? Will it have to iterate over the entire Item table for every entry in “c”?
Will using row_number over partition on the Item table be better for limiting output to only 3 items?