UPSERT that worked in 2.1.5 and fails in 19.1.2

I have a table like (pardon any errors, translating from internal names):

CREATE TABLE foo_bar (
    foo_id STRING(1000),
    bar_id STRING(1000),
    PRIMARY KEY (foo_id, bar_id),
    INDEX (bar_id)
);

In 2.1.5, this query would work:

INSERT INTO foo_bar (foo_id, bar_id) SELECT UNNEST('{"fooval1","fooval2","fooval2","fooval3"}'::STRING[]), 'barval1' ON CONFLICT (foo_id, bar_id) DO NOTHING;

But in 19.1.2, I get:

pq: duplicate key value (foo_id,bar_id)=('fooval2','barval1') violates unique constraint "primary"

Was that a result of this issue or something else? Is this backwards-incompatible change intended? Is this unique to this insert-from-select or select-unnest approach? The topic here (followup) was answered recently suggesting an UPSERT should work fine in these cases.

Any info is appreciated, thanks.

Hey @chad.retz,

Sorry for the delay here. Regarding your question, DO NOTHING in 19.1 only affects conflicts with rows written before the transaction started. It does not affect conflicts between rows written during the transaction, as in your example. Our current guidance is to use DISTINCT to filter out duplicate values in your input query if you can’t guarantee there won’t be any.

Here is an example of that:

WITH
    -- the following data contains duplicates on the conflict column "id":
    inputrows AS (VALUES (8, 130), (8, 140))

  INSERT INTO accounts (id, balance)
    (SELECT DISTINCT ON(id) id, balance FROM inputrows) -- de-duplicate the input rows
    ON CONFLICT (id)
    DO NOTHING;

Let me know if you have any questions.

Thanks,

Ron

1 Like

Thanks for the update! A few quick questions.

First, does this deviate from Postgres?

Second, if so and not already documented, is it worth adding since it can seem kinda surprising that you can’t always avoid conflicts on insert? EDIT: Ah, I consider the docs pointed to here as good enough, I just missed them.

Finally, what would y’all’s guidance be if I wanted to do something different on update instead of do nothing? I’m assuming the answer would be “can’t do conditional row changes on duplicate insert values, must preprocess” (something that hurts without temp tables)

Hey @chad.retz,

In the case of DO NOTHING the answer is yes, we do deviate from Postgres.

I’m not sure if this is documented anywhere, I’ll ask around, and if it isn’t we’re more than happy to refer this to our docs team to make a note somewhere.

Lastly, you may wish to read this on how we propose handling duplicates when importing data.

Regards,

Ron

1 Like