Preferences

SQL MERGE looks great! I hope I remember it when the time comes, instead of writing 3 separate queries.

edit: Postgres docs on MERGE: https://www.postgresql.org/docs/15/sql-merge.html


Hmmm. The doc kinda suggests that this might be more efficient than doing it with separate commands:

"First, the MERGE command performs a join from data_source to target_table_name producing zero or more candidate change rows. For each candidate change row, the status of MATCHED or NOT MATCHED is set just once, after which WHEN clauses are evaluated in the order specified. For each candidate change row, the first clause to evaluate as true is executed."

Anybody know more about this?

From lots of experience with SQL Server, I know that over there, MERGE is not more efficient, it's just syntactic sugar - and in fact it's buggy syntactic sugar, as there are some conditions where it doesn't handle concurrency properly.

Section 13.2. Transaction Isolation" has some additional information regarding the behavior of MERGE. Just CTRL-F and search for "MERGE" on https://www.postgresql.org/docs/15/transaction-iso.html#XACT...
I'm actually surprised to hear that MERGE is only now available on Postgres.

I'm now interesting in hearing about other standard (what I have come to expect as standard) SQL that's not or only now available on Postgres?

Well MS SQL Merge statement is not very good, and I personally avoid it, and most places I worked in recommend to avoid it, except in the simplest scenarios

From the docs "At scale, MERGE may introduce complicated concurrency issues or require advanced troubleshooting. As such, plan to thoroughly test any MERGE statement before deploying to production."

I dont know if its better in PQSQL , but they took their time, so maybe it is

> Well MS SQL Merge statement is not very good

That is rather an understatement. IIRC unless you are careful with transaction isolation levels and other options, a single MERGE statement can still deadlock against itself.

See https://michaeljswart.com/2021/08/what-to-avoid-if-you-want-... and the older page it links to, mongst many references to the problems you can encounter.

Though things are a fair amount better than they used to be, back in 2008 when MERGE was first introduced, with most of the persistent buggy behaviours pertaining to specific features that not everyone uses, I still completely avoid it.

Postgres’ history usually suggests they don’t ship broken database features which is why most of us reach for it as a first option when choosing a database. The MSSQL warning sounds bad enough that I’d never use this feature!
Well, the Postgres docs about MERGE include a similar warning: "When MERGE is run concurrently with other commands that modify the target table, the usual transaction isolation rules apply; see [Concurrency control / Transaction isolation] for an explanation on the behavior at each isolation level. You may also wish to consider using INSERT ... ON CONFLICT as an alternative statement which offers the ability to run an UPDATE if a concurrent INSERT occurs. There are a variety of differences and restrictions between the two statement types and they are not interchangeable." https://www.postgresql.org/docs/current/sql-merge.html
That's not a warning, instead, it's how it should work, and how one would want it to work, i.e. that the transaction isolation rules apply. Lack of this, would have warranted a warning.
Agreed. The amount of deadlocks merged causes in MS SQL is pretty insane.

We tried to use it for "upsert" type capabilities and even that would cause weird deadlocks.

Postgres already has the `Insert foo on conflict do update` type syntax which I think is generally better.

> Postgres already has the `Insert foo on conflict do update` type syntax which I think is generally better.

INSERT... ON CONFLICT is awesome, but it has some limitations.

The one I ran into most commonly is that it can only handle exactly one unique constraint on the target table. So if you have both a PK and another unique index, you need to choose which one gets the simple 'on conflict' and which one gets a hacky workaround (locks/transactions, triggers, exception handling, etc.)

If I'm reading the MERGE docs right, you can handle that case:

   WHEN MATCHED AND old.pkey = new.pkey THEN UPDATE SET value = new.value
   WHEN MATCHED AND old.col1 = new.col1 AND old.col2 = new.col2 THEN UPDATE SET reps = reps + 1
   WHEN NOT MATCHED THEN INSERT [...]
MERGE does not have locking built in in the way that ON CONFLICT does, so it handles more cases and fewer cases all at the same time.
Agreed, another vote here from painful experience to avoid MERGE in SQL Server like the plague.
Is that a consequence of MSSQL using row locks rather than MVCC by default? Or is it still the same even when you enable snapshot transactions?
It kind of did support it before. You could do an INSERT … ON CONFLICT ( keys here) DO UPDATE update query here
This only works if you reference a combination of columns that has been declared as unique which is not always something you want or can do.
Merge has a few more capabilities beyond that, but I think that's where you'd want to use it the most.
The reason is that PostgreSQL has INSERT ... ON CONFLICT which is usually what you want, especially since it handles concurrency in the way you usually want. MERGE has more capabilities but not enough of them to make such a complex feature prioritized.
At least for some cases, there was a workaround by using INSERT ... ON CONFLICT
I would argue that INSERT ... ON CONFLICT is not just a workaround but the correct solution in most cases. It is very explicit about what you want and makes sure that either it can take the correct locks or it will error out if there is no unique index/primary key that it can use to take the lock. But, yes, MERGE can do more things than INSERT ... ON CONFLICT.
I'm currently neck deep in a decent sized oracle to postgres project, and MERGE INTO saved me many many hours

This item has no comments currently.