MainframeSupport tip about SQL MERGE statement performance

MainframeSupports
tip week 34/2016:

One of the newest additions to the SQL language is the MERGE statement which is a combination of INSERT and UPDATE. I do not have much experience in using MERGE, but as a DBA you stumble across many different things especially fitfalls associated with the use of new statements. First of all you need to get familiar with the syntax. Then you will explore how it works and last thing is to pick up an interest in how efficient it performs. And it was the performance part that made me write this tip.

I will not go into details about the syntax and the way it works. You can read about it in SQL reference MERGE. However there is a tiny but important detail about performance that has been forgotten. I will try to explain the issue here. Imagine a table called MYTABLE having the columns KEYCOL (an integer) and DATACOL (a char/varchar). There is an index on column KEYCOL. Now I execute the following MERGE statement:

MERGE INTO MYTABLE USING (VALUES (1,'The first row')) AS X (A, B) ON (KEYCOL = A) WHEN MATCHED THEN UPDATE SET DATACOL = X.B WHEN NOT MATCHED THEN INSERT (KEYCOL, DATACOL) VALUES (A, B)

Please note the USING part which defines two variables called A and B. A is assigned the value 1 and B the value 'The first row'. In the the VALUES part you can of course use host variables instead of constants. After the USING definition you are free to use A and B where ever you like in the remaining part of the statement. Of course you have to use the prefix X if there are any name collision with columns in the table. ON describes the merge criteria. If the merge criteria is fulfilled the UPDATE part is executed otherwise the INSERT part is executed. In the example it means that if a row exists in MYTABLE with KEYCOL = 1 then DATACOL is overwritten with the value 'The first row' or else a new row is inserted with the vaues of the two variables A and B.

In the above example DB2 will use the index on KEYCOL to determine whether an INSERT or an UPDATE is to be performed. But what happens if the statement looks like this:

MERGE INTO MYTABLE USING (VALUES (2,'The second row', 1)) AS X (A, B, C) ON (KEYCOL = A) WHEN MATCHED THEN UPDATE SET DATACOL = X.B, KEYCOL = X.C WHEN NOT MATCHED THEN INSERT (KEYCOL, DATACOL) VALUES (A, B)

This merge statement can be used to update both KEYCOL and DATACOL. This is pretty flexible, but has the disadvantage of preventing DB2 from using the index on KEYCOL. Instead DB2 carries out a tablespace scan to determine whether the insert part or the update part is to be executed. The simple advice is thus to avoid updating columns that are part of an index that could be exploited by your ON criteria. If you do not take this into consideration MERGE will use a tablespace scan.

Previous tip in english Forrige danske tip Tip list

MainframeSupportstip week 34/2016:

MainframeSupports
tip week 34/2016: