
MainframeSupports tip week 14/2003:
Despite all our efforts it sometimes happens that dirty data enters our DB2 tables. On
these occations the problem often is that two or more rows has got the same "keyvalue" even
though they weren't supposed to. In these cases it can be rather difficult to track down the
error that caused the problem and also to identify the duplicates and decide which rows
to remove.
You can use the following SQL statement as a model for how to locate rows with the
same "keyvalue" and show you the full contens of these rows. You can even control the
acceptable number of duplicates for a given "keyvalue". The example will show you all
rows in SYSIBM.SYSTABLES where the table name is the same for more than one row.
SELECT tab.*
FROM sysibm.systables tab
, ( SELECT name
FROM sysibm.systables
GROUP BY name
HAVING COUNT(*) > 1
) keyvalue
WHERE tab.name = keyvalue.name
ORDER BY tab.creator, tab.name
The trick is to create a "temporary table" with all the interesting "keyvalues" and then
join these "keyvalues" with the table containing the duplicates. In most cases you will have
a "keyvalue" consisting of more than one column, but the method is the same. The "temporary
table" called keyvalue in the example, must contain all the columns that forms the "keyvalue".
The same columns must be present in the GROUP BY clause. You must also remember to join all
the columns in the "temporary table" with the columns in the real table.
The HAVING COUNT(*) > 1 clause ensures that only rows with more than one "keyvalue" will
be selected from the real table. If your interest are in rows without duplicate "keyvalues"
you can use HAVING COUNT(*) = 1 instead. Or maybe other rules governs your problematic rows.
Now you have seen an easy way of identifying them.
Previous tip in english
Sidste danske tip
Tip list
|