MainframeSupports
tip week 38/2008:

Some time ago I wrote a tip about optimizing an OUTER JOIN. Later on that tip gave me an idea which I want to share with you in this tip of the week. The idea is to rewrite existing programs containing a specific IF structure into an OUTER JOIN. The main structure in such programs is a SQL statement finding all the rows to be processed. For some of these rows additional information has to be retrieved in one or more other tables. It may look somthing like this:

EXEC SQL DECLARE MAINCURSOR CURSOR FOR
  SELECT COLUMNA, ..., COLUMNX
  FROM MAIN_TABLE;
EXEC SQL OPEN MAINCURSOR;
EXEC SQL FETCH MAINCURSOR ...;
WHILE SQLCODE = 0 DO
  EXTRA = ''
  IF MAINCURSOR.COLUMNX = 'MORE INFO NEEDED'
  THEN DO
    EXEC SQL
      SELECT EXTRA INTO :EXTRA
      FROM MORE_INFO_TABLE
      WHERE KEYCOLUMN = :MAINCURSOR.COLUMNA;
  END
  /* DO MORE WITH ALL GATHERED INFORMATION */
  EXEC SQL FETCH MAINCURSOR ...;
END
EXEC SQL CLOSE MAINCURSOR;

The above program structure may be simplified into:

EXEC SQL DECLARE MAINCURSOR CURSOR FOR
  SELECT MAIN.COLUMNA, ..., VALUE(MORE.EXTRA, '')
  FROM MAIN_TABLE MAIN
  LEFT OUTER JOIN MORE_INFO_TABLE MORE
  ON MAIN.COLUMNX = 'MORE INFO NEEDED'
  AND MAIN.COLUMNA = MORE.KEYCOLUMN;
EXEC SQL OPEN MAINCURSOR;
EXEC SQL FETCH MAINCURSOR ...;
WHILE SQLCODE = 0 DO
  /* DO MORE WITH ALL GATHERED INFORMATION */
  EXEC SQL FETCH MAINCURSOR ...;
END
EXEC SQL CLOSE MAINCURSOR;

The IF statement has disappeared and the MAINCURSOR has been transformed into an OUTER JOIN. The result of executing these two pieces of code is exactly the same. All that is left is to discuss the beaty of the two pieces of code and which code performs the best. In my view there is no doubt that the OUTER JOIN example gives the simplest code even though MAINCURSOR is a bit more complicated to understand.

Far more interesting is the performance of the two examples. In both examples MAINCURSOR will return the same amount of rows. The number of executed FETCH statements is the same. In both cases COLUMNX will be examined for all rows and the number of accesses in table MORE_INFO_TABLE will also be the same. The savings in the OUTER JOIN example are the ressources related to execute the extra SQL SELECT statement in the first example. If the extra SQL statement has to be performed a lot of times the savings are significant, but it is not worthwhile to rewrite the program if less than 10% of the rows fetched in the MAINCURSOR results in an extra SQL SELECT.

I have made some measurements and the idea introduced here does not yield big performance savings as you sometimes may experience with DB2. I experienced savings of 10-30% by using OUTER JOIN instead of IF depending on the number of saved SQL SELECT statement executions. In version 4 and sometimes in version 5 of DB2 the program rewrite would result in very bad performance. I will only recommend the idea if you are running version 6 or later, but who is not nowadays.

Previous tip in english        Sidste danske tip        Tip list