MainframeSupports
tip week 39/2015:

This tip is a goodie for all of you that love SQL. In DB2 10 the so-called OLAP functions were extended heavily with possibilities for sums and totals and rolling averages and a lot of similar stuff. I was introduced to these new functions when DB2 10 was announced, but at that time I just thought that it was not something I would need to use.

One of the things that always irritates me when I make a GROUP BY with some COUNT(*) is the lacking ability to create a total of all the COUNT's. Then I came to think of the new OLAP functions. They were indeed suited to help me, but in order to acheive the desired result I had to give up using GROUP BY. That was a little strange, but the reason is that OLAP functions operates on the result after GROUP BY is applied. When you have fully grasped this concept you will be able to create some wildly crazy SQL statements. Just to illustrate the concept I will show you a less crazy statement. It will work on your installation, but it will most likely not return exactly the same result:

SELECT DISTINCT dbname
     , SUM(colcount) OVER (PARTITION BY dbname)
     , SUM(colcount) OVER (ORDER BY dbname)
     , SUM(1) OVER (PARTITION BY dbname)
     , SUM(1) OVER (ORDER BY dbname)
FROM sysibm.systables
WHERE creator = 'SYSIBM'
  and type = 'T'
ORDER BY dbname

The first peculiar thing to notice is the use of DISTINCT which replaces the GROUP BY. The next peculiar thing is the expressions used after (in this example) the SUM function. OVER specifies that an OLAP function is carried out on the column result. In the following parenthesis the OLAP function is specified. PARTITIONED BY corresponds to GROUP BY while ORDER BY works on the total result. SUM(1) replaces COUNT(*) and ensures that each row is counted as one row before DISTINCT is carried out. Please notice that the column or columns listed after PARTITION BY or ORDER BY does not have to be the same for each column in the result and it is exactly here it can get a little crazy. In this example I have listed the same column even in the ORDER BY at the end of the SQL statement in order to ensure that the result looks like the result after a GROUP BY. And the result may look like this:

DBNAME      COLCOUNT    COLTOTAL  TABLECOUNT  TABLETOTAL
--------+-----------+-----------+-----------+-----------
DSNADMDB           6           6           2           2
DSNATPDB          31          37           2           4
DSNDB01           59          96           3           7
DSNDB06         1982        2078         116         123
DSNDPSM           17        2095           2         125
DSNMQDB           30        2125           2         127
DSNOPTDB         549        2674          21         148
DSNRLST           20        2694           2         150
DSNRTSDB          62        2756           2         152
DSNXSR            43        2799           4         156

The COLCOUNT column contains the number of columns on all SYSIBM tables in the corresponding database. The COLTOTAL column contains a running sum of the number of columns which means that the last row contains the total number of columns on all SYSIBM tables. TABLECOUNT contains the number of SYSIBM tables for each database while TABLETOTAL is the running sum of the count of tables and the last row thus contains the total number of SYSIBM tables. Without using OLAP function I had to use four SQL statements to acheive the same result. Wow, that is truly awesome.

If you wonder why I have not included a link to the SQL reference description of OLAP functions it is because IBM no longer publishes their DB2 manuals in Book Manager format on the internet. IBM had instead introduced something called Information Center which was beyond hopeless. IBM realised this and introduced something called Knowledge Center which is almost just as bad.

Previous tip in english        Forrige danske tip        Tip list