MainframeSupports tip about easy lookup of data in PL/I

MainframeSupports
tip week 51/2019:

It has been a long time since I last wrote a tip about programming. I mostly use REXX, and REXX has a fantastic invention called stems. The basic idea of stems has been implemented in many modern programming languages, where the concept is known as associative arrays. In practice this means that instead of using a numeric value as the index into a table you can use a name or a word. This concept has not yet found its way to PL/I (or COBOL), but I guess it will come.

I have used an idea that is a little similar to associative arrays in PL/I. My first implementation solves problems where I need to determine whether I have met the same word or name before. You could call it a check for duplicates. In REXX this takes very few lines of code using a stem, and it is extremely easy to make it work. I have tried to achieve the same in the following example:

plipgm: PROC(parm) OPTIONS(MAIN) ORDER;

  DCL parm            CHAR(100) VAR;

  DCL dataCache       CHAR(32767) VAR;
  DCL dataSize        FIXED BIN(31);
  DCL sysprint        FILE PRINT;
  DCL sysin           FILE RECORD SEQUENTIAL;
  DCL sysin_data      CHAR(80) VAR;
  DCL sysin_eof       CHAR(1);
  DCL startPos        FIXED BIN(31);
  DCL wordPos         FIXED BIN(31);

  sysin_eof = 'N';
  ON ENDFILE(sysin) sysin_eof = 'Y';

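  /* Maximum number of characters: the storage size minus the     */
  /* 2 byte length prefix of a VARYING string                     */
  dataSize = stg(dataCache) - 2;
  /* The cache always starts with a separator                     */
  dataCache = 'FF'X;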

  OPEN FILE(sysin);
  READ FILE(sysin) INTO(sysin_data);
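  DO WHILE(sysin_eof = 'N');
    sysin_data = trim(sysin_data);
    /* Search for the record enclosed in separators, so that only */
    /* complete entries in the cache can match                    */
    IF index(dataCache, 'FF'X !! sysin_data !! 'FF'X) = 0
    THEN
      /* New record: append it if it still fits in the cache      */
      IF length(dataCache) + length(sysin_data) < dataSize
      THEN
        dataCache = dataCache !! sysin_data !! 'FF'X;
    READ FILE(sysin) INTO(sysin_data);
  END;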
  CLOSE FILE(sysin);

  /* Print every entry: each one ends at the next separator       */
  startPos = 2;
  wordPos = index(dataCache, 'FF'X, startPos);
  DO WHILE(wordPos > 0);
    PUT SKIP LIST(substr(dataCache, startPos, wordPos - startPos));
    startPos = wordPos + 1;
    wordPos = index(dataCache, 'FF'X, startPos);
  END;

END plipgm;

The program reads all records from SYSIN and prints every unique record found, stripped of leading and trailing blanks. Thus duplicates are removed. The unique records are stored in a character string of variable length with up to 32767 characters. The unique records are separated by the hex value FF. If this value occurs in SYSIN, the program will not work properly.

Instead of a traditional solution using an array I use a character string, as mentioned earlier. To find elements in the character string I use the INDEX function, searching for the record enclosed in separators so that only complete records can match. This solution is far more flexible than an array, where I would have to limit the number of elements as well as the length of each individual element.
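
The lookup can be isolated in a small function. The following is only a minimal sketch and not part of the program above: the names lookup, isKnown and wordList are made up for illustration, and the list is filled by hand instead of from SYSIN.

```pli
lookup: PROC OPTIONS(MAIN);

  DCL sysprint        FILE PRINT;
  DCL wordList        CHAR(32767) VAR;

  /* Same layout as dataCache: a hex FF on both sides of each word */
  wordList = 'FF'X !! 'ALPHA' !! 'FF'X !! 'BETA' !! 'FF'X;

  IF isKnown(wordList, 'BETA')
  THEN
    PUT SKIP LIST('BETA is known');
  ELSE
    PUT SKIP LIST('BETA is unknown');

  /* 'ALP' is part of 'ALPHA', but it is not a stored word         */
  IF isKnown(wordList, 'ALP')
  THEN
    PUT SKIP LIST('ALP is known');
  ELSE
    PUT SKIP LIST('ALP is unknown');

  /* Hypothetical helper: the search argument includes both        */
  /* separators, so only complete words can match                  */
  isKnown: PROC(list, word) RETURNS(BIT(1));
    DCL list          CHAR(*) VAR;
    DCL word          CHAR(*) VAR;
    RETURN(index(list, 'FF'X !! word !! 'FF'X) > 0);
  END isKnown;

END lookup;
```

With this list BETA is reported as known and ALP as unknown, because ALP never occurs with a separator on both sides.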

The main disadvantage is the limit in PL/I of 32767 characters on a character string. When the total length of all unique records, including separators, would pass 32767, new unique records are simply ignored. Thus it is important to evaluate whether this limit can be reached before using the above solution in another context, and to decide what should happen if it is reached. The other restriction is that the separator (hex FF in my example) must not be part of the data.
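
One possible answer is to report both situations to the caller instead of ignoring them. The following is a rough sketch under my own assumptions: the name addWord and the return codes 0 to 3 are invented for this example, the list is declared very small on purpose, and the MAXLENGTH builtin function is assumed to be available in the compiler used.

```pli
addpgm: PROC OPTIONS(MAIN);

  DCL sysprint        FILE PRINT;
  DCL wordList        CHAR(20) VAR;     /* tiny on purpose         */

  wordList = 'FF'X;
  PUT SKIP LIST(addWord(wordList, 'ALPHA'));        /* 0: stored   */
  PUT SKIP LIST(addWord(wordList, 'ALPHA'));        /* 1: known    */
  PUT SKIP LIST(addWord(wordList, 'A' !! 'FF'X));   /* 2: bad data */
  PUT SKIP LIST(addWord(wordList, 'A_WORD_THAT_IS_TOO_LONG'));
                                                    /* 3: no room  */

  /* Hypothetical helper with invented return codes:               */
  /* 0 = stored, 1 = already in the list,                          */
  /* 2 = word contains the separator, 3 = no room left             */
  addWord: PROC(list, word) RETURNS(FIXED BIN(31));
    DCL list          CHAR(*) VAR;
    DCL word          CHAR(*) VAR;

    IF index(word, 'FF'X) > 0
    THEN
      RETURN(2);
    IF index(list, 'FF'X !! word !! 'FF'X) > 0
    THEN
      RETURN(1);
    /* Same test as in the tip, with MAXLENGTH as the limit        */
    IF length(list) + length(word) >= maxlength(list)
    THEN
      RETURN(3);
    list = list !! word !! 'FF'X;
    RETURN(0);
  END addWord;

END addpgm;
```

The caller can then decide what a full list means in its own context, for example stopping with a message or writing the overflow records to a separate file.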

I think the above is a very simple solution to code, as the INDEX function does all the hard work. Another benefit of using INDEX on the mainframe is that there is a matching machine instruction, which makes the code perform extremely fast (provided the PL/I compiler makes an optimal translation).
