Many installations are still running IMS because IMS is able to position the next output operation to a dataset at the correct place after an abend followed by a restart. The typical situation is a program that reads and updates data in DB2 while at the same time writing to a dataset. I assume that all installations ensure data integrity and concurrency for DB2 programs updating data by issuing COMMIT regularly. Unfortunately, MVS has no built-in feature to COMMIT datasets, and this is where IMS takes over in many installations. IBM has developed a component called RRS (Resource Recovery Services) which can handle COMMIT on datasets and thereby replace IMS in this case. A few installations have developed their own tools to avoid using IMS, RRS, or other products.
For a long time I had wondered how difficult it would be to build a simple restart facility for datasets instead of using various expensive products. My requirement was a job that, if it failed, could be restarted with exactly the same JCL it was originally submitted with. Such jobs are normally the best kind to build for execution in a production environment. My job consisted of the following components:
This approach worked perfectly. My only slight annoyance was having to write a program to perform the copying. The copying may equally well be done with DFSORT or a similar product; there are many ways to do it, and how you do it is entirely up to you.
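To illustrate the idea behind the copying step, here is a minimal sketch in Python (the real program would of course be COBOL, assembler, or a DFSORT step). The function name `copy_committed_records` and the list-based stand-ins for datasets are my own assumptions, not the author's actual program:

```python
def copy_committed_records(old_output, new_output, committed_count):
    """Copy only the first committed_count records -- the portion of the
    old output dataset covered by the last successful COMMIT, as recorded
    in the restart table -- into a fresh output dataset.  Everything
    written after the last COMMIT is deliberately dropped, so the
    restarted job can safely reproduce it."""
    copied = 0
    for record in old_output:
        if copied == committed_count:
            break              # stop at the commit point; ignore the tail
        new_output.append(record)
        copied += 1
    return copied
```

On restart, the job would run this copy step first, then let the processing program resume writing after record `committed_count`.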
There is one important detail you must be aware of. When the processing program breaks down, you must try to gain control so you can close the output dataset. If you do not gain control, there is a (very small) risk that MVS does not flush all records to disk, and in that situation your output dataset might end up containing fewer records than your restart table indicates. It is therefore important that the copying procedure can detect such a situation and terminate further processing.
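That safety check can be sketched as follows, again in Python with hypothetical names, assuming the restart table supplies the committed record count and the record count of the old output can be obtained before copying:

```python
def safe_restart_count(records_on_disk, committed_count):
    """Before copying, verify that the old output dataset holds at least
    as many records as the restart table says were committed.  If MVS
    never flushed its buffers after the abend, the dataset may be short;
    in that case refuse an automatic restart and demand manual recovery."""
    if records_on_disk < committed_count:
        raise RuntimeError(
            "output dataset is short: %d records on disk, %d recorded "
            "as committed -- manual recovery needed"
            % (records_on_disk, committed_count))
    return committed_count
```

A restart job would call this once and abend (here: raise) before touching anything if the counts disagree.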
The sketched approach for restart may of course also be used for programs not using DB2. You then have to find another place to store the information about how much processing has been completed.