Apr 27, 2017

Checklist for data migration using Redirected Recovery


In many of my previous blogs I mentioned that we (the DBA team) use the Redirected Recovery feature of Recovery Expert for z/OS v3.2 for data migrations from one DB2 sub-system to another. The data migration (using Redirected Recovery) is done in three phases.



      Phase 1: Create Profiles and update Recovery options in Recovery Expert
This is a pretty straightforward process. We add all the Tablespaces in a database needed for data migration to an Application profile. Usually, nothing can go wrong in this phase.


      Phase 2: Execute Build JCL (that builds/generates the data migration JCLs)
In the Build JCL, we provide the following details to build/generate the data migration JCLs.
       a) Application profile name (having the objects to be migrated)
       b) Timestamp for PIT recovery
          (Redirected Recovery can do only PIT recovery to another sub-system)
       c) PDS name (data migration JCLs will be created in this PDS)
       d) Plan number for the Recovery (it's 29 for Redirected Recovery)

This job can fail for many reasons, and the frustrating part is that it's not easy to find out why. Most of the time we see just the ARYZ018E error message, with a list of Objects that can't be recovered. The reason why the Objects are not recoverable is not provided in the job output.

After discussions with the Rocket Software tool support team, we got a checklist to go through for the Build job to be successful. The checklist identifies the Objects that can lead to failure of the Build job, so these Objects can be excluded from the data migration process. With the help of JCL and REXX, I automated the checklist. The JCL has multiple steps to verify the below.

1. Check whether all TS' in the Source database have ICTYPE='F' entries in SYSCOPY for a given ICDATE. I use this SQL query for verification.
SELECT
   DBNAME
  ,NAME
FROM SYSIBM.SYSTABLESPACE
WHERE
   DBNAME=<dbname> AND
   NAME NOT IN
              (SELECT
                 TSNAME
               FROM SYSIBM.SYSCOPY
               WHERE
                  DBNAME=<dbname> AND
                  ICDATE=<date>)
WITH UR;

2. Check whether all Image copy datasets (of the TS' checked in the above step) are present in the ICF catalog. I coded the below REXX routine for this verification.
LIST_PARA:
 /* INP. stem holds the Image copy dataset names read from input */
 SAY '*****SYSCOPY ENTRIES NOT IN ICF CATALOG*****'
 DO I = 1 TO INP.0
   IC_DSN = STRIP(INP.I,"B")
   X = OUTTRAP('LSTC.')
   ADDRESS TSO "LISTCAT ENT('"IC_DSN"')"
   X = OUTTRAP('OFF')
   Y = STRIP(WORD(LSTC.1,1),"B")
   IF Y <> 'NONVSAM' THEN             /* LISTCAT found no NONVSAM entry */
     SAY IC_DSN 'NOT FOUND IN ICF CATALOG'
 END
 SAY '********************************************'
RETURN

3. Check whether any Objects (of TS' included in the Application profile) have been altered.
The CURRENT_VERSION of a TS can change for many reasons, and Redirected Recovery can't migrate data from a versioned TS to a non-versioned TS. I coded an Objects' structures comparison process to verify this.
Versioning can also be found using the below query.
SELECT
   TS.DBNAME
  ,TS.NAME
  ,TB.NAME
  ,TS.OLDEST_VERSION
  ,TS.CURRENT_VERSION
  ,TB.VERSION
FROM SYSIBM.SYSTABLESPACE TS
     INNER JOIN
SYSIBM.SYSTABLES TB ON
   TS.DBNAME = TB.DBNAME  AND
   TS.NAME = TB.TSNAME
WHERE
   TS.DBNAME='dbname'  AND
   TB.TYPE = 'T'       AND
   TS.CURRENT_VERSION <> TB.VERSION;
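For illustration, an alter like the below (the table and column names are made up for this example) is a versioning alter: it creates a new row-format version and moves the Tablespace's CURRENT_VERSION ahead.

```sql
-- Hypothetical example: enlarging a VARCHAR column is a versioning
-- alter. Existing rows keep the old format until a REORG, and the
-- tablespace's CURRENT_VERSION moves ahead of the old version.
ALTER TABLE PS_SAMPLE_TBL
  ALTER COLUMN DESCR SET DATA TYPE VARCHAR(100);
```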

4. PIT timestamp used in the Build job must be greater than the highest START_RBA for a given DBNAME and ICDATE. I use this SQL query to get the value.
SELECT TIMESTAMP((MAX(START_RBA))||X'0001') + CURRENT TIMEZONE
FROM SYSIBM.SYSCOPY
WHERE
   DBNAME=<DBNAME> AND
   ICDATE=<DATE>
WITH UR;


         Phase 3: Execution of the data migration JCLs
This phase goes smoothly if we follow the checklist.

Improvements to DB2 objects' structure comparison


In my previous blog I mentioned how and why I developed an alternate process to Object Capture using DB2 Admin tool. After some discussion and based on feedback from my team, I made some improvements to this process.


               1. Source and Target comparison
The old version of the process compared Source with Target, but not vice versa, so the job had to be submitted again with the Source and Target names swapped. There can be some objects present in Target but not in Source, and these were getting missed.
          
              Improvement: The new version of the process compares Source with Target and vice-versa.
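The two-way check can be sketched like this (a simplified illustration, not the production routine; the SRC. and TGT. stems are assumed to hold the object names, with counts in SRC.0 and TGT.0):

```rexx
TWO_WAY_COMPARE:
  DO I = 1 TO SRC.0                   /* Source -> Target direction */
    FOUND = 0
    DO J = 1 TO TGT.0
      IF SRC.I = TGT.J THEN FOUND = 1
    END
    IF FOUND = 0 THEN SAY SRC.I 'PRESENT IN SOURCE, NOT IN TARGET'
  END
  DO J = 1 TO TGT.0                   /* Target -> Source direction */
    FOUND = 0
    DO I = 1 TO SRC.0
      IF TGT.J = SRC.I THEN FOUND = 1
    END
    IF FOUND = 0 THEN SAY TGT.J 'PRESENT IN TARGET, NOT IN SOURCE'
  END
RETURN
```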


               2. Tablespaces and Columns comparison
The old version of the process just tried to find whether the Tablespaces and Columns existed in Target or not (the same is done for Tables and Indexes). This may not be enough in our ERP environment, as we can have versioned Tablespaces and Columns in a different sequence (due to something like patches applied in DEV but not yet in QA and Prod).

            Improvement: The new version of the process compares the following for Tablespaces and Columns.
                               For Tablespaces:
                               a. Name
                               b. OLDEST_VERSION and CURRENT_VERSION
                               c. Type (UTS or non-UTS)

                               For Columns:
                               a. Name
                               b. Sequence number
                               c. Data type
                               d. Length and Scale
                               e. Nulls


                3. Storing the structures comparison report for historical purpose
The old version of the process wrote the comparison report to a Sequential dataset. Referring to the comparison report is helpful (and needed) in our ERP world whenever a new upgrade or patch is applied. But it is cumbersome to dig through datasets for historical reference.

            Improvement: The new version of the process loads the comparison report into a DB2 table. It will be easy to query a DB2 Table to get the comparison details and is a better approach for storing the reports for historical purposes. The DB2 table has the following columns.
               i.    Source DB
               ii.   Target DB
               iii.  Object type (TS/TB/IX/CL)
               iv.   Object name
               v.    Table name (only for columns)
               vi.   Present in Source (Y/N) – for TS/TB/IX/CL
               vii.  Present in Target (Y/N) – for TS/TB/IX/CL
               viii. Type mismatch (Y/N) – only for TS
               ix.   Version mismatch (Y/N) – only for TS
               x.    Sequence mismatch (Y/N) – only for cols
               xi.   Data type mismatch (Y/N) – only for cols
               xii.  Length and Scale mismatch (Y/N) – only for cols
               xiii. Nulls mismatch (Y/N) – only for cols
               xiv.  Row entry time stamp
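A minimal sketch of such a report table could look like the below (the table and column names here are illustrative, not our actual DDL):

```sql
-- Illustrative DDL only; actual names, lengths and database/tablespace
-- placement differ in our shop.
CREATE TABLE OBJCOMP_REPORT
  (SOURCE_DB        CHAR(8)      NOT NULL
  ,TARGET_DB        CHAR(8)      NOT NULL
  ,OBJECT_TYPE      CHAR(2)      NOT NULL   -- TS/TB/IX/CL
  ,OBJECT_NAME      VARCHAR(128) NOT NULL
  ,TABLE_NAME       VARCHAR(128)            -- only for columns
  ,IN_SOURCE        CHAR(1)      NOT NULL   -- Y/N
  ,IN_TARGET        CHAR(1)      NOT NULL   -- Y/N
  ,TYPE_MISMATCH    CHAR(1)                 -- only for TS
  ,VERSION_MISMATCH CHAR(1)                 -- only for TS
  ,SEQ_MISMATCH     CHAR(1)                 -- only for columns
  ,DTYPE_MISMATCH   CHAR(1)                 -- only for columns
  ,LENGTH_MISMATCH  CHAR(1)                 -- only for columns
  ,NULLS_MISMATCH   CHAR(1)                 -- only for columns
  ,ROW_ENTRY_TS     TIMESTAMP    NOT NULL WITH DEFAULT);
```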

Apr 8, 2017

DB2 objects' structure comparison using ISRSUPC


My team does frequent data migrations across DB2 sub-systems for our ERP application. We use the Redirected Recovery component of Recovery Expert for z/OS for data migrations. It's an accepted situation that the objects' structures in the source sub-system will not be in sync with those in the target sub-system. But, data migration jobs will fail if we try to migrate data for not-in-sync objects.

Recovery Expert doesn't have an in-built object structure comparison capability. So, we use the DB2 Object Comparison Tool present in the DB2 Administration Tool menu for comparing the objects' structures before data migration. The activities involved in generating the report of mismatches using the Tool are:
                - Specify compare source (new)
                - Specify compare target (old)
                - Specify compare masks       
                - Specify ignores             
                - Generate compare job           

The Tool performs fine. But I didn't like going through this whole process in the DB2 Object Comparison Tool and felt that it's complicated and time consuming. I wanted to get the same results in less time and with less prep work. So, I tried and built a new process for Objects' structure comparison using REXX.

The valid question is: do we really need an alternate process when there is a sophisticated Tool available already? My answer is: to keep it simple *GRIN*



The first hurdle I faced in developing the new process was handling cross sub-system connections.
As we do cross sub-system data migrations, the Source and Target databases can be on different sub-systems. Location transparency using three-part names is not enabled in our Mainframe setup. And, REXX can't hold connections to multiple DB2 sub-systems at the same time. So, I need to open a connection to one DB2 sub-system, do the processing, close the connection, and then open a new connection to the other DB2 sub-system.
I coded separate PARAs in the REXX routine to handle the cross connections.
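The connect/disconnect sequence can be sketched with the DSNREXX interface as below (the sub-system IDs are placeholders and the fetch logic is elided):

```rexx
/* Make the DSNREXX host command environment available */
'SUBCOM DSNREXX'
IF RC THEN S_RC = RXSUBCOM('ADD','DSNREXX','DSNREXX')

ADDRESS DSNREXX 'CONNECT' 'EGP1'    /* source sub-system (placeholder) */
/* ... EXECSQL queries to load Source TS/TB/IX/Column stems ...        */
ADDRESS DSNREXX 'DISCONNECT'

ADDRESS DSNREXX 'CONNECT' 'EGQ1'    /* target sub-system (placeholder) */
/* ... EXECSQL queries to load Target TS/TB/IX/Column stems ...        */
ADDRESS DSNREXX 'DISCONNECT'
```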


The second hurdle I faced in developing the new process was the performance of the batch job that executes the REXX routine.
The first version of my REXX routine worked as below (in this exact order).
          1. REXX routine reads input (Source and Target database names)
          2. Connect to Source DB2 sub-system
          3. Get TS, TB, IX, Columns details into arrays for the Source Database
          4. Close connection to Source DB2 sub-system
          5. Connect to Target DB2 sub-system
          6. Get TS, TB, IX, Columns details into arrays for the Target Database
          7. Close connection to Target DB2 sub-system
          8. Compare TS' details and write mismatches to output
          9. Compare TB' details and write mismatches to output
          10. Compare IX' details and write mismatches to output
          11. Compare Columns' details and write mismatches to output

I coded simple DO loops in the COMPARE_PARA, which worked fast for the TS, TB & IX comparisons, but performance took a heavy blow for the Columns' comparison (there are 50,000+ Columns in every database). We have regular REXX and System REXX (which come by default with the z/OS installation), but we don't have a REXX compiler, which could have helped in making REXX execution faster. The job started taking hours to complete, which is not at all acceptable.


After going through the great suggestions I got for my posts on LinkedIn, I re-coded my REXX routine, replacing the DO loops in the COMPARE_PARA with a call to ISRSUPC (SuperC). The second version of my REXX routine has the below new steps included in it.
         1. Allocate NEWDD, OLDDD, OUTDD datasets
         2. Read the TS' details of Source database into NEWDD file
         3. Read the TS' details of Target database into OLDDD file
         4. Set the PARM value to DELTAL,LINECMP
         5. Execute the below statement
             ADDRESS ISPEXEC "SELECT PGM(ISRSUPC) PARM("PARMS")"
         6. Process OUTDD dataset to get details of mismatches. 
         7. Free NEWDD, OLDDD, OUTDD datasets.

The same order is followed for the TB, IX and Columns details comparison.
ISRSUPC did the magic: the job that took hours to complete started completing in minutes.
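Putting the steps above together, the ISRSUPC call looks roughly like this (a sketch; the dataset names are placeholders):

```rexx
/* Allocate the two listings to compare and the delta output      */
"ALLOC FI(NEWDD) DA('HLQ.SRC.TSLIST') SHR REUSE"
"ALLOC FI(OLDDD) DA('HLQ.TGT.TSLIST') SHR REUSE"
"ALLOC FI(OUTDD) DA('HLQ.SUPC.OUTP') SHR REUSE"
PARMS = "DELTAL,LINECMP"          /* delta listing, line compare  */
ADDRESS ISPEXEC "SELECT PGM(ISRSUPC) PARM("PARMS")"
/* OUTDD now holds the SuperC delta listing; read it back in and  */
/* pick out the mismatched lines                                  */
"EXECIO * DISKR OUTDD (STEM OUT. FINIS"
"FREE FI(NEWDD OLDDD OUTDD)"
```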


The JCL used for the REXX routine execution is below.
The JOBLIB datasets, input Database names and the SYSUT2 dataset name change based on requirement.
//PDB00000 JOB (0000,00000000),'OBJECT COMPARE',
//         CLASS=E,MSGCLASS=X,NOTIFY=&SYSUID,REGION=0M
//JOBLIB   DD  DSN=EGP1X.SDSNEXIT,DISP=SHR
//         DD  DSN=DSNEGP1.SDSNLOAD,DISP=SHR
//         DD  DSN=EGQ1X.SDSNEXIT,DISP=SHR
//         DD  DSN=DSNEGQ1.SDSNLOAD,DISP=SHR
//STEP010  EXEC PGM=IKJEFT01,DYNAMNBR=10
//ISPMLIB  DD  DISP=SHR,DSN=ISP.SISPMENU
//         DD  DISP=SHR,DSN=ISF.SISFMLIB
//ISPSLIB  DD  DISP=SHR,DSN=ISP.SISPSENU
//         DD  DISP=SHR,DSN=ISP.SISPSLIB
//ISPTLIB  DD  DISP=SHR,DSN=ISP.SISPTENU
//         DD  DISP=SHR,DSN=ISF.SISFTLIB
//ISPPLIB  DD  DISP=SHR,DSN=ISP.SISPPENU
//         DD  DISP=SHR,DSN=ISF.SISFPLIB
//ISPLOG   DD  DCB=(RECFM=VA,LRECL=125,BLKSIZE=129),SYSOUT=*
//ISPLIST  DD  DCB=(RECFM=FBA,LRECL=121,BLKSIZE=1210),SYSOUT=*
//ISPPROF  DD  RECFM=FB,LRECL=80,SPACE=(TRK,(2,2,2))
//SYSPROC  DD  DISP=SHR,DSN=RFO7936.PROCAUTO.OBJCOMP
//SYSTSPRT DD  SYSOUT=*
//SYSOUT   DD  SYSOUT=*
//SYSTSIN  DD  *
 ISPSTART CMD(%RXOBJCMP LPALIF LQALTAX)
//SYSUT1   DD  DUMMY
//SYSUT2   DD  DSN=RFO7936.SUPC.OUTP,
//             DISP=(NEW,CATLG,DELETE),UNIT=DISK,
//             DCB=(RECFM=FB,LRECL=80,BLKSIZE=3600),
//             SPACE=(CYL,(5,5),RLSE)
//*

Finally, I'm getting the desired output within minutes by submitting just one JCL *WINK*