The work with active jobs command (WRKACTJOB) is one of the most frequently used and widely known commands on the iSeries, yet few of us have taken the time to discover all the benefits it has to offer. Worse still, we often use it impulsively without realizing the tremendous impact it has on the system.
Why the WRKACTJOB command eats up CPU
It is important to understand just what the WRKACTJOB command goes through each time you execute it. A great way to visualize this -- at least at its most basic level -- is to use the work management APIs as an analogy. To just list the names of the active subsystems and jobs, a program would have to do the following:
1. Execute the list active subsystems API (QWCLASBS) to produce a list of all the subsystems currently active on the system.
2. Call the list job API (QUSLJOB) to produce a list of all active jobs.
3. For each entry in the list produced in step 2, call the retrieve job information API (QUSRJOBI) to retrieve specific information on that job.
And that is just the basics. When you factor in the CPU utilization figures, job and subsystem status indicators, and all the other things the WRKACTJOB command "touches," you have a command that eats up the CPU every time you run it or hit the refresh key (F5).
That may not cause much of an impact on a huge system, but on a heavily used small to medium-sized system, it can essentially halt the system momentarily while it is collecting all its data.
How to make your own "more efficient" version of WRKACTJOB
It is unfortunate that the most common use of the WRKACTJOB command is to determine which job is devouring system resources when the system bogs down. Unfortunate not because the command doesn't do a good job at locating runaway jobs -- it does -- but because it offers so much more. But as long as we are going to use it to spot errant jobs, we should at least use it wisely.
Here is a great way to improve both the efficiency of the WRKACTJOB command and your own efficiency in getting data from it. The first step is to create a duplicate of the command object in a library that is on your (and the system operator's) library list (we use QGPL). Use the following command:
CRTDUPOBJ OBJ(WRKACTJOB) FROMLIB(QSYS) OBJTYPE(*CMD) TOLIB(QGPL) NEWOBJ(FINDBADJOB)
This will create a copy of the WRKACTJOB command called FINDBADJOB. The duplicate will be placed in library QGPL. Now, to customize the new command to fit our own needs, we can use the following command or something similar:
CHGCMDDFT CMD(FINDBADJOB) NEWDFT('RESET(*YES) SBS(QBATCH) CPUPCTLMT(40) SEQ(*CPUPCT)')
The CHGCMDDFT command lets you modify default values of command parameters. In the example above, we are specifying that we want the statistics reset so we are not looking at old data.
We also know from experience that most runaway jobs are batch jobs (running in QBATCH) and consume more than 40% of the CPU. We set the SBS parameter to QBATCH; this will show us only jobs that are running in QBATCH. We also set the CPUPCTLMT parameter to 40, thus showing us only those jobs consuming 40% or more of the CPU. You may want to change those two values to reflect your system's configuration. For example, if you have more than one batch subsystem, you will want to leave the original default value of *ALL for the SBS parameter intact.
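For a system with several batch subsystems, one way to follow that advice is simply to leave SBS out of the NEWDFT string so that it keeps its original default of *ALL. This is a minimal sketch, assuming the FINDBADJOB duplicate created above is in a library on your library list. The 40 percent limit is only the example value used in this article.

```cl
/* Leave SBS at its original default (*ALL); change only the other three parameters */
CHGCMDDFT CMD(FINDBADJOB) NEWDFT('RESET(*YES) CPUPCTLMT(40) SEQ(*CPUPCT)')
```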
Finally, for good measure, we set the SEQ parameter to *CPUPCT. This will sort the resulting list by CPU percent in descending order. In this example it is hardly necessary: because every listed job must be consuming at least 40% of the CPU, no more than two jobs can ever appear at once. It is shown here for illustration purposes only.
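Before changing any defaults, it may help to try the same values directly on the original command and adjust them until the list looks right. The following is only a sketch using the example values from this article. QBATCH and 40 are assumptions about your system, not recommendations.

```cl
/* One-off run with the same parameter values; no command defaults are changed */
WRKACTJOB SBS(QBATCH) CPUPCTLMT(40) SEQ(*CPUPCT) RESET(*YES)
```

Once the values suit your workload, they can be moved into the NEWDFT string of the CHGCMDDFT command.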
Figure 1 shows an example of the display resulting from the FINDBADJOB command. The "99.9" shown for the system CPU% (upper left corner) is a clear indication that one or more jobs may be caught in an infinite loop. The 74.0 shown for the CPU% of the appropriately named BAD_JOB job pinpoints the suspected runaway job. Note: When the screen is first displayed, no jobs are listed because the statistics have been reset to zero. You must press F5 (Refresh) after a few seconds to update the statistics (note the "00:00:08" for the elapsed time at the top center of the screen).
Figure 1: Example of the display resulting from the FINDBADJOB command.
Taking option 10 next to the suspect job will display the call stack, which will allow us to determine if the job or program is looping (note that the refresh key on the Display Call Stack display is F10, not F5). If we determine that the program is looping and needs to be ended, we can use option 4 to end it. Tip: After typing the number 4 next to the runaway job, press F4 to prompt the ENDJOB command and change the OPTION parameter (i.e., How to end) from *CNTRLD to *IMMED.
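The same request can also be entered directly on a command line. This is a sketch, assuming the runaway job is BAD_JOB from the example display. The job number 123456 and the user QPGMR are hypothetical placeholders, so substitute the values shown for your own job.

```cl
/* End the job immediately instead of in a controlled manner */
/* 123456 and QPGMR are placeholder values, not taken from the example display */
ENDJOB JOB(123456/QPGMR/BAD_JOB) OPTION(*IMMED)
```

Ending a job immediately can leave partially updated data behind, so it is worth confirming from the call stack that the job really is looping first.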
More on the WRKACTJOB command
FINDBADJOB is just one important use of the WRKACTJOB command. In the next installment we'll look at how to use this old workhorse of a command as an aid in tuning both the system and individual jobs and programs.
About the author: Ron Turull is editor of Inside Version 5. He has more than 20 years of experience programming for and managing AS/400-iSeries systems.