Today we're obsessed with getting new techniques into our programs as fast as we can. They're cool and neat, we...
have to do it. On the IBM i, programmers are migrating to using only SQL in their programming – the downside is that if you want the system to handle like a Ferrari, you need to talk to the gas pedal directly.
What raw RPG programming has going for it is the speed in which it can read a file. Millions of records can be accessed in a short period of time using simple methods we all grew to know in the old days. I'd like to help you understand how operations work within an RPG program so you can decide which method you need to use.
Primary file programs are fast to program -- you can do wonderful things with a five line program. For example, to do the following task "Read all records in TESTFILE, and copy all records with an "A" in FIELD5 into "NEWFILE" (assuming all fields in NEWFILE are exactly the same) would look like:
The brevity of this code is due to the built in functionality of the "Primary" file from yesteryear that allows RPG to handle a great deal of data all at once. Also note that this file is being read via the order of appearance of the records, not where they are located per the index of the file.
Speeding through data with indexing
The advantage of the built in functionality is that RPG can handle data faster than other options. We can position randomly in the file using the powerful built-in indexing. This OS level functionality gives the ability to randomly traverse the indexes and deliver data quickly. It's this sort of index skipping that allows you to position to areas in the file you'd rather be in – kind of like lifting the needle of a record player (for you old folks). You must also strategically create your keys (or indexes) for the file so you can utilize them as many times as you can. Who wants to create an index for a single purpose? It's why I discourage "select/omit" indexes for single purpose reports or functions.
Looking at the following program, you'll see where the speed can be lost – after the explanation of how it's working. We'd like to read through 1,000 records relating to the key "ATSALOT".
In this case, the system is reading from an index of ascending values for Field 2. Using the "CHAIN" operation – the system literally says "I only want one!" so the buffer is only grabbing one record. The iSeries has "SMART" caching where the system predicts the part of the file you are working on, and using "CHAIN" causes the system to only get one record at a time. CHAIN is a combination of "SETLL" or set lower limits, and "READEQ" or get one record that is equal to that specific key for the record that is found at that specific index position. This isn't going to win races when trying to read a mass amount of records (say 10,000 to 100,000 and beyond), but is fine for loading records for a subfile or 12 to 20 records. So, what would we need to do to increase this efficiency? Well, we could use the following combination to do exactly the same thing:
Why? In this case, we positioned in the file to the first record and then used READ instead of READEQ or CHAIN. Chain and READEQ do essentially the same thing in this example. READEQ would utilize where it's positioned in the file, but here we have 1,000 consecutive numbers. There's slightly more code processing in the program, but remember I/O 100 times slower than code when processing. So even though the program is literally doing the work of the READEQ to process the records, the program is using READ. The system is now in a "Hey this person wants to read a lot of records… So, I'm going to buffer a lot of records" mode. The next time it asks for a record to process, READ will look at memory where the records may have been read.
Could this have been processed using SQL? Certainly. But SQL has it's own nuances. When I process anything using SQL, I make sure to utilize the indexes properly, or else the system generates those indexes on the fly after every use. Also make sure that you've created your least complex index first. This way, if the system can utilize the index from the initial less complex index, the system will use less storage for the index and find records quicker, etc.
So, what does this teach us? Not all programs are as they seem. When you look at older programs, and you see the above technique, think twice before re-writing to use SQL for the I/O, unless you need to have late binding (i.e. to avoid level problems).
ABOUT THE AUTHOR: Andrew Borts is webmaster at United Auto Insurance Group in North Miami, Fla. He is a frequent speaker at COMMON and is past president of The Southern National Users Group, an iSeries-AS/400 user group based in Deerfield Beach, Fla.