Problem solve Get help with specific problems with your technologies, process and projects.

In search of a better Lookup?

Here's an easier way to do RPG opcode lookup searches.

Imagine you are looking up the word "slow" in a dictionary. You start at the first word on the first page and begin...

checking sequentially: a, aardvark, aardwolf... Eventually, after a few days, you find the word you're looking for. Sounds crazy? Yet this is exactly how the RPG opcode LOOKUP searches a sorted array! (Prior to V5R1, that is.)

Is there a better approach? For an unsorted array, there is no alternative to plodding through the array, an element at a time. But for a sorted array, a much faster technique known as binary search can be used. In dictionary terms, this involves opening the dictionary in the middle and comparing the search word to the middle word. If you've got a hit, the search is over. Failing that, if the search word is greater than the middle word, open the dictionary 3/4 of the way through, otherwise 1/4 of the way. Continue halving the search interval in this way until you get a match.

For a typical dictionary, with, say, 100000 entries, binary search entails at most 17 comparisons, versus 99999 for a serial search. (A person can often find a word even faster than this, by making use of the known distribution of words in an English dictionary. This type of extra information is not always available to a program.)

Although binary search is conceptually a simple algorithm, coding a bulletproof implementation of it is surprisingly tricky. (Try it if you don't believe me!) Fortunately, we don't need to reinvent the wheel; a robust implementation is already available in the C runtime library. This is available to ILE RPG programmers, even if the C compiler is not loaded. Simply specify binding directory QC2LE. This includes service program QC2UTIL1, which contains procedures qsort and bsearch.

Prior to calling binary search, the program below invokes the QuickSort function from the C runtime library. QuickSort offers a number of advantages over the SORTA opcode. Firstly, it can sort part of an array. This is especially useful when you don't know at compile-time how many elements your array will contain, and avoids the need for setting unused elements to *HIVAL, or similar trickery. It can also be used to sort a dynamically allocated array that does not have all its defined elements allocated. Finally, we can specify a user-defined collating sequence; see below.

In order to use QuickSort and binary search, we must pass the information needed to access the array: its address, number of elements and element size in bytes. We also pass a pointer to a user-written "compare" function. QuickSort and binary search invoke this to define a collating sequence for the array elements and the search item. Note that all knowledge of the type and format of the array elements is encapsulated in this function. The function could perform a simple hex comparison, as in function Compare, below. Or it could use a case-insensitive comparison. The concept is infinitely flexible. Note that QuickSort and binary search must use the same (or equivalent) comparison function.

We must also pass binary search the address of the search item itself. (The search item should be the same length as an element of the array.) Binary search returns a pointer to the first matching array element, or a null pointer if there is no match. To calculate the matching array index requires some simple pointer arithmetic. In the example below, the search index is set to the first matching array index, or zero on no match.

So when should binary search be preferred to an RPG lookup? For arrays smaller than, say, 50 or 100 elements, a simple lookup is faster and easier. For large arrays that are searched frequently, binary search is much, much faster. (Think of the dictionary analogy.) I have used it on a 2500 element array, and found it roughly 20 times faster than a lookup. If you are searching such an array many times, binary search is definitely the way to go. And after all, once you have used binary search once, it's easy to incorporate into other programs.

Finally, V5R1 RPG offers the %Lookup BIF, which will use the binary search algorithm on all or part of a sorted array. However, the SORTA opcode can still sort only the whole of an array, not just the first few elements. So the C library functions offer more flexibility, but SORTA and %Lookup are undoubtedly simpler, where they are applicable.


Binary search example:

 
H Dftactgrp(*NO)
H BndDir('QC2LE')

 * Prototype for C runtime library function QuickSort:
D SortA           pr                  Extproc('qsort')
D  ArrayPtr                       *   Value
D  ElemCount                    10u 0 Value
D  ElemSize                     10u 0 Value
D  ComparePtr                     *   ProcPtr Value

 * Prototype for C runtime library function binary search:
D Lookup          pr              *   ExtProc('bsearch')
D  SearchItemPtr                  *   Value
D  ArrayPtr                       *   Value
D  ElemCount                    10u 0 Value
D  ElemSize                     10u 0 Value
D  ComparePtr                     *   ProcPtr Value

 * Prototype for array element comparison function,
 * used by both QuickSort and binary search:
D Compare         pr            10i 0
D  Element1Ptr                    *   Value
D  Element2Ptr                    *   Value

D TestArray       s             15    Dim(1000)
D ActiveElements  s             10u 0
D SearchItem      s                   Like(TestArray)
D SearchPtr       s               *
D SearchIndex     s             10u 0
 * Load the first 
 
   elements of TestArray.
 * Note: ActiveElements is set within LoadArray.
 * (Must ensure 0 <= ActiveElements <= 1000.)
C***                Exsr      LoadArray

 * Sort the active elements of the array. (First-time only!)
C                   CallP     SortA ( %Addr(TestArray) :
C                                     ActiveElements :
C                                     %Size(TestArray) :
C                                     %PAddr('COMPARE') )
 * Call binary search.
C                   Eval      SearchItem = 'Find this'
C                   Eval      SearchPtr =
C                             Lookup ( %Addr(SearchItem) :
C                                      %Addr(TestArray) :
C                                      ActiveElements :
C                                      %Size(TestArray) :
C                                      %PAddr('COMPARE') )
 * Derive search index from pointer to first matching element.
C                   If        SearchPtr = *NULL
C                   Eval      SearchIndex = 0
C                   Else
C                   Eval      SearchIndex = 1 +
C                                ((SearchPtr - %Addr(TestArray)) /
C                                 %Size(TestArray))
C                   Endif
C                   Eval      *INLR = *ON
 *--------------------------------------------------------------
 * Array element comparison function: ascending sequence.
 *--------------------------------------------------------------
P Compare         b
D Compare         pi            10i 0
D  Element1Ptr                    *   Value
D  Element2Ptr                    *   Value
D Element1        s                   Based(Element1Ptr) Like(TestArray)
D Element2        s                   Based(Element2Ptr) Like(TestArray)
C                   Select
C                   When      Element1 < Element2
C                   Return    -1
C                   When      Element1 > Element2
C                   Return    1
C                   Other
C                   Return    0
C                   Endsl
P Compare         e 
 

==================================
MORE INFORMATION ON THIS TOPIC
==================================

The Best Web Links: tips, tutorials and more.

Ask your programming questions--or help out your peers by answering them--in our live discussion forums.

Ask the Experts yourself: Our application development gurus are waiting to answer your programming questions.

This was last published in February 2002

Dig Deeper on iSeries CL programming

Start the conversation

Send me notifications when other members comment.

Please create a username to comment.

-ADS BY GOOGLE

SearchDataCenter

Close