Imagine you are looking up the word "slow" in a dictionary. You start at the first word on the first page and begin...
checking sequentially: a, aardvark, aardwolf... Eventually, after a few days, you find the word you're looking for. Sounds crazy? Yet this is exactly how the RPG opcode LOOKUP searches a sorted array! (Prior to V5R1, that is.)
Is there a better approach? For an unsorted array, there is no alternative to plodding through the array, an element at a time. But for a sorted array, a much faster technique known as binary search can be used. In dictionary terms, this involves opening the dictionary in the middle and comparing the search word to the middle word. If you've got a hit, the search is over. Failing that, if the search word is greater than the middle word, open the dictionary 3/4 of the way through, otherwise 1/4 of the way. Continue halving the search interval in this way until you get a match.
For a typical dictionary, with, say, 100000 entries, binary search entails at most 17 comparisons, versus 99999 for a serial search. (A person can often find a word even faster than this, by making use of the known distribution of words in an English dictionary. This type of extra information is not always available to a program.)
Although binary search is conceptually a simple algorithm, coding a bulletproof implementation of it is surprisingly tricky. (Try it if you don't believe me!) Fortunately, we don't need to reinvent the wheel; a robust implementation is already available in the C runtime library. This is available to ILE RPG programmers, even if the C compiler is not loaded. Simply specify binding directory QC2LE. This includes service program QC2UTIL1, which contains procedures qsort and bsearch.
Prior to calling binary search, the program below invokes the QuickSort function from the C runtime library. QuickSort offers a number of advantages over the SORTA opcode. Firstly, it can sort part of an array. This is especially useful when you don't know at compile-time how many elements your array will contain, and avoids the need for setting unused elements to *HIVAL, or similar trickery. It can also be used to sort a dynamically allocated array that does not have all its defined elements allocated. Finally, we can specify a user-defined collating sequence; see below.
In order to use QuickSort and binary search, we must pass the information needed to access the array: its address, number of elements and element size in bytes. We also pass a pointer to a user-written "compare" function. QuickSort and binary search invoke this to define a collating sequence for the array elements and the search item. Note that all knowledge of the type and format of the array elements is encapsulated in this function. The function could perform a simple hex comparison, as in function Compare, below. Or it could use a case-insensitive comparison. The concept is infinitely flexible. Note that QuickSort and binary search must use the same (or equivalent) comparison function.
We must also pass binary search the address of the search item itself. (The search item should be the same length as an element of the array.) Binary search returns a pointer to the first matching array element, or a null pointer if there is no match. To calculate the matching array index requires some simple pointer arithmetic. In the example below, the search index is set to the first matching array index, or zero on no match.
So when should binary search be preferred to an RPG lookup? For arrays smaller than, say, 50 or 100 elements, a simple lookup is faster and easier. For large arrays that are searched frequently, binary search is much, much faster. (Think of the dictionary analogy.) I have used it on a 2500 element array, and found it roughly 20 times faster than a lookup. If you are searching such an array many times, binary search is definitely the way to go. And after all, once you have used binary search once, it's easy to incorporate into other programs.
Finally, V5R1 RPG offers the %Lookup BIF, which will use the binary search algorithm on all or part of a sorted array. However, the SORTA opcode can still sort only the whole of an array, not just the first few elements. So the C library functions offer more flexibility, but SORTA and %Lookup are undoubtedly simpler, where they are applicable.
Binary search example:
H Dftactgrp(*NO) H BndDir('QC2LE') * Prototype for C runtime library function QuickSort: D SortA pr Extproc('qsort') D ArrayPtr * Value D ElemCount 10u 0 Value D ElemSize 10u 0 Value D ComparePtr * ProcPtr Value * Prototype for C runtime library function binary search: D Lookup pr * ExtProc('bsearch') D SearchItemPtr * Value D ArrayPtr * Value D ElemCount 10u 0 Value D ElemSize 10u 0 Value D ComparePtr * ProcPtr Value * Prototype for array element comparison function, * used by both QuickSort and binary search: D Compare pr 10i 0 D Element1Ptr * Value D Element2Ptr * Value D TestArray s 15 Dim(1000) D ActiveElements s 10u 0 D SearchItem s Like(TestArray) D SearchPtr s * D SearchIndex s 10u 0 * Load the first
elements of TestArray. * Note: ActiveElements is set within LoadArray. * (Must ensure 0 <= ActiveElements <= 1000.) C*** Exsr LoadArray * Sort the active elements of the array. (First-time only!) C CallP SortA ( %Addr(TestArray) : C ActiveElements : C %Size(TestArray) : C %PAddr('COMPARE') ) * Call binary search. C Eval SearchItem = 'Find this' C Eval SearchPtr = C Lookup ( %Addr(SearchItem) : C %Addr(TestArray) : C ActiveElements : C %Size(TestArray) : C %PAddr('COMPARE') ) * Derive search index from pointer to first matching element. C If SearchPtr = *NULL C Eval SearchIndex = 0 C Else C Eval SearchIndex = 1 + C ((SearchPtr - %Addr(TestArray)) / C %Size(TestArray)) C Endif C Eval *INLR = *ON *-------------------------------------------------------------- * Array element comparison function: ascending sequence. *-------------------------------------------------------------- P Compare b D Compare pi 10i 0 D Element1Ptr * Value D Element2Ptr * Value D Element1 s Based(Element1Ptr) Like(TestArray) D Element2 s Based(Element2Ptr) Like(TestArray) C Select C When Element1 < Element2 C Return -1 C When Element1 > Element2 C Return 1 C Other C Return 0 C Endsl P Compare e
MORE INFORMATION ON THIS TOPIC
The Best Web Links: tips, tutorials and more.
Ask your programming questions--or help out your peers by answering them--in our live discussion forums.
Ask the Experts yourself: Our application development gurus are waiting to answer your programming questions.