John Kohan
Recently, I was given the task of matching customer addresses during order entry. Order entry at my company is achieved via EDI, Web site and traditional hand keying. What made this a challenge was the different ways people enter addresses. Some enter abbreviations, while others spell out the word. In addition, because there are multiple lines for the address, the first line may contain the street number or a comment.
As I began to review some addresses we had already received, it became apparent that matching on the name was also impossible. Names are spelled in too many different ways, and the name has no bearing on whether two addresses are the same physical address.
The first step I took was to make sure the address lines were what I expected. Therefore, I removed all leading blanks. Once this was complete, I converted the address lines from lower case to upper case.
     C           1         DO   4         X       30
     C           ' '       CHECK@ADD,X:1  N       20
     C                     SUBSTADD,X:N   ADD,X     P
     C                     ENDDO
     C           LC:UC     XLATE@ADD,X    @WORK
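For readers who do not work in RPG/400, the same clean-up step can be sketched in Python. This is an illustration only, not the author's code: the function name `clean_lines` and the sample data are hypothetical, and it assumes the address arrives as a list of up to four text lines.

```python
def clean_lines(lines):
    """Remove leading blanks from each address line, then upper-case it."""
    return [line.lstrip(" ").upper() for line in lines]


# Hypothetical four-line address as it might arrive from order entry.
raw = ["  attn: receiving", " 123 main street", "", "po 4471"]
cleaned = clean_lines(raw)
print(cleaned)
```

Upper-casing after trimming mirrors the order described above, although the two steps are independent and could run in either order.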
Next, I looked at which line of the address contained the street address. I needed this because some customers may place purchase order numbers, comments or other non-location reference information in the multiple lines of the address. I did this by searching for a number in the first position.
     C           1         DO   4         X       30
     C                     MOVELADD,X     @FRST   1
     C           NBRCON    CHECK@FRST                    70
     C           *IN70     IFEQ *OFF
     C                     LEAVE
     C                     ENDIF
     C                     ENDDO
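The search for the street line can be sketched in Python as follows. The function name `find_street_line` is hypothetical, and the sketch assumes the lines have already been stripped of leading blanks, so that a street number, if present, sits in the first position.

```python
DIGITS = "0123456789"


def find_street_line(lines):
    """Return the index of the first line that starts with a digit, or None."""
    for index, line in enumerate(lines):
        # An empty line has no first character, so check for that first.
        if line and line[0] in DIGITS:
            return index
    return None


lines = ["ATTN: RECEIVING", "123 MAIN STREET", "", "PO 4471"]
print(find_street_line(lines))
```

Returning `None` when no line starts with a digit leaves it to the caller to decide how to handle an address with no recognizable street line, a case the listing above does not address explicitly.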
Having located the street line among the four available address lines, I needed to extract its words and normalize them where possible. First, I found each of the words in the line. Then I looked each word up in a database file that stores substitute phrases, such as "ST" for "STREET".
     C** Start from the first char
     C                     Z-ADD1         S       30
     C                     Z-ADD1         E       30
     C                     Z-ADD1         L       30
     C** Extract each of the words out
     C           *IN70     DOUEQ*OFF
     C           ' '       SCAN ADD,Z:S   E              70
     C           *IN70     IFEQ *ON
     C           E         SUB  S         L
     C** If we are all done, get out
     C           L         IFEQ *ZEROS
     C                     LEAVE
     C                     ENDIF
     C** Extract a part of the Address
     C           L         SUBSTADD,Z:S   @PARSE 20
     C** Setup the spacing in the name field
     C           S         IFEQ 1
     C                     Z-ADD0         X       30
     C                     ELSE
     C                     Z-ADD1         X
     C                     ENDIF
     C** See if a substitution exists
     C           @PARSE    IFNE *BLANKS
     C           @PARSE    CHAINRSUBSTUT             80
     C           *IN80     IFEQ *OFF
     C                     CAT  SSTO:X    @ADDR  60
     C                     ELSE
     C                     CAT  @PARSE:X  @ADDR
     C                     ENDIF
     C                     ENDIF
     C** Start the next search at the next char
     C           E         ADD  1         S
     C*
     C                     ENDIF
     C                     ENDDO
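The word-by-word substitution can be sketched in Python along these lines. Everything named here is an assumption for illustration: the `SUBSTITUTES` dictionary stands in for the substitution file, its entries are made up, and the direction of the mapping (long form replaced by abbreviation) is one reading of the "ST" for "STREET" example.

```python
# Hypothetical stand-in for the substitution file: word found -> word to store.
SUBSTITUTES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}


def normalize_line(line, substitutes=SUBSTITUTES):
    """Split a line into words, swap in substitutes, and rejoin with one blank."""
    words = line.split()  # splits on runs of blanks, so empty words never occur
    return " ".join(substitutes.get(word, word) for word in words)


print(normalize_line("123  MAIN STREET"))
```

Words with no entry in the table pass through unchanged, which corresponds to the branch above that concatenates the parsed word itself when the lookup finds nothing.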
The address is now normalized, so I can check whether I already have another one just like it and apply any special processing. One could argue that removing all the spaces from the address at this point would produce a better comparison. I chose not to do that because it would make little difference for my needs.
In my address file, the city, state and ZIP code fields are stored separately from the address lines. I did not need to extract them, but that could easily be done with the technique above.
The code above came from three programs I created. The example is in RPG/400, since that is the "Shop Standard". These three programs use array processing so they can be used in a modular fashion. I load the new address into the array and format it. Next, I load the address I wish to compare it to and format that one as well. Then the only part left is the actual "IF" statement to test whether they are the same.
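The load, format and compare flow can be sketched end to end in Python. This is a compact illustration under assumptions, not a port of the three programs: the function names, the `substitutes` table and the sample addresses are all hypothetical, and treating an address with no street line as never matching is a design choice made here, not something the article specifies.

```python
def normalized_street(lines, substitutes):
    """Format one address: return its normalized street line, or "" if none."""
    for line in lines:
        text = line.lstrip(" ").upper()
        if text and text[0] in "0123456789":
            return " ".join(substitutes.get(word, word) for word in text.split())
    return ""


def same_address(first, second, substitutes):
    """Format both addresses, then apply the final equality test."""
    a = normalized_street(first, substitutes)
    b = normalized_street(second, substitutes)
    return a != "" and a == b


substitutes = {"STREET": "ST"}  # hypothetical substitution table
new_order = ["attn: receiving", " 123 main street"]
on_file = ["123 Main St", "PO 4471"]
print(same_address(new_order, on_file, substitutes))
```

Keeping the formatting in one function and the comparison in another follows the modular split described above, so the same formatter serves both the incoming address and the one on file.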
Although your address needs may be different, my intention is to offer a solution that will foster thought and ultimately help you find the solution you need.
-----------------------------------------
About the author: John Kohan is a senior programmer analyst at CT Codeworks.
This was first published in May 2002
