Revisit your database naming conventions

Programming guru Paul Tuohy points out that your database naming conventions could be causing problems. Read how he says it should be done.




There is something that has been annoying me for quite a while. It is one of those niggling little things that

we have been doing for so long that we take it for granted. We do it for historical reasons and take it for granted that that is the way we have to do it. I am talking about the naming conventions we use on our databases.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

      * Field Reference File (FLDREF)
 
                R FLDREFR     
                  CUSN           8          TEXT('Customer Code')
                  NAME          30          TEXT('Supplier')
 
 
      * Customer File (CUSFIL)
                                            REF(FLDREF)
                R CUSFILR
                  CSCUSN    R               REFFLD(CUSN)
                  CSNAME    R               REFFLD(NAME)


Figure 1: Field Reference File and referencing Physical File

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Figure 1 shows a snippet of what I am talking about. We have a field reference file (FRF) and a physical file (PF) that refers to it. The field names on the FRF are a maximum of four characters long because we use a unique two character prefix on the PF to identify the file to which the field belongs. Does this look familiar? OK, maybe your database uses a suffix instead of a prefix.


Paul Tuohy

The length of the field names had to do with good old RPG III where the maximum length of a field name could not exceed six characters. We all know that RPG IV can handle the full 10-character field name and, of course, we are starting to use these in our databases.

But why the need for a unique identifier? The reason being that all fields in an RPG program are global, so we need to know which file a field came from and we need to ensure that input from one file does not accidentally overwrite input from another file -- which is what would happen if we had two files with fields of the same name.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

      * Customer File (CUSTOMER)
                                            REF(FIELDREF)
                R CUSTOMERR
                  CUSTCODE  R
                  NAME      R
 
 
      * Supplier File (SUPPLIER)
                                            REF(FIELDREF)
                R SUPPLIERR
                  SUPPCODE  R
                  NAME      R



Figure 2: DDS for two files with the same name.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Figure 2 shows the DDS for two files (a Customer File and a Supplier File), both of which contain a field called NAME.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

     FCustomer  IF   E           K Disk                                                             
     FSupplier  IF   E           K Disk                                                             
             
               Read Customer;
               Read Supplier;



Figure 3: RPG IV Program reading records with duplicate field names.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Figure 3 shows an RPG program that opens these two files and reads a record from each. When the program reads a record from the Customer file it will place a value in the NAME field, then when it reads a record from the Supplier file it will overwrite the value in the NAME field -- hence the requirement for a unique identifier on each file.

This seems like a very reasonable premise, but it is using a very large hammer to crack a very small nut and leads us into some coding practices that may cause larger problems then the one they aim to cure.

Let us start with the database. It is a relational database. It is a normalized relational database. This means that a column (field) should be defined in one table (physical file) and one table only -- the only exception being "key," fields which are used to identify relationships, e.g. an invoice number on both the invoice header and invoice detail files.

The instance shown in Figures 2 and 3 would be the result of a badly designed database. The Customer name and Supplier name are not the same entities -- although they are both names. Think of NAME as being a data type as opposed to a field and it should have been taken care of in the field reference file, as shown in Figure 4. The NAME field is defined as a "generic" name, and the actual fields (CUSNAME and SUPPNAME) refer back to it for their definitions.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

      * Field Reference File (FLDREF)
 
                R FLDREFR     
                  NAME          30          TEXT('Generic Name')
 
                  CUSNAME   R               REFFLD(NAME *SRC)
                                            TEXT('Customer Name')
                  SUPPNAME  R               REFFLD(NAME *SRC)
                                            TEXT('Supplier Name')



Figure 4: RPG IV Program reading records with duplicate field names.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

So, in an RPG program the only fields that should be "double defined" are key reference fields, and more then likely we will have the same value in them, since we use them to link between files (e.g. use a CHAIN operation to get the record from the invoice header and use a READE loop to read all corresponding records from the invoice details file). No big benefit here, from unique identifiers.

On the other hand, think of the problems that unique identifiers cause. We have all had that old chestnut during testing where we can't understand why we are getting that duplicate record error until we notice that we are writing records with a blank key field -- because we forgot to move the "parent" key field to the "child" key field. It would not have been a problem if we were using the same field name on both files.

Worse still, what about using unique field names on display files and print files -- now we have to move all the fields, not just key reference fields. You just have to remember that when you add a new field to a file and any corresponding display files, you must put in the appropriate MOVE operations as well.

There seems to be much less code when we are not using unique identifiers.

Now, there are some cases where we might have a problem with the same field name on multiple files. After all, what we do in our programs can often be weird and wonderful. The example in Figures 2 and 3 could well arise if the two files came from applications provided by different software vendors.

If you need unique field names, why not try one of the following:

  1. Define a logical file, which renames the fields that you want to use (as shown in Figure 5). This is what we should have been doing for RPG III programs!
  2. Use the PREFIX keyword on the F Spec (as shown in Figure 6). This accomplishes exactly what one was trying to achieve with unique identifiers on the database.
  3. Use a QUALIFIED data structure (requires V5R1 or later) (as shown in Figure 6).

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

      * Supplier File (SUPPLIER)
                                            REF(FIELDREF)
                R SUPPLIERR
                  SUPPCODE  R
                  NAME      R
 
      * Logical View of SUPPLIER file to rename NAME field
 
                R SUPPLIERM                 PFILE(SUPPLIER)
                  SUPPCODE
                  NEWNAME                   RENAME(NAME)



Figure 5: Renaming a field in a logical view.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

     FSupplier  IF   E           K Disk    Prefix(PT)                                               
                         
     D Record        E DS                  ExtName(Supplier)                                        
                         
            Chain Key  Supplier Record;                   



Figure 6: Renaming a field in a logical view.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

I was always taught that my design should handle what normally happens and that I should code for the exception. In RPG a clash of field names should be the exception and not the norm. By having unique identifiers on our database field names, we are treating these exceptions as if they are the norm, and that in turn forces us to have extra code in nearly all our programs to handle the unique naming.

---------------------------
About the author: Paul Tuohy is CEO of ComCon, an iSeries consulting company. He is the author of Re-Engineering RPG Legacy Applications and is one of the quoted industry experts in the IBM Redbook "Who Knew You Could Do That With RPG IV?"

==================================
MORE INFORMATION
==================================

  • Display duplicate fields command
    Search400.com member Martin Cubitt provides a tool that will compare two physical files and report common fields. This is useful, he says, when a program uses two files and you need to check which fields could potentially be overwritten in the program.
  • Processing database records
    A user asks, "If I have a database, how can I select the records I want to process based on the value in a given field in the table? I have an RPG program that processes all records in the table correctly, but I want to limit the records processed by the company field." Site expert Jim Mason helps out.
  • DB2/400 field level security
    A user asks, "I'm currently running V4R5 on a model 720. I'm using Crystal Report Writer and want to apply security in DB2/400 at the field level. For instance, I want departments to have access to their own payroll data, but no one else's. Is this possible?" Yes, says security expert Paul Jury, there are a couple ways.


This was first published in March 2004

Dig deeper on iSeries CL programming

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

SearchEnterpriseLinux

SearchDataCenter

Close