Evaluate Weigh the pros and cons of technologies, products and projects you are considering.

Denormalization of DB2 tables

As a last resort, you can increase performance by denormalizing tables.

This tip is excerpted from a longer article on the InformIT Web site.

Speeding the retrieval of data from DB2 tables is a frequent requirement for DBAs and performance analysts. One way to accomplish this is to denormalize DB2 tables. The opposite of normalization, denormalization is the process of putting one fact in many places. This speeds data retrieval at the expense of data modification. This is not necessarily a bad decision, but should be undertaken only when a completely normalized design will not perform optimally. Consider these issues before denormalizing:

  • Can the system achieve acceptable performance without denormalizing?
  • Will denormalization render the database design unusable for ad hoc queries (that is, specialized expertise required to code queries against the denormalized design)?
  • Will the performance of the system still be unacceptable after denormalizing?
  • Will the system be less reliable due to denormalization?

If the answer to any of these questions is "yes," you should not denormalize your tables because the benefit will not exceed the cost. If, after considering these issues, you decide to denormalize, there are rules you should follow.

  • If enough DASD is available, create the fully normalized tables and populate denormalized versions using the normalized tables. Access the denormalized tables in a read-only fashion. Create a controlled and scheduled population function to keep denormalized and normalized tables synchronized.
  • If sufficient DASD does not exist, maintain denormalized tables programmatically. Be sure to update each denormalized table representing the same entity at the same time; alternatively, provide a rigorous schedule whereby table updates are synchronized. If you cannot avoid inconsistent data, inform all users of the implications.
  • When updating any column that is replicated in many tables, update all copies simultaneously, or as close to simultaneously as possible given the physical constraints of your environment.
  • If denormalized tables are ever out of sync with the normalized tables, be sure to inform users that batch reports and online queries may not show up-to-date information.
  • Design the application so that it can be easily converted from denormalized tables to normalized tables.

There is only one reason to denormalize a relational design: performance. Several indicators help identify systems and tables that are candidates for denormalization. These indicators follow:

  • Many critical queries and reports rely on data from more than one table. Often these requests must be processed in an online environment.
  • Repeating groups must be processed in a group instead of individually.
  • Many calculations must be applied to one or many columns before queries successfully can be answered.
  • Tables must be accessed in different ways by different users during the same timeframe.
  • Many large, primary keys are clumsy to query and use a large amount of DASD when carried as foreign key columns in related tables.
  • Certain columns are queried a large percentage of the time. (Consider 60 percent or greater as a cautionary number flagging denormalization as an option.)

Many types of denormalized tables work around the problems caused by these indicators. Table 3.5 summarizes the types of denormalization, with a short description of when each type is useful. The sections that follow describe these denormalization types in greater detail.

Types of denormalization

Denormalization Use
Prejoined Tables When the cost of joining is prohibitive
Report Tables When specialized critical reports are needed
Mirror Tables When tables are required concurrently by two types of environments
Split Tables When distinct groups use different parts of a table
Combined Tables When one-to-one relationships exist
Redundant Data To reduce the number of table joins required
Repeating Groups To reduce I/O and (possibly) DASD
Derivable Data To eliminate calculations and algorithms
Speed Tables To support hierarchies

Read more about the different types of denormalization at InformIT.

For More Information

  • Feedback: E-mail the editor with your thoughts about this tip.
  • More tips: Hundreds of free DB2 tips and scripts.
  • Tip contest: Have a DB2 tip to offer your fellow DBAs and developers? The best tips submitted will receive a cool prize -- submit your tip today!
  • Ask the Experts: Our SQL, database design, Oracle, SQL Server, DB2, metadata, and data warehousing gurus are waiting to answer your toughest questions.
  • Forums: Ask your technical DB2 questions--or help out your peers by answering them--in our active forums.
  • Best Web Links: DB2 tips, tutorials, and scripts from around the Web.

Dig Deeper on DB2 UDB (universal databases)

Start the conversation

Send me notifications when other members comment.

Please create a username to comment.

-ADS BY GOOGLE

SearchDataCenter

Close