Why DB2 is better than VSAM

Why DB2 and why not VSAM? This article aims at answering this query and provides sufficient information that would let the designer take the right decision.

Every now and then, mainframe designers face the question of why DB2 and why not VSAM? This article aims at answering this query and provides sufficient information that enables the designer to make the right decision.

This article is split into three major sections:

  • The basic difference between a DBMS and a file management system
  • A comparison of VSAM and DB2 with respect to various features
  • The benefits of DB2 over VSAM

DBMS vs. file management systems

Basically, a file management system is that component of the overall system that manages stored files. Loosely speaking, therefore, it is "closer to the disk" than the DBMS is. In fact, the DBMS is typically built on top of some kind of file manager. Thus, the user of a file management systems will be able to create and destroy stored files and perform simple retrieval and update operations on stored records in such files.

In contrast to the DBMS, file managers:

  • are not aware of the internal structure of the stored records, and hence cannot handle requests that rely on a knowledge of that structure (such as "find all employees with salary less than $50,000").
  • provide little or no support for security and integrity rules
  • provide little or no support for recovery and concurrency controls
  • have no true data dictionary concept
  • provide much less data independence than the DBMS does

(Note: The above details are from "An introduction to Database Systems" by C.J. Date.)

DB2

DB2 is an abbreviation for IBM Database 2 and was launched in June 1983 as a subsystem on MVS that allowed MVS users to build, access, and maintain relational databases using the well known Structured Query language (SQL).

Since then, DB2 has come a long way and provides facilities to exploit the latest hardware and software technologies, accommodating a majority of user requirements. The latest versions are available on almost all platforms, including Windows, HP-UX, Sun Solaris, Unix, AIX, NUMAQ, Linux, AS/400 and OS/390.

As the name suggests, DB2 "Universal Database" provides universal data types, universal integration, universal access from clients of all types, universal applicability (for all types of applications), universal scalability (across all types of platforms), universal reliability (for non-stop 24/7 processing) and universal manageability.

The ability to manage many concurrent users, very large databases, high transaction rates and deliver consistent rapid response is fundamental and delivered by DB2 through the wide range of platforms and the exploitation of platform-specific features. Beyond this, DB2 meets the requirements for high availability, low planned maintenance, wide connectivity, open standards and effective manageability.

VSAM

VSAM is a high-performance access method used in the MVS, OS/390 and VSE/ESA operating systems. It was initially released by IBM in 1973 and is part of the Base product.

VSAM provides a number of data set types or data organization schemes. They are:

  • Key-sequenced data set (KSDS)
  • Entry-sequenced data set (ESDS)
  • Relative record data set (RRDS)
  • Variable-length relative record data set (VRRDS)
  • Linear data set (LDS)

Installations have been using VSAM data sets to hold more and more of their data to the point where many have reached the 4-gigabyte architectural limit for the size of VSAM data sets. Beginning with DFSMS V1.3, you can create and use VSAM KSDSs that can be much larger than the 4-gigabyte limit imposed on any VSAM data set defined before this release. DFSMS V1.5 allows non-KSDS file types (ESDS, RRDS, VRRDS and LDS) to exceed 4 gigabytes.

VSAM record-level sharing (RLS) was introduced to provide the value of the Parallel Sysplex to the existing applications. RLS itself does not provide transactional recovery. CICS provides a file access interface on top of VSAM. It is a CICS file control function that includes transactional recovery for VSAM files. This isolation and rollback capability enables VSAM data to be shared among CICS applications.

Comparison of DB2 and VSAM

Feature

DB2

VSAM

Hardware Independence

 PC to mainframe

 Only Mainframe

OS Independence

 NT, Unix & OS/390

 Only OS/390

Vendor Independence

 RBDMS with ANSI std.

 Only IBM

Scalability

 PC to mainframes

 Upto 4000 terabytes for LOB

 Only Mainframe

 Maximum size is 128 terabytes

Ease of development

 Standard SQL

 Stored procedure & triggers

 Not so simple

 No such option

Ease of maintenence

 Standard SQL

 Difficult

Security

 High degrees of security

 Only at Dataset level

Referential Integrity

 DB2 enforces it

 Manages even externally stored data

 Developers responsibility

 Not applicable

Query Interface

 Easy to view/modify

 Not available

Products/tool suite

 IBM & 3rd Parties

 Not available

Data Capacity

 254 times largest VSAM

 Limited to 2 terabytes

Data sharing

 Across CICS, IMS, Batch, TSO

 Very limited support

Web & Java support

 JDBC, SQLJ, Net.data

 Need custom interfaces

Distributed environment

 Consistent across platforms

Stored procs reduce network traffic

 Only Mainframe

 Not applicable

XML support

XML extenders

 Not supported

Performance

 For less data

Better for large data

Optimizer handles

Partitioning improves performance

 Better when data is less

 For less data

 Developer responsible

 No partitioning

Performance Tuning

Can be tuned anytime

Writes SMF records

Can be at SQL level

Tools available for aiding

Subsystem level tuning possible

Abundant tuning skills

 Depends on initial design

 No SMF records

 Only application level

 No tuning aids

 Not a subsystem

 Tuning skills are rare

CPU & IO parallelism

 Scanning is Faster

 No parallelism

Parallel Sysplex

 Can participate

Optimizer handles

 Can participate

 No optimization

Reorganization

 Direct reorganization

 Online reorg possible

 Parallel reorg

 Delete &  recreate

 Downtime needed

 No parallelism

Recovery

 Managed by DB2

 Always recoverable

 From log / backup

 Auto Recovery

 Parallel recovery

 Managed by CICS/IMS

 No recovery in batch

 From backup only

 Manual Restore

 No parallelism

Backup

 Online backup possible

 Incremental backup

 Parallel backup

 Downtime needed

 No incremental backup

 No parallelism

Availability

 Parallel reorg, backup

 Online reorg, backup

 Less downtime

 No online maintenance

 No parallelism

 More Downtime

Disaster Recovery

 Supported by DB2

 Part of DASD recovery

Data Archival

 Selective archival

 Selective retrieval

 Upto row level archival

 Specific Products

 No Selective archival

 No Selective retrieval

 Dataset level archival

 Dataset Migration

Personnel

 IBM & 3rd party training

 Easy to find Skills

 Reuse any RDBMS skill

 Same across platforms

 Not much training

 Scarce skill

 VSAM Specific skill

 Mainframe Specific skill

Data Warehouse

 Real time updates

 Direct Propagation

 Product suites available

 Batch updates only

 Extract & transform

 Not suitable for warehouse

Data types

 Images, Video, Audio etc

 Contents can be in file

 Text only

 No such option

Why DB2 is better than VSAM

  1. DB2 provides high level of scalability, ranging from workstation to mainframe, and is available on a wide range of platforms including Windows, HP-UX, Sun Solaris, Unix, AIX, Linux, AS/400 and OS/390. VSAM is tightly coupled with the mainframe and hence has a restricted choice of platform. In addition, DB2 is an RDBMS and so it is easier to migrate to a different RDBMS -- e.g., Oracle.
  2. DB2, being an RDBMS and following the ANSI SQL standard, has the distinct advantage of simplicity, richer functionality and ease of maintenance. With DB2, as the business process changes, modifying the database (such as adding or removing columns) is possible. Referential integrity of data across tables is taken care of by DB2 itself. Cost of maintenance of VSAM applications is more since even inserting or deleting a field at a later stage is difficult.
  3. DB2 provides a high degree of security in the sense that the unit of data that can be individually protected ranges all the way from an entire table to a specific data value at a specific row-and-column position. Different users can have different privileges on the same object. The view mechanism can be used to hide sensitive data from unauthorized users. The DBA can control access privileges by GRANT or REVOKE commands. In VSAM the security options are fairly limited and can be only at Dataset level.
  4. DB2 maintains referential integrity based on the rules specified by the application developer. IBM's DataLinks technology manages the integrity of externally stored data even when maintained by external applications. In case of VSAM, maintaining referential integrity becomes the application developers' responsibility. 5. It is easy to view or modify data with DB2 databases using facilities like QMF, SPUFI and DB2 forms. Using simple SQL, both standard as well as ad-hoc reports can be generated through QMF. DB2 also delivers significant query and client interface functions such as Net.Data, Visual Explain, DB2 installer, DB2 warehouse manager and control centre.
  5. DB2 has support of a rich suite of tools/products for the whole range of activities like administration, management, data manipulation, data replication, data warehousing, performance monitoring, archival and report generation. These products are from IBM as well as other third parties like Platinum, BMC, etc.
  6. DB2 can handle unlimited data. A DB2 V6 partitioned table can have 254 partitions, with each partition a VSAM dataset. So DB2 can be 254 times larger than the largest possible VSAM data set.
  7. DB2 enables applications to share the database across multiple environments comprising multiple CICS regions, IMS, Batch and TSO. Better concurrency support is achieved by using row-level locking. VSAM in a non-RLS environment provides limited support for the sharing of data sets. It does not provide the functions that are required to enable multiple users to update a shared data set with complete integrity. With RLS, the new VSAM function provided by DFSMS/MVS V1.3, VSAM datasets opened in RLS mode can be shared among many applications running in many CICS regions. But still VSAM couldn't be used in multiple environments like CICS, IMS/DC, Batch and TSO simultaneously.
  8. DB2 is web-enabled with built in Java support. DB2 data can be accessed from various systems using standard TCP/IP, ODBC, X/Open CLI, JDBC and SQLJ. Products like Net.Data, Web forms, Web Focus provide rapid building of Web applications. Very strong, language independent, stored procedure support (including support for Java) allows business logic to be shared across all architectures. Thus you open the world of DB2 directly to the web for any application and usage. Generally VSAM data is accessible only by local applications and accessed via 3270 terminals only. For accessing VSAM from systems other than mainframe, custom interfaces needs to be built making it both an expensive and difficult option.
  9. Being a DBMS available on various platforms, DB2 promotes a distributed, wholly interconnected distribution environment (e.g., corporate headquarters might have DB2 on OS/390 with individual branches having DB2 on NT). DB2 Universal Database is identical across OS/2, NT and UNIX -- both server and workstation platforms. The built-in Data Propagator enables data distribution, data replication and workload distribution. IBM's DB2 DataJoiner provides two-phase commit consistency across DB2 on all platforms and most commonly found 'foreign' data sources. Update-anywhere replication is a standard feature of DB2. Existing skills can be deployed and utilized throughout the enterprise thus maximizing ROI and with fewer individuals.
  10. DB2's XML Extender provides new data types that let you store XML documents in DB2 databases as character data or stored as external files but still managed by DB2. Retrieval functions allow you to retrieve either the entire XML document or individual elements or attributes. XML is the standard for data interchange for the next generation of business-to-business e-commerce solutions. With the XML Extender for DB2, it is easy to leverage critical business information in DB2 databases to engage in business-to-business solutions using XML based interchange formats.
  11. VSAM performs better than DB2 when amount of data touched by a task is less. But DB2 can perform better than VSAM when large amount of data is used by application. This is generally due to efficient data pre-fetch mechanism and buffer management. For each request from the user, DB2 optimizer chooses the efficient way to implement that request. As the user just specifies what data he wants rather than how to get that data, DB2 is able to relieve the application developer from certain level of performance tuning.
  12. DB2 has unique and mature tools, including graphical, workstation-based versions, to model, monitor and tune performance: VisualExplain, DB2 PM and OMEGAMON. Aided by these tools it is possible to do the tuning at various levels: system (bufferpool, EDM etc.), application, database or SQL level. Existing applications can be tuned to meet the growing demands of an organization: increase in the amount of the records to be processed, increase in the number of concurrent users etc. VSAM is not a subsystem and doesn't write performance records to SMF, which is used by performance monitoring tools. Performance of VSAM applications is heavily dependent on the initial design and there is very little scope of tuning later.
  13. In DB2, partitioned tables benefit from both CPU and I/O parallelism. Scanning mass amount of data from a DB2 partitioned table is faster than VSAM scanning due to the parallelism effect (assuming the system is multi-processor). VSAM cannot participate in either kind of parallelism (CPU & I/O).
  14. DB2 applications can be made to perform better by using appropriate partitioning features. Inter-query parallelization involves running many queries simultaneously. Intra-query parallelization divides potentially long-running queries into a number of components that are executed in parallel. Data partitioning across storage devices reduces the I/O bottleneck of large queries. VSAM doesn't have any option of application accessing the data in parallel.
  15. Although VSAM is enabled to participate in a parallel Sysplex complex, DB2 is more suitable because of its data sharing, data partitioning and two-phase commit capabilities. Since DB2 places the updated data in the coupling facility, systems with invalidated local copies of the data may refresh their copy from the Group Buffer Pool structure in the coupling facility rather than from DASD. So there is virtually no increase in I/O rate for a DB2 data sharing workload. The DB2 Optimizer picks the optimal amount of parallelism for queries -- including parallelism across a Parallel Sysplex.
  16. DB2 utilities are available for direct reorganization of index and data. Organization of data is an important factor affecting IO and CPU time. Online reorganization can be done in DB2 whereby there is no need for application downtime. VSAM doesn't provide any direct utility for reorganizing. Reorganization is done as a series of steps: unload, delete, create and load using IDCAMS. Reorganization of VSAM data needs downtime of the application.
  17. A DB2 table is always recoverable in CICS, IMS/DC or RRS environment on OS/390. VSAM can only be recovered in CICS or IMS/DC environments. In batch, VSAM does not log its updates and therefore does not provide a facility to back out of changes. So to recover your VSAM data in batch, you must restore it back to the last backup.
  18. DB2 has a recovery utility. Data can be recovered from log, backup copy or both. Recovery is better managed and faster, as backup data is also maintained by DB2 and taken in DB2 format. In DB2, data recovery will be faster for partitioned tables as the partitions could be recovered in parallel. In the case of VSAM, data can only be recovered from a backup copy. The utilities ADDRSSU or IDCAMS can be used to take timely backups and restores when required. But maintaining different versions of backup is a manual process.
  19. In DB2, partitioned tables can be backed up in parallel for all partitions thereby reducing the time taken to complete the backup. Also, the incremental backup of DB2 reduces the time needed for backups. DB2 database utility Concurrent Copy improves data availability. The VSAM dataset cannot be backed up in parallel and hence needs more time for backup processes. Moreover, there is no concept of incremental backup for VSAM datasets.
  20. DB2 has intelligent installation assistants and graphical administration tools that integrate with systems management products to reduce the overall cost of ownership. Overall, DB2 improves efficiency, reduces operating costs and supports continuous availability. VSAM requires more time for backup, restore and reorganization of data and thus needs more time for maintenance, reducing the total availability time of the application.
  21. DB2 supports disaster recovery. A disaster recovery copy of data can be easily identified by DB2. There is no separate disaster recovery mechanism for VSAM. It goes along with the DASD data recovery process.
  22. Database growth leads to a bloated database, affecting performance and availability, necessitating the need for an archival strategy. With DB2, using products like IBM's Row Archive Manager or Softech's Archive for DB2, selective archival and restoration of cohesive subsets of data can be done at the row level. In case of VSAM this would involve complex customized coding.
  23. A whole range of support and training for DB2 is available from IBM as well as third-party vendors. Personnel with DB2 skills in administration and performance tuning is easy to find and train whereas VSAM tuning skills are scarce. As DB2 is an RDBMS using ANSI standards, it is easy to train personnel who have worked in any other RDBMS like Oracle, SQL server, etc.
  24. If the enterprise data server is in DB2, options exist for both batch as well as real-time updates to data targets (data warehouses, data marts, etc) with continuous availability, making implementing a data warehouse solution a relatively straightforward option. If the enterprise data is in VSAM, building and maintaining a data warehouse is a multi-step process. This reduces the ability to exploit fully the wealth of information they contain.
  25. DB2 data warehouse products include DB2 OLAP Server, Intelligent Miner, and Visual Warehouse. DB2 for OS/390 also has proven scalability to multiple terabytes of data and hundreds of concurrent users in data warehouse environments.
  26. DB2 supports stored procedures, triggers and user-defined functions in a number of languages, including Java. With each new release of DB2, performance improvements and additional features are available to the users.
  27. DB2 includes a set of extended data types to be able to store, present, manipulate and search on a range of extended data types such as image, audio, video, extended text, geographical, and others. DB2 stores these as large objects BLOBs or CLOBs. The DB2 extenders give the added flexibility of keeping the content of a LOB in a file and pointing to it from the database.

In summary, DB2's ability to efficiently manage many concurrent users, very large databases, and high transaction rates on multiple platforms makes it a better choice than VSAM in most situations.

About the author
Sasirekha Cota is a Senior Architect in Tata Consultancy Services with 12+ years of experience in both mainframe and client/server development. She has experience mainly in the financial domain and in a wide range of technologies including mainframe, J2EE and Windows. She has played the role of Architect, Technology Mentor and Project Manager. As a senior architect, she has been working with customers from the US, Europe and Asia.

This was first published in February 2005

Dig deeper on DB2 UDB (universal databases)

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchEnterpriseLinux

SearchDataCenter

Close