If you want to know more about the advantages of running SAP in IBM i, please reach out to Kolby at: hoelzle @ us.ibm.com
_______
The Importance of Understanding Your Data
Your repositories of data are growing. The transactions
and events that produce that data are increasing. The requirement to extract
more value from the data is not going away. Thus, the importance of
understanding your data and how it is being used is paramount.
Understanding the data and its life cycle is
traditionally the responsibility of the database engineering team.
Unfortunately, IBM i installations usually do not have a designated or
dedicated DB2 for i database engineer. This puts the organization at a
disadvantage. The good news is that this situation can be remedied relatively
easily. For more information about the DB2 for i Database Engineer position, look here.
The process of learning more about data and the SQL
statements accessing that data are accomplished through profiling. Proactive
data profiling is better than reactive data profiling. Data profiling involves
capturing and analyzing the metadata associated with the DB2 for i database
objects as well as the SQL statements issued against those objects. Metadata is
defined as “data about data”.
Some of the more interesting and useful database object
attributes (i.e. metadata) to understand include table size, number rows
(active and deleted), index size, column value cardinality and distribution.
Profiling and analyzing this information includes identifying the top n largest
tables and largest indexes, as well as their respective growth rates. While not
well known, there are limits to the size of DB2 table. If the limit is reached,
no more data can be put into the table, and needless to say, your business
transactions stop. Knowing the limits, and more importantly understanding the
rate at which you are approaching the limits, is how you avoid catastrophe.
A good query execution plan is dependent upon the query
optimizer understanding the data in the tables that are referenced. Local
selection method, join technique and grouping scheme are the best they can be
when column value cardinality and distribution are well understood. For the
database engineer to understand and accept the query optimizer’s decisions, he
or she must also understand the data. For example, does a particular column
value represent a small number of rows in the data set, or a large number of
rows in the data set?
Comparing metadata points and object attributes over time
provides trending information. This type of information allows for some very
interesting and meaningful analysis; and if done proactively, can assist the
business and technical leaders with planning and forecasting. To accomplish the
trending analysis, the information captured from the DB2 for i database objects
and SQL statements must be stored in a structured and organized way. In other
words, the various metadata points must be available over time to facilitate
comparisons, calculate deltas and connect the dots to produce a graph or trend
line. This means capturing the metadata on a scheduled and frequent basis and
keeping the data over time for reporting.
Given that the target database housing data from business
transactions resides in the live “production” system, the techniques and
methods used to capture and store the DB2 for i metadata need to be fast,
efficient and concurrent. Harvesting data about data cannot interfere with the
critical operations. On the other hand, if you do not understand the data that
represents your business, what’s the point of capture and analysis?
A suitable architecture for capturing, storing, analyzing
and reporting on the metadata looks a lot like business intelligence
architecture. As such, the data model, ETL and analysis mechanisms must be well
thought out, designed and implemented properly. This includes using a separate
(and likely smaller) system to hold the data and provide a platform for query,
analysis and reporting engines.
While all of the capturing, analyzing and reporting on DB2 for i metadata seems intriguing and valuable, there is still the work involved to get the proper process and procedures in place. It just so happens that the IBM DB2 for i Center Excellence team has a solution available that can be used by the database engineers to facilitate the understanding of data.
For more information on the DB2 for i metadata reporting solution and/or the importance of the DB2 for i Database Engineer, please reach out to me.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.