Tuesday, April 12, 2016

Significant Upgrades to the Database Engine!


We quite often refer to DB2 for IBM i as the "database engine". On this particular day, I ruminate on why the word "engine" always seems to fit, and more importantly why it actually describes what you have sitting inside your IBM i system.

The basic definition of an "engine" goes something like this:

   A machine with moving parts that converts power into motion.

In computer programming, an engine is a program that performs an essential function or functions.

An engine can be a central focal point in a system, subsystem, or application. "Engine" can also be used to describe special-purpose algorithms. 

Hmm... this sounds a lot like the integrated relational database management system we call DB2 for i.

And for an even more interesting connection, consider that in 1844, Charles Babbage named his stored-program computer the "Analytical Engine"!

Why do we build and employ engines?

Engines provide us with many positive advantages:
  • Strength
  • Speed
  • Power
  • Efficiency
  • Repeatability
  • Predictability
  • Progress
All of which allows us to move forward in a direction and manner that is likely impossible without the engine.

Simply put, engines allow us to accomplish amazing things. Literally and figuratively speaking, engines push us forward by increasing our capabilities.

Personally, I'd like to add to the list of advantages, "lazy and easy".  As in, I can be lazy, because it's easy!


Your Data-Centric Development Engine


Whether we take the traditional definition or the computer science definition, DB2 for i surely represents an engine for innovation. But this benefit is manifested only when the engine is actually used.

When we talk about "data-centric" development, we are referring to designing, programming and implementing solutions in which the database engine plays a major role.

Technically speaking, data-centric development relies on using the advanced features and functions incorporated within DB2 for i, and generally made available via SQL.

Financially speaking, data-centric development means being more efficient and productive by making use of algorithms and capabilities that you already possess.

Assuming you could, why would you spend any time, energy and money reinventing advanced technology, when that technology is sitting in front of you, ready to use?

But wait! I don't see the database capability needed to help solve my business problem.

Look again...


Significant Updates to DB2 for i


With the April 12, 2016 announcement of IBM i 7.3, we have another set of exciting database features and functions at our fingertips. The enhanced IBM i 7.3 capabilities extend the benefits of data-centric development via SQL, DB2 for i, and IBM i running on Power.

For the past few years, my friends and family have been illuminating the importance of expanding your data-centric applications in 3 fundamental areas:
  • Security
  • Analytics
  • Mobility
No surprise, these areas represent common and high priority business requirements. They also represent "low hanging fruit" with regards to providing more value to your clients, customers and partners. And IBM i provides significant assistance in the pursuit of strategic initiatives involving the securing of data, analyzing of data and delivering that data to mobile consumers.

Keeping with the theme of our "database engine", I want to call out a couple of the areas that IBM i 7.3 can power your business forward.


Time travel with DB2 for i Temporal Support


The business requirement is straightforward:

Maintain all current transaction states, retain all previous (i.e. historical) transaction states, and provide the capability to easily retrieve the appropriate current and/or historical transaction state for a particular time period. Oh, and don't mess it up!  

Traditionally we have addressed the business requirement (i.e. the need for point in time data management) by implementing some sort of homegrown solution. These solutions require crafting time aware data processing logic in application code or customized procedures and triggers. This application-centric technique has many disadvantages:
  • Implementing time-aware processing logic with integrity and accuracy is very complicated.
  •  Application development and on going maintenance is costly and time consuming.
  •  Application performance and scalability is often not optimal.
  •  Due to a lack of standards, not to mention developer nuance, different applications will have different implementations of the time based processing logic.
Sounds like a great job for our database engine!

With 7.3, DB2 for i provides the following temporal capabilities:
  • Transparently maintain a system history of all data changes with system-generated time stamps, which are managed as system time periods
  • Allow applications or end users to formulate simple queries that access data as of any desired point in the past, with very high precision
Why Design and Implement a Temporal DB?

The coding of complex temporal operations via SQL is much simpler, which increases developer productivity and reduces the cost to implement and maintain time aware applications by up to 45 times, as shown by an IBM study. DB2 temporal data management helps improve data consistency and data quality across the enterprise.

Why Use a Temporal DB?

The use cases for DB2 temporal data management include the ability to track and analyze changes in your business by easily comparing data from two periods in time. It also represents increased accuracy in time based reporting allowing for tracing when a row was modified, as well as who modified it. This can provide a cost-effective means to address auditing and compliance issues. And with the appropriate data modeling, DB2 for i can also drive the capture of row history in an environment with slowly changing dimensions.

How Does a Temporal DB Work?

After creating a correctly defined "history" table like the base "current" table, there are 3 functions of the DB2 temporal engine:

  1. Maintain the existence, linkage and integrity of the history table
  2. Capture and store all of the before and after row states, over time
  3. Provide access to the current row and/or historical row(s), based on time period

A simple illustration...

After four transactions to a row (t1 insert, t2 update, t3 update, t4 delete) only the current state is available.


After four transactions to a row (t1 insert, t2 update, t3 update, t4 delete) both the current state and the previous states are available.



An important note!

Please do not confuse temporal database support with database logging (i.e. journaling).  They are two different things, with very different requirements and restrictions.  Simply put, database logging does not represent relational data that can or should be queried as a normal part of the business process. Rather, database logging primarily supports transaction isolation and recovery, and secondarily support change data capture in support of transaction replication. The difference seems subtle to be sure, especially given all of the various and historical uses of IBM i journals.



More Data Analysis with DB2 for i OLAP


It's all about the data.  Specifically, it's all about analyzing the data quickly and efficiently to provide information and insight that drive decisions and provide direction. Having a database engine that supports the ability analyze data quickly and efficiently is an advantage. DB2 for i provides such an advantage.

New OLAP Specification Extensions

In IBM i 7.3, the SQL OLAP specification is extended with support for a new class of calculations called moving aggregates. These specifications support important OLAP functionality in the database engine such as cumulative sums and moving averages by using a logical set of rows within a "window". A window specifies a partitioning of rows, an ordering of rows within partitions and an aggregation group (of rows). The aggregation group tells the database engine which rows of a partition set, relative to the current row, should participate in the particular calculation.

If you are not familiar with, or don't remember OLAP specification functions that have been available since 5.4 (RANK, DENSE RANK, ROW NUMBER), recall that OLAP specification functions compute a single value for current row based on some or all the rows in a defined group.

In DB2 for i, OLAP specifications provide the ability to return ranking, row numbering, and other aggregate function information as a scalar value in a query result. It's the "other aggregate function" piece that is new and improved with 7.3

Technically speaking...

An Ordered OLAP specification specifies operations that require a window-order-clause. The operations are...

Lag or Lead function
RANK ( )
DENSE_RANK ( )
NTILE ( expression )
CUME_DIST ( )

A Numbering specification specifies an operation that returns sequential numbers for each row. The operation is...

ROW_NUMBER

An Aggregation specification specifies a function that will compute a single value from the window. The normal functions are...

AVG, CORRELATION, COUNT, COUNT_BIG, COVARIANCE, COVARIANCE_SAMP, MAX, MEDIAN, MIN, PERCENTILE_CONT, PERCENTILE_DISC, Regression, STDDEV, STDDEV_SAMP, SUM, VARIANCE, VARIANCE_SAMP

The OLAP aggregate functions are...

FIRST_VALUE, LAST_VALUE, NTH_VALUE, RATIO_TO_REPORT

The OLAP window specification includes the following components:

  • Partitioning
  • Ordering
  • Aggregation group

The window-partition-clause defines the partition within which the OLAP operation is applied.

The window-order-clause defines the ordering of rows within a partition that is used to determine the value of the OLAP specification.

The window-aggregation-group-clause defines a set of rows that are used to perform the aggregation.

An important note!

In addition to the extended and enhanced OLAP specification, several new aggregation functions are available in DB2 for i 7.3. Combining these aggregation functions with OLAP specifications provides a very powerful set of capabilities all delivered by the database engine - without writing your own program. See the DB2 for i 7.3 SQL Reference for details.

A simple example using one SQL statement...

Problem: Show the difference in sales between the current store and the store with the best sales, the second best sales, and the worst sales.

Solution: An SQL statement with OLAP specifications to define the window and moving calculations.





In Conclusion...


Once again, the integrated database engine known as DB2 for IBM i provides powerful new capabilities that will allow you to extend and enhance your data-centric applications in an efficient and timely manner.  Take advantage of it!

You can find more details and examples from the subject matter experts using the following links:

DB2 for i Version 7.3 Overview
DB2 for i Wiki
DB2 for i - Technology Updates

And if you need assistance with getting more value out of your data sitting idle in IBM i, please do not hesitate to contact me.

2 comments:

  1. Hi Mike, Great new support. I had a question on temporal database support. Does the target table need to DDL defined or can be DDS defined. Can the Historic table be accessed via native io or just SQL access? I have no strong feelings either way just wanted to know the facts. Good work again, thanks George Jeffcock

    ReplyDelete
  2. Hi George, while a DDS defined PF can be altered via SQL to contain the appropriate columns and attributes for temporal, I prefer to take advantage of the opportunity to modernize and re-engineer the PF into a table.
    Time travel queries are only possible with SQL.
    You can open the history table and perform READs.

    ReplyDelete

Note: Only a member of this blog may post a comment.