Wednesday, September 23, 2015

Protect your analytics data and your bottom line

I ran across a new paper published by my good friend and database whiz kid Shantan Kethireddy. Some of you may know or at least remember Shantan from his days as a hotshot DB2 for i query optimizer developer and dynamic technical speaker. For the past several years he has been spending his time and energy extolling the virtues of gaining business intelligence from DB2 for z/OS environments.

Shantan's perspective and unique insight really resonated with me. So much so that I read the paper three times (I know, I'm slow on the uptake, as in, never was a whiz kid).

The first time I read the paper, I was like - what's this guy talking about?

The second time I read the paper, I was like - hey, this guy is on to something.

The third time I read the paper, I was like, let me replace "DB2 for z/OS" with "DB2 for IBM i".

I wanted to see if the points apply to my IBM i clients around the world. The answer is YES! Many, if not all, of the observations and recommendations apply.

I suggest you take a few minutes to read the paper, think about it, and share it with your executives and business leaders. This is important stuff.

If you want to discuss when, where and what might apply to your information management environment, please reach out to me, or talk to Mr. Kethireddy.


Could your analytics strategy cost your business USD 100 million?

Technology trends and forces such as cloud, mobile and big data can represent big opportunities to bring analytic insight to the enterprise. They can also represent big risks if proper data security and governance controls are not in place. In 2015, one of the largest health benefits companies in the United States reported that its systems were the target of a massive data breach. This exposed millions of records containing sensitive consumer information such as social security numbers, medical IDs and income information. Various sources, including The Insurance Insider, suggest that this company's USD 100 million cyber-insurance policy would be depleted by the costs of notifying consumers of the breach and providing credit monitoring services—and that doesn’t consider other significant costs associated with a breach such as lost business, regulatory fines and lawsuits.

Data is now so important that it has a value on the balance sheet. Cyber criminals know this. Without exception, every industry has been under attack and suffered data breaches - healthcare, government, banking, insurance, retail, telco. Once a company has been breached, hackers focus on other companies in that same industry to exploit similar vulnerabilities. In 2015 the average cost of a data breach was USD 3.79 million, causing long-term damage to the brand, loss of customer faith and customer churn.

As you think about the impacts of this and other data security breaches occurring at organizations worldwide, consider this question: how exposed is your business to a similar type of breach? To answer this question, you must first ask, “Where does the data that feeds our analytics processes originate?”

See the full paper here

Monday, August 17, 2015

The Importance of Understanding Your Data

I originally published this article in the SAP for IBM i newsletter produced by Kolby Hoelzle, Practice Leader - SAP on IBM i Center of Excellence.

If you want to know more about the advantages of running SAP in IBM i, please reach out to Kolby at: hoelzle @


The Importance of Understanding Your Data 

Your repositories of data are growing. The transactions and events that produce that data are increasing. The requirement to extract more value from the data is not going away. Thus, the importance of understanding your data and how it is being used is paramount.

Understanding the data and its life cycle is traditionally the responsibility of the database engineering team. Unfortunately, IBM i installations usually do not have a designated or dedicated DB2 for i database engineer. This puts the organization at a disadvantage. The good news is that this situation can be remedied relatively easily. For more information about the DB2 for i Database Engineer position, look here.

The process of learning more about data, and about the SQL statements accessing that data, is accomplished through profiling. Proactive data profiling is better than reactive data profiling. Data profiling involves capturing and analyzing the metadata associated with the DB2 for i database objects, as well as the SQL statements issued against those objects. Metadata is defined as “data about data”.

Some of the more interesting and useful database object attributes (i.e. metadata) to understand include table size, number of rows (active and deleted), index size, and column value cardinality and distribution. Profiling and analyzing this information includes identifying the top n largest tables and largest indexes, as well as their respective growth rates. While not well known, there are limits to the size of a DB2 table. If a limit is reached, no more data can be put into the table, and needless to say, your business transactions stop. Knowing the limits, and more importantly understanding the rate at which you are approaching them, is how you avoid catastrophe.
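As a concrete sketch of this kind of profiling, the QSYS2 catalog views can surface these attributes with ordinary SQL. The query below assumes a recent IBM i release where QSYS2.SYSTABLESTAT is available; MYLIB is a placeholder library name.

```sql
-- Sketch: find the 10 largest tables in a library, with active and
-- deleted row counts, using the QSYS2 catalog (view and column names
-- assumed available on recent IBM i releases; MYLIB is a placeholder).
SELECT table_schema,
       table_name,
       number_rows,
       number_deleted_rows,
       data_size
  FROM qsys2.systablestat
 WHERE table_schema = 'MYLIB'
 ORDER BY data_size DESC
 FETCH FIRST 10 ROWS ONLY;
```

Running a query like this on a schedule, and comparing the results over time, is exactly the growth-rate analysis described above.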

A good query execution plan is dependent upon the query optimizer understanding the data in the tables that are referenced. Local selection method, join technique and grouping scheme are the best they can be when column value cardinality and distribution are well understood. For the database engineer to understand and accept the query optimizer’s decisions, he or she must also understand the data. For example, does a particular column value represent a small number of rows in the data set, or a large number of rows in the data set?
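One low-tech way for the database engineer to answer that question is a simple GROUP BY profile of the column in question. The table and column names below (MYLIB.ORDERS, STATUS) are placeholders:

```sql
-- Sketch: gauge the cardinality and skew of a column by counting the
-- distinct values and the number of rows behind each value.
SELECT status,
       COUNT(*) AS rows_per_value
  FROM mylib.orders
 GROUP BY status
 ORDER BY rows_per_value DESC;
```

A value that covers 90 percent of the rows and a value that covers 0.1 percent of the rows deserve very different access plans, and this profile shows which is which.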

Comparing metadata points and object attributes over time provides trending information. This type of information allows for some very interesting and meaningful analysis; and if done proactively, can assist the business and technical leaders with planning and forecasting. To accomplish the trending analysis, the information captured from the DB2 for i database objects and SQL statements must be stored in a structured and organized way. In other words, the various metadata points must be available over time to facilitate comparisons, calculate deltas and connect the dots to produce a graph or trend line. This means capturing the metadata on a scheduled and frequent basis and keeping the data over time for reporting.
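The capture step described above can be sketched as a single scheduled INSERT that snapshots the catalog metadata with a timestamp. The history table and library names (DBEMART.METADATA_HISTORY, MYLIB) are placeholders, and QSYS2.SYSTABLESTAT is assumed available:

```sql
-- Sketch: capture a timestamped snapshot of table metadata into a
-- history table on a scheduled, frequent basis, so that deltas and
-- trend lines can be computed later.
INSERT INTO dbemart.metadata_history
       (captured_at, table_schema, table_name,
        number_rows, number_deleted_rows, data_size)
SELECT CURRENT TIMESTAMP,
       table_schema, table_name,
       number_rows, number_deleted_rows, data_size
  FROM qsys2.systablestat
 WHERE table_schema = 'MYLIB';
```

With snapshots in place, "connecting the dots" is just a query over CAPTURED_AT.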

Given that the target database housing data from business transactions resides in the live “production” system, the techniques and methods used to capture and store the DB2 for i metadata need to be fast, efficient and concurrent. Harvesting data about data cannot interfere with the critical operations. On the other hand, if you do not understand the data that represents your business, what’s the point of capture and analysis?

A suitable architecture for capturing, storing, analyzing and reporting on the metadata looks a lot like business intelligence architecture. As such, the data model, ETL and analysis mechanisms must be well thought out, designed and implemented properly. This includes using a separate (and likely smaller) system to hold the data and provide a platform for query, analysis and reporting engines.

The results of basic analysis and trending calculations are reviewed, augmented and used by the DB2 for i Database Engineering team. The goal is to assist business leaders in maintaining a viable and valuable data centric environment. In essence, this answers the question: who should understand the data? Furthermore, it is the responsibility of the database engineers to formulate a plan of action to resolve any issues that appear on the horizon BEFORE they become critical. 

While all of the capturing, analyzing and reporting on DB2 for i metadata seems intriguing and valuable, there is still work involved in getting the proper processes and procedures in place. It just so happens that the IBM DB2 for i Center of Excellence team has a solution available that can be used by the database engineers to facilitate the understanding of data.

For more information on the DB2 for i metadata reporting solution and/or the importance of the DB2 for i Database Engineer, please reach out to me.


Monday, August 10, 2015

XML: Moving On and Moving In

My Center of Excellence teammate and Wizard of Oz Query Optimization Rob Bestgen has an important message to share...


Almost a decade ago IBM came out with a set of extensions, or "extenders", for DB2. These extenders brought more capabilities to DB2 than existed in DB2’s core capabilities at the time. 

Two extender features were distributed, depending on OS release, under licensed programs 5722-DE1, 5761-DE1 or 5770-DE1:

  • DB2 Text Extender/Text Search Engine
The Text Extender provided a way to index and search unstructured text, such as flat files and documents, using SQL as the interface. Since potentially lots of a company's data resides outside the database (e.g. as IFS files), this extender provided a good way to track and search that data.

  • DB2 XML Extender
The XML Extender provided the capability to create XML documents from DB2 data and to annotate (extract data from) XML documents and store the extracted data into DB2 tables.  Since XML is a common way to do EDI between companies and agencies, this extender was a natural fit for creating XML documents for distribution to and for processing XML documents received from other companies and agencies.

Since the time when the Extender product first came out, the DB2 family members have added the necessary support that makes these extender features no longer necessary. This includes our favorite DB2 database, DB2 for i.

In the case of the Text Extender feature, the Omnifind product (5733-OMF) became available as a no charge licensed program product available for IBM i 6.1 and later. Even though the Omnifind product is a separate install from the OS, its runtime is integrated into the database for optimal performance.

For the XML Extender feature, IBM added integrated XML support into DB2 in IBM i 7.1. This XML support, which is integrated right into the DB2 core, provides a super set of the capabilities that were available in the XML extender.
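As a small illustration of that integrated support, the sketch below stores and publishes XML using the XML data type and the XMLPARSE and XMLSERIALIZE functions. Table and column names are placeholders:

```sql
-- Sketch: with the integrated XML support (IBM i 7.1 and later) the
-- database itself stores and serializes XML; no extender needed.
CREATE TABLE mylib.orders_xml (
  order_id  INTEGER NOT NULL,
  order_doc XML
);

INSERT INTO mylib.orders_xml
VALUES (1, XMLPARSE(DOCUMENT '<order><item>widget</item></order>'));

-- Serialize back to character data, e.g. for EDI-style distribution
SELECT XMLSERIALIZE(order_doc AS CLOB(1M))
  FROM mylib.orders_xml;
```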

Regarding the older extender product: effective August 16, 2013, it was withdrawn from marketing, but it continued to be supported for those customers using one or more of its features.

Well, that support is coming to an end in September of this year. 

The end of support timing, by the way, coincides with the end of support for the IBM i 6.1 OS, though the OS support can be extended (for a fee, of course). Unlike the OS, however, there is no support extension coming for the extender product.

Now, for some customers, the end of support for the extender product may not be a big deal. These customers will just continue to use the product features.  However, starting in IBM i 7.2, the Extender product is not available at all.

Therein lies the potentially big issue. An application using a feature of the Extender product could be in for a rude awakening when migrating to IBM i 7.2 (i.e. when the applications stop working). Given the number of calls and questions that IBM has been receiving recently, clients appear to be finding this out the hard way.

So if you are a user of the DB2 Extender product, or suspect you are, what should you do?

The good news is that migrating from an Extender product feature to the corresponding alternative is available and proven.

For the Text Extender, a wealth of Omnifind documentation can be found here. 

For the XML Extender, a whitepaper exists on just the topic of moving from XML Extender to DB2 XML and that can be found here.

Oh by the way, the DB2 for i SQL Reference is also a great place to learn more. The 7.1 flavor can be found here, and the 7.2 flavor here.

With that said, the migration from an Extender feature is not seamless and carefree. The Extender features were written before the relevant SQL standards were in place and, as might be expected, their interfaces are different from what was eventually put into the database. For example, the XML Extender feature made heavy use of DTDs, while for DB2 integrated XML, XML schemas are the standard. In short, some amount of "effort" is needed to do the migration.

If the thought of pursuing the migration from Extenders seems overwhelming, or you are unsure how exposed you might be, there’s more good news. The DB2 for i Center of Excellence stands ready to help you with the migration.  Our team has the knowledge and skills to assist you with evaluation, guidance and prototyping as needed.

The bottom line is this: if you are using the DB2 Extender product, it’s time to move on past the product and move into the integrated support of DB2.


Thanks Rob!  You have illuminated a dark and stormy issue for folks migrating their "extended" database applications to 7.2. 

Tuesday, July 28, 2015

Treading Water, Sharks Circling

This post is brought to you by my longtime teammate and good friend Doug Mack. Besides his many other areas of expertise, Doug leads the business intelligence / analytics portion of our DB2 for i Center of Excellence.


It seems like sharks are everywhere in the news these days, with all the reported sightings and of course the Sharknado movies (an interesting concept by the way!). The data centric world is like that too... sometimes you get the feeling of being chased by "data sharks"!

It occurs to me that if you are your company’s “go-to person” for data (think urgent RPG coder or Query/400 report writer), you’re probably doing all you can to keep swimming faster than the business analyst “sharks” requesting that slab of data! While it's completely understandable that the analysts urgently need this data to sense what's going on in the business, you’re just trying to keep your head above water.

It doesn’t have to be this way. While it might take some disciplined project planning and work to get there, you can alleviate YOUR workload and create an environment that either allows end users to run their own reports whenever they want (this can be called “Self Service BI”) or you can automate the execution and publishing of reports and dashboards, including visualization on their handheld device of choice. Now you can sit back on the beach with a tropical umbrella drink while the sharks wander off.

Trying to tackle data analysis has been around since, well, Query for the System/36! But in our team’s experience, many folks are still sinking under the burden of older tools and processes that prevent the kind of buoyancy and agility required to meet business demands. Lacking a suitable budget for big-bang projects doesn’t help either.

To this end, IBM introduced DB2 Web Query some time ago. While it’s true you can use it to modernize and re-engineer your Query/400 reports - that is really underselling Web Query’s capabilities. You can not only create that self-service environment and streamline report publishing; you can also address data transformations that exploit this tool at a whole new level. With the introduction of DB2 Web Query DataMigrator ETL Extension, clients can now automate the often difficult task of consolidating, transforming, and optimizing the data for business intelligence applications. This supports the creation of a data mart or data warehouse that completely isolates and secures the data. In many of the client situations we encounter, the data anomalies that make analytics difficult can be simplified with a data mart architecture and ETL (Extract, Transform, and Load) tool.

Did you know that since 2008 IBM has shipped at least a limited-use license of DB2 Web Query to almost every IBM i customer? But in many cases, those licenses are still at Version 1 of the product, which was essentially replaced by Version 2 in 2012. And just recently, IBM introduced the latest version, 2.1.1. In other words, you very likely own licenses to an older version of Web Query, but you’re missing out on some tremendous enhancements, including wizard analytics, geographical map visualizations, personal dashboards, mobile device support, and the aforementioned DataMigrator ETL extension.

Want to get some shark repellent? Why not take a look at some of the features of DB2 Web Query V2.1.1, and learn how to create that optimized “self service” environment to keep the frenzied school at bay!

To learn more about V2.1.1, check out the end user and wizard analytics video demonstrations here.

To find out how to acquire, install, and set up the product, you can also view the set of getting started videos on that same wiki page! 


Thanks Doug! And let me add, if you are interested in making a difference with data sooner than later, consider setting up that sandbox I talked about.

Monday, June 15, 2015

Time for a New Challenge

My longtime teammate, prolific writer and periodic DB2 blogger Kent Milligan has an exciting announcement to share...


After spending the first 25 years of my IT career working on the IBM i, iSeries, and AS/400, I’ve decided that it’s time for a new challenge, and I have accepted a new position working for the IBM Watson Group. It was a difficult decision to make because the IBM i community - both inside and outside of IBM - has been a great family to be part of.

It’s been very rewarding to work with clients all over the world for the past 25 years and help them use DB2 and SQL technologies to solve real business problems. I’m thankful for these real world interactions that have increased my knowledge base beyond just DB2 by all of you opening my eyes to all aspects of the IT solutions that companies depend on.

Although I’m sad about leaving the IBM i world, I am excited about applying my data centric skills and knowledge to solving a different type of problem – improving the quality of care and health of patients fighting diseases. Specifically, I’ll be working on the Watson for Clinical Trial Matching solution which helps a doctor sift through the pool of available cancer trials and quickly identifies potential matches for a patient. One of my first projects will be helping add support for new diseases to the Clinical Trial Matching product.

As I start the challenge of trying to get up to speed on a new technology, cognitive computing, I ironically find myself following the same advice that my “old” DB2 for i Center of Excellence team has been recommending to IBM i clients.  Build knowledge by getting educated by experts, start with a small project to gain experience, and iterate. 

I’ll still be wandering the halls of the Rochester Lab, so I plan to regularly check up on all of you to make sure that you’re “doing more with data” on IBM i by leveraging the latest SQL and data-centric technologies. So, goodbye and keep up the good work!


I would like to formally thank Kent for his contributions to the success of our clients and our team. We are all better because of him!

Wishing you all the best with your next challenge my friend.

Friday, June 5, 2015

Yes, You Need a Sandbox

I recently ran across a very insightful article by Bob Becker of the Kimball Group. If you have not heard of "Kimball", as in Ralph Kimball, he is arguably one of the best-known founding fathers of data warehousing and the data model known as the "star schema". Mr. Becker's article caught my attention because it asks and answers the question "Does your organization need an analytical sandbox?".

Take a moment to go read the article here.  Then come back and let's think more about it.


After my first read through the article, my reaction was: "of course!", and then I thought about all of my IBM i clients around the world and wondered if this applies to them, and if so how would it apply. And maybe more importantly, can this idea be simplified to help move an IBM i shop forward towards providing more value sooner than later.

The first observation that caught my eye is this:

" today’s competitive world, organizations need to be more nimble. They want to quickly test new ideas, new hypotheses, new data sources, and new technologies."

This echoes what my team has been saying to executives and technical leaders for many years. You must establish an organization that uses modern tools and modern methods to develop applications that are flexible, extensible, scalable and timely. Employing best practices in the science and art of DB2 data centric design and development is the foundation.

The second observation that caught my attention is this:

"A key objective of the analytic sandbox is to test a variety of hypotheses about data and analytics. Thus, it shouldn’t be a huge surprise that most analytic sandbox projects result in “failure.” That is, the hypothesis doesn’t pan out as expected. This is one of the big advantages of the analytic sandbox. The data utilized in these 'failures' didn’t and won’t need to be run through the rigor expected of data contained in the data warehouse. In this case, failure is its own success; each failure is a step towards finding the right answer."

Folks, this is profound. It is what I refer to as "lose to win".  Think about it. Better yet, do it.

The third observation that resonated with me is this:

"Having the right skills in house is critical to the success of the analytic sandbox. The users of the analytic sandbox need to be able to engage with the data with far fewer rules of engagement than most business users. They are users capable of self-provisioning their required data whether it comes from the data warehouse or not. They are capable of building the analytics and models directly against this data without assistance."

One topic I continue to shout out from my soap box is this: you absolutely need a DB2 for i database engineer.  Better yet, you need a database engineering team. These are the men and women who will have the skills, knowledge and mission to use the sand box effectively and efficiently - with a positive ROI.


During the iAdvocate session I hosted with Jon, Susan and Paul at the Spring 2015 RPG & DB2 Summit conference, we tossed around the idea of using a skunkworks project to learn, innovate and demonstrate value sooner than later. We also discussed the benefits of keeping such a project initially off the radar screen and NOT asking for permission. Once there is a eureka! moment, the team can go into marketing mode and share the results and/or show off the prototype - thus winning support and moving into the realm of official projects.

Granted, what I just said is not necessarily natural to most IT folks, and there are likely some new skills required - but hey, this is what we are here for.

One alarming trend I also shared with the Summit conference audience is that traditional IT organizations are increasingly being bypassed. The business units are embarking on data centric projects without assistance from IT. While this phenomenon is a topic for another time, let's just say that bypassing IT, and the wisdom embodied within, is not likely a good long-term strategy. Nevertheless, it is happening. Why? In my opinion, it's because IT is not meeting the real or perceived requirements of the business units. How about we change that...


So, what's my simple idea?

Build out a small, modern and independent DB2 for i sandbox.

Figure out how to buy, beg, borrow or steal some newer IBM i infrastructure that is outside of the normal control and governance scheme. By the way, this is NOT a development box. I consider designing and developing code a "production" activity critical to the day to day business operations, and as such, the development system is part of production. We don't want to mess with that.

Obtain and install the latest version of DB2 for i (that would be what's inside IBM i 7.2 + TR2), stand up some modern development tooling (IBM Data Studio, for example), and add in DB2 Web Query plus the DataMigrator ETL extension.

Get educated on relational database fundamentals (i.e. data modeling, SQL DDL and DML, set-at-a-time, etc.) and become familiar with DB2 Web Query and basic ETL.

Now, here's the fun part... go buy one of the business users a cup of coffee. Find out what they are doing with the data they are extracting from the production system. You know, the big data set that is pulled down to their PC periodically via Query/400 and manipulated with Excel. Learn their process for analyzing and formatting the data. Then go back to your DB2 for i sandbox and experiment. Try to replicate what they are doing.
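Often that "download and pivot in Excel" step collapses into one set-based SQL summary the sandbox can serve directly. The sketch below is hypothetical; the table and columns (MYLIB.SALES, REGION, ORDER_DATE, ORDER_TOTAL) are placeholders for whatever the user actually extracts:

```sql
-- Sketch: replace a periodic PC download plus spreadsheet pivot with
-- a summary query that the sandbox (or Web Query) can run on demand.
SELECT region,
       YEAR(order_date) AS order_year,
       SUM(order_total) AS revenue,
       COUNT(*)         AS order_count
  FROM mylib.sales
 GROUP BY region, YEAR(order_date)
 ORDER BY region, order_year;
```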

Fail. Try again. Fail. Try again. Eureka! It works.

Now, enable the information to appear on a tablet. Congratulations, your information is mobile.

Ok, time to get some informal but important feedback...

Go buy the business user another cup of coffee and show off the results of your experiment. If they are excited about your prototype, encourage them to spread the news. If they provide some meaningful suggestions for change, take it and iterate. This will help to strengthen the new relationship and build trust, not to mention giving you more opportunity to learn and practice.

The bottom line is this: be proactive, be a hero!


If you want to discuss the "sandbox" idea further, or if you need help pulling it off, please let me know. We are here to help make you successful, and more valuable!

Tuesday, March 24, 2015

Dealing with Blank Checks in DB2

"I think IBM i developers believe they must help DB2..."

hmmm, unfortunately this is sad but true.

The science of database must be understood before the art of database can be practiced. This includes the relatively simple task of comparing a column's value to a literal. How hard can it be to compare and test for blanks? You know, the spaces sitting in an empty NOT NULL character column.

Well, apparently it's more difficult on some platforms than others, and SQL coders jump through hoops and attempt back flips to "help" the database engine.
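To make the point concrete: in DB2 for i, when two character values of unequal length are compared, the shorter operand is padded with blanks, so a plain equality test finds the blank rows on its own. The sketch below contrasts the unnecessary "help" with the simple form; the table and column (MYLIB.CUSTOMER, MIDDLE_INITIAL) are placeholder names:

```sql
-- Unnecessary "help" - wrapping the column in a function can also get
-- in the optimizer's way:
SELECT COUNT(*)
  FROM mylib.customer
 WHERE RTRIM(middle_initial) = '';

-- Equivalent, simpler, and friendlier to the optimizer - blank
-- padding in the comparison makes this match the all-blank values:
SELECT COUNT(*)
  FROM mylib.customer
 WHERE middle_initial = ' ';
```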

For more information and coaching, please see Kent Milligan's excellent article on the topic here.

If you want to get better at understanding the science of DB2 for i, and proficient at practicing the art of relational database with IBM i, then please reach out - we're here to help you become successful.