Tuesday, November 8, 2016

The Data Model Matters

For a couple of years now, the major theme of our DB2 for i Technical Forums held around the world has been: design it, build it, secure it, tune it.

During the "design it" portion of the Forum, we focus extensively on the data model. No surprise. Our vast experience tells us that, when it comes to implementing a solution that is flexible, extensible, agile and scalable, the data model matters.

Historically speaking, the S/38, AS/400, iSeries and IBM i development communities have embarked on building data centric applications without the presence of a database administrator. While this has in some ways been an advantage, it has also resulted in a gap. Specifically, a gap in the understanding of the science and art of relational data modeling.

To fill this gap, we routinely recommend standing up an IBM i database engineer (DBE). One of the important responsibilities of the DBE, is to design, implement and support a proper data model.

When it comes to getting the most out of DB2 for i, the data model matters.  By telling DB2 about your data centric business rules, the data element attributes, and the relationships between sets, DB2 can do more work on your behalf. This results in higher productivity, greater efficiency and better performance.

Define your date column as a DATE!  By telling DB2 the column value represents a date, the database management system knows more and can do more. The SQL Reference is filled with date manipulation functions.  Why would you ever write one when it's already provided - assuming of course you define your data elements appropriately.

When it comes to formulating and executing SQL queries, the data model matters. SQL is set oriented. As such, the proper definition of sets and their various relationship(s) are fundamental to developing good set based operations. Sure, DB2 can overcome issues with a poorly defined and implemented data model; but at some point, the amount of work it takes to connect the dots to find the answer to your crazy procedural query will overwhelm everyone's patience and budget.

When it comes to handling very large data sets and over coming limits to growth, the data model matters.  Performance and scalability are two different things; and they do not necessarily go hand in hand. As the data set grows, it is not guaranteed that the performance and throughput will stay in the acceptable range. Relational data modeling best practices can mitigate the risk of growing data sets, as well as provide solutions to over coming database size limits and restrictions. Designing (the data model) for growth and scalability will also pay dividends when your business decides to makes an acquisition resulting in the inevitable large leap in data quantity and transaction processing volumes.

A sound data model also matters when it comes to data governance and control. In other words, properly securing your rows and columns is made much easier with a good data model. By applying relational concepts and normalization rules, the task of isolating and protecting personal identifiable information (PII) with DB2 RCAC becomes much easier. On the other hand, carelessly and needlessly propagating PII data throughout the database makes it virtually impossible to secure the sensitive elements in any effective way.


A pop quiz: is it easier to protect your tax ID stored in one table, or your tax ID stored in 1,000 different tables?


Best practices when defining and implementing your relational data model include:
  • Follow the normalization rules (3rd normal form is a good starting point)
  • All columns have an appropriate type, length, precision and scale
  • Use only one key column to represent the relationship between any two tables
  • Be consistent with column naming conventions
  • Define and use primary, unique, referential and check constraints
  • Define and implement an initial indexing strategy
  • Define and implement views to assist the database users
  • Document the model and keep it current (IBM Data Studio makes this relatively easy)

Now, this all sounds well and good when designing a new data model. But what if you are facing a legacy application with a poorly designed data model decades old?

This is where our targeted database modernization and re-engineering strategy comes into play!

Database modernization and re-engineering, if approached in a financially and technically responsible manner, will provide increased flexibility, extensibility, agility and scalability.

If you want to know more about designing and implementing a new database, or if you are faced with the task of enhancing an existing database to meet new requirements, please reach out. We invented the art of modernizing AS/400 databases, and we can help with reinventing yours.

1 comment: