Wednesday, February 17, 2021

The ABCs of Effective DB2 SMP Usage

Creation of the Db2 for i SMP (Symmetric Multiprocessing) licensed feature was one of the coolest projects that I was able to work on during my time on the Db2 development team. Figuring out how to add parallel processing into the Db2 engine was both interesting and challenging work. I did some searching through the InfoWorld archives (remember when hard copy IT periodicals were a thing…) and figured out that the SMP feature recently turned 25 years old in November 2020.  As the old saying goes, time flies when you’re having fun.

As part of that development project, I also had the opportunity to help one of the first customers use Db2 SMP in their shop. While the SMP feature and underlying hardware have changed over the years, the items to consider for a successful Db2 SMP implementation have not.

I’ve tried to break the success factors into the following ABC acrostic –might be a bit of a stretch, but it makes for a catchy title😉

·        Available system resources

·        Balanced expectations

·        Controlled usage of SMP

Available system resources

With Db2 for i SMP, the basic approach is dividing the work for a query across multiple threads and running those threads in parallel across multiple processor/cores to shorten the amount of time it takes to run the query (check out Mike’s nice graphic). The Db2 engine is using more system resources in order to reduce the overall amount of time it takes to run your query. 

Resource usage is being traded for time. That’s why reviewing the availability of system resources is a critical step to perform before buying and implementing Db2 SMP. If you don’t have the system resources to trade, then you’re not going to realize the performance benefits of Db2 SMP.  For example, if CPU utilization is currently running at 80-85%, adding Db2 SMP to use more system resources is not going to have a positive impact on system performance. 

The system resources also need to be balanced.  CPU resources are not the only system resource consumed by Db2 SMP.  Each thread used by Db2 SMP needs a chunk of memory to perform its segment of the query and that work can involve performing I/O on your database objects. As a result, your system needs sufficient memory and a properly sized I/O subsystem to support the increased CPU usage.  If the query optimizer finds that this combination of resources is not available, then the optimizer will not use parallel methods – even if you’ve installed and activated the Db2 for i SMP licensed feature.

Balanced expectations

Assuming you’ve determined that your system has a balanced set of resources available to support Db2 SMP parallel processing. The next step is setting the proper expectations on the type of database requests and workloads that may run faster with Db2 SMP. Some people tend to believe that DBb2 SMP’s parallel processing will be the silver bullet for all of their performance problems. 

Running a query with parallel processing adds overhead because there is work involved in dividing a query into multiple parts and distributing the work among threads.  This startup overhead means that Db2 SMP will not be a great benefit to short-running queries that are common in transactional workloads. Think about your own household - it’s not uncommon for younger kids to want to help parents with household chores, but often parents chose to do the chores themselves to avoid the overhead of involving and training their kids.  Your time would be better spent trying to tune short-running queries than hoping that parallel processing will magically improve performance.

Longer running queries are the best performance targets for Db2 SMP because they have a longer runtime which can quietly hide the startup overhead associated with parallel processing.  If Db2 SMP can reduce a long running query from 10 minutes to 5 minutes, no one is really going to notice that a hundred milliseconds was spent setting up threads for parallel processing. 

You might have noticed that I keep using queries as the parallel processing example. That is because Db2 SMP does not enable all database requests to use parallel processing.  Queries from SQL and non-SQL interfaces can use Db2 SMP, but native record-level access requests do not. Db2 SMP also does not support parallel inserts, updates, and deletes.  The only type of database change operation that can use Db2 SMP is Index Maintenance – however, this parallel processing is only done when the index updates are done as a result of a blocked Insert or write request. Db2 SMP can also utilize parallel processing to improve the performance of index creations and reorganize operations.

When setting expectations for the performance benefits, you need to make sure that everyone understands that it’s longer running queries that will be the primary benefactor from Db2 SMP and that not all database requests can use parallel processing.

Controlled usage of SMP

Once Db2 SMP has been installed on a system, it must be activated before the Db2 engine will consider using parallel processing on a request.  There are several different interfaces for enabling parallel processing which include: CHGQRYA CL command, QAQQINI PARALLEL_DEGREE option, SET CURRENT DEGREE statement or the QQQRYDEGREE system value.

Based on the discussion in the previous section, you should try to limit parallel processing enablement to only those jobs or requests that will benefit from Db2 SMP.  Enabling Db2 SMP for all requests just adds overhead to query optimization and can result in your system resources being overwhelmed if parallel processing is used. On a transaction-oriented system, you probably should scope parallel enablement to a limited set of requests and workloads from Db2 SMP. In contrast, you could cast a pretty wide parallel enablement on a data warehousing system which features longer-running queries.

In addition to figuring out which jobs and requests to enable parallel processing on, you should consider when to activate parallel processing. It could be that your server has high utilization of resources during the day, but resources to spare during off hours.  The enablement interfaces make it easy to turn Db2 SMP on or off.

In terms of which parallel degree value to use, I recommend starting with the *OPTIMIZE value.  With the *OPTIMIZE value, the Db2 optimizer tries to choose a degree of parallel processing that results in an implementation that is a good neighbor in terms of sharing system resources with other jobs.  A more cautious approach would be setting the QAQQINI PARALLEL_DEGREE option with a value of *OPTIMIZE 50.  This setting tells Db2 to use the good neighbor approach, but dial the parallel processing back by 50%. 

Hopefully, you now have a better understanding of when and how to use SMP.  If you still think that the Db2 for i SMP license feature may be a fit after reading this, our Lab Services team can provide a trial version of the feature for evaluation to help with the purchasing decision.  Also, our team is available to help you with Db2 performance tuning or teaching you how to tune whether it involves Db2 SMP or not – just contact me.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.