Creation of the Db2 for i SMP (Symmetric Multiprocessing) licensed feature was one of the coolest projects that I was able to work on during my time on the Db2 development team. Figuring out how to add parallel processing into the Db2 engine was both interesting and challenging work. I did some searching through the InfoWorld archives (remember when hard copy IT periodicals were a thing…) and figured out that the SMP feature recently turned 25 years old in November 2020. As the old saying goes, time flies when you’re having fun.
As part of that development project, I also had the
opportunity to help one of the first customers use Db2 SMP in their shop. While
the SMP feature and underlying hardware have changed over the years, the items to
consider for a successful Db2 SMP implementation have not.
I’ve tried to break the success factors into the following
ABC acrostic –might be a bit of a stretch, but it makes for a catchy title😉
·
Available system resources
·
Balanced expectations
·
Controlled usage of SMP
Available system resources
With Db2 for i SMP, the basic approach is dividing the work
for a query across multiple threads and running those threads in parallel across
multiple processor/cores to shorten the amount of time it takes to run the
query (check out Mike’s nice graphic).
The Db2 engine is using more system resources in order to reduce the overall amount
of time it takes to run your query.
Resource usage is being traded for time. That’s why
reviewing the availability of system resources is a critical step to perform
before buying and implementing Db2 SMP. If you don’t have the system resources
to trade, then you’re not going to realize the performance benefits of Db2
SMP. For example, if CPU utilization is currently
running at 80-85%, adding Db2 SMP to use more system resources is not going to
have a positive impact on system performance.
The system resources also need to be balanced. CPU resources are not the only system resource
consumed by Db2 SMP. Each thread used by
Db2 SMP needs a chunk of memory to perform its segment of the query and that
work can involve performing I/O on your database objects. As a result, your
system needs sufficient memory and a properly sized I/O subsystem to support
the increased CPU usage. If the query
optimizer finds that this combination of resources is not available, then the
optimizer will not use parallel methods – even if you’ve installed and activated
the Db2 for i SMP licensed feature.
Balanced expectations
Assuming you’ve determined that your system has a balanced
set of resources available to support Db2 SMP parallel processing. The next
step is setting the proper expectations on the type of database requests and
workloads that may run faster with Db2 SMP. Some people tend to believe that DBb2
SMP’s parallel processing will be the silver bullet
for all of their performance problems.
Running a query with parallel processing adds overhead
because there is work involved in dividing a query into multiple parts and
distributing the work among threads.
This startup overhead means that Db2 SMP will not be a great benefit to
short-running queries that are common in transactional workloads. Think about
your own household - it’s not uncommon for younger kids to want to help parents
with household chores, but often parents chose to do the chores themselves to
avoid the overhead of involving and training their kids. Your time would be better spent trying to tune
short-running queries than hoping that parallel processing will magically improve
performance.
Longer running queries are the best performance targets for
Db2 SMP because they have a longer runtime which can quietly hide the startup
overhead associated with parallel processing.
If Db2 SMP can reduce a long running query from 10 minutes to 5 minutes,
no one is really going to notice that a hundred milliseconds was spent setting
up threads for parallel processing.
You might have noticed that I keep using queries as the
parallel processing example. That is because Db2 SMP does not enable all
database requests to use parallel processing.
Queries from SQL and non-SQL interfaces can use Db2 SMP, but native
record-level access requests do not. Db2 SMP also does not support parallel
inserts, updates, and deletes. The only
type of database change operation that can use Db2 SMP is Index Maintenance –
however, this parallel processing is only done when the index updates are done
as a result of a blocked Insert or write request. Db2 SMP can also utilize
parallel processing to improve the performance of index creations and
reorganize operations.
When setting expectations for the performance benefits, you
need to make sure that everyone understands that it’s longer running queries
that will be the primary benefactor from Db2 SMP and that not all database
requests can use parallel processing.
Controlled usage of SMP
Once Db2 SMP has been installed on a system, it must be
activated before the Db2 engine will consider using parallel processing on a
request. There are several different
interfaces for enabling parallel processing which include: CHGQRYA CL command,
QAQQINI PARALLEL_DEGREE option, SET CURRENT DEGREE statement or the QQQRYDEGREE
system value.
Based on the discussion in the previous section, you should
try to limit parallel processing enablement to only those jobs or requests that
will benefit from Db2 SMP. Enabling Db2
SMP for all requests just adds overhead to query optimization and can result in
your system resources being overwhelmed if parallel processing is used. On a
transaction-oriented system, you probably should scope parallel enablement to a
limited set of requests and workloads from Db2 SMP. In contrast, you could cast
a pretty wide parallel enablement on a data warehousing system which features
longer-running queries.
In addition to figuring out which jobs and requests to
enable parallel processing on, you should consider when to activate parallel
processing. It could be that your server has high utilization of resources
during the day, but resources to spare during off hours. The enablement interfaces make it easy to
turn Db2 SMP on or off.
In terms of which parallel degree value to use, I recommend
starting with the *OPTIMIZE value. With the
*OPTIMIZE value, the Db2 optimizer tries to choose a degree of parallel
processing that results in an implementation that is a good neighbor in terms
of sharing system resources with other jobs.
A more cautious approach would be setting the QAQQINI PARALLEL_DEGREE
option with a value of *OPTIMIZE 50.
This setting tells Db2 to use the good neighbor approach, but dial the parallel
processing back by 50%.
Hopefully, you now have a better understanding of when and how to use SMP. If you still think that the Db2 for i SMP license feature may be a fit after reading this, our Lab Services team can provide a trial version of the feature for evaluation to help with the purchasing decision. Also, our team is available to help you with Db2 performance tuning or teaching you how to tune whether it involves Db2 SMP or not – just contact me.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.