T-SQL Tuesday #83: Resource Governor CAP_CPU_PERCENT: It just runs faster?
T-SQL Tuesday #83: We’re still dealing with the same problems Welcome to T-SQL Tuesday #83 being hosted this month by Andy Mallon ( blog | @AMTwo ). This month’s topic is “We’re still dealing with the same problems”. If you would like to participate in this month’s blog party, go to Andy’s announcing blog post: T-SQL Tuesday #83: We’re still dealing with the same problems

.

For my part of this blog party, I want to talk about CAP_CPU_PERCENT for Resource Governor. This feature was added to Resource Governor in SQL Server 2012, and Books Online defines it as:

Specifies a hard cap on the CPU bandwidth that all requests in the resource pool will receive. Limits the maximum CPU bandwidth level to be the same as the specified value. value is an integer with a default setting of 100. The allowed range for value is from 1 through 100.

CAP_CPU_PERCENT

The need for this setting came about because MAX_CPU_PERCENT is not applied unless the server is busy. This could lead to a situation where queries in a low priority resource pool starts running while the server is idle and are allowed to consume all the CPU they can. Then high priority queries spin up, and they can’t immediately get the CPU they need due to the low priority queries not being capped. CAP_CPU_PERCENT came along and was designed to set a hard limit that the queries in a pool could not go over even if the server is idle. For example, if you cap the CPU at 25%, the queries in the pool will not exceed 25% no matter how idle the server is.

Problem solved, right?

RESOURCE_GOVERNOR_IDLE

Well, no, problem not solved. There was a problem with the way CPU capping was implemented. They introduced a new wait type called RESOURCE_GOVERNOR_IDLE. Often, new wait types are just classifying an existing wait to a specific wait type instead of a generic wait type. In the case of RESOURCE_GOVERNOR_IDLE, they created a new wait. I’m not going to go into detail on this wait type becasue it has already been explained very well by Jack Li on the CSS SQL Server Engineers blog: What is RESOURCE_GOVERNOR_IDLE and why you should not ignore it completely .

The big problem with the way it was implemented was that the non-capped queries ended up waiting in line for the CPU for their turn at it while the capped queries sat idle not running. This is negligible if you have very few queries running, but if that was the case, you probably wouldn’t be using Resource Governor and CPU cap in the first place. While the capped query was sitting in the idle wait type, it still basically blocked any other query from using that CPU for the duration of the capped query’s quantum.

In other words, non-capped queries took longer too. Basically, it was useless on a busy system.

CPU_CAP_PERCENT Rewritten for SQL Server 2016

To fix this issue, they completely rewrote how CPU_CAP_PERCENT is handled in SQL Server 2016 (no the new method won’t be back-ported to SQL 2012 or 2014 because it was too large of a rewrite to be feasible). The new method of calculating CPU_CAP was explained to me under NDA, and I was not able to get permission to share that info, so let’s just say that it was designed to not use the RESOURCE_GOVERNOR_IDLE wait any more so that non-capped queries would be able to use the CPU when the capped queries are running.

I ran some tests of my own to find out. For my tests, I did a lot of trial and error to find the right query to use. I ended up using a query that would be CPU bound and not bound to anything else to avoid being influenced by things like disk I/O. This is the query that was used for all of the below tests:

-- Executed prior to the timed runs so not counted in timings
If OBJECT_ID('#SampleMath') Is Not Null
Drop Table #SampleMath;
Go
Create Table #SampleMath(
PK bigint identity(1,1))
Go
Insert Into #SampleMath Default Values;
Go 400
-- Executed for test runs
Set Statistics Time On;
Select Sum(log(tS1.PK + tS2.PK))
From #SampleMath tS1
Cross Join #SampleMath tS2
Cross Join #SampleMath tS3
Cross Join #SampleMath tS4
Option(MaxDOP 32);
Set Statistics Time Off; My test machine:

OS:windows Server 2012 R2 Standard Edition

CPU:Intel Xeon E5-2667 v3 @ 3.20GHz

Sockets:2

Cores:16

Logical processors:32

RAM:256 GB

SQL Server:SQL Server 2016 RTM, Enterprise Edition

As you can see above, I am increasing the degree of parallelism to 32 to ensure that each instance of the query consumes all 32 CPUs. The query above was originally written to trigger CPU pressure (math is hard for SQL Server).

Test Results

These were the general results. The tests were repeated several times and the timings and wait stats always came out the same way. One of the key wait stats I’m seeing below is SOS_SCHEDULER_YIELD. It’s low when the query runs uncapped by itself. It’s low on Query 1 and high on Query 2 (both uncapped), I’m assuming due to the MAX CPU and priority differences. In the 3rd run, the wait was high on both queries (Q1 uncapped, Q2 capped at 50%).

Also, not that RESOURCE_GOVERNOR_IDLE rears its ugly head in run 3. To a small degree on Q1 and a large degree on Q2.

I would expect Query 1 to be considerably faster in run 3 and have less CPU contention with Q2 capped at 50%. It does not seem to play out that way.

Run #1:Query 1 by itself (priority high, max CPU 90%, CPU CAP 100)

Run time: 7 minutes 10 seconds RG Name Importance Max CPU% Cap CPU% Max Mem% Wait type Waiting Tasks Wait Time ms Max Wait Time ms Signal Wait Time ms DbaGroup High 100 100 90 PAGELATCH_SH 20 16 4 16 DbaGroup High 100 100 90 PAGELATCH_EX 14 16 4 16 DbaGroup High 100 100 90 SOS_SCHEDULER_YIELD 2 0 0 0 DbaGroup High 100 100 90 WRITELOG 6 278 252 4 DbaGroup High 100 100 90 PREEMPTIVE_OS_REPORTEVENT 6 17 12 0 DbaGroup High 100 100 90 MEMORY_ALLOCATION_EXT 986 0 0 0 Run #2:

Query 1 (priority medium, max CPU 90%, CPU CAP 100), Query 2 (priority medium, max CPU 80%, CPU CAP 100)

Q1 run time: 12 minutes 47 seconds RG Name Importance Max CPU % Cap CPU % Max Mem % Wait type Waiting Tasks Wait Time ms Max Wait Time ms Signal Wait Time ms DbaGroup High 100 100 90 LATCH_EX 32 545 32 36 DbaGroup High 100 100 90 SOS_SCHEDULER_YIELD 1 0 0 0 DbaGroup High 100 100 90 CMEMTHREAD 1 4 4 4 DbaGroup High 100 100 90 CXPACKET 165 701826 44401 339 DbaGroup High 100 100 90 EXECSYNC 3 36 16 1 DbaGroup High 100 100 90 MEMORY_ALLOCATION_EXT 283 0 0 0 DbaGroup High 100 100 90 RESERVED_MEMORY_ALLOCATION_EXT 65 0 0 0 DbaGroup High 100 100 90 SESSION_WAIT_STATS_CHILDREN 35 333 36 47 Q2 run time: 14 minutes 8 seconds RG Name Importance Max CPU % Cap CPU % Max Mem % Wait type Waiting Tasks Wait Time ms Max Wait Time ms Signal Wait Time ms NonCoreAppGroup Low 80 100 80 LATCH_EX 62 6 0 0 NonCoreAppGroup Low 80 100 80 SOS_SCHEDULER_YIELD 3241040 13385945 5003 13385071 NonCoreAppGroup Low 80 100 80 CMEMTHREAD 85 2 0 1 NonCoreAppGroup Low 80 100 80 CXPACKET 561 774691 49175 7 NonCoreAppGroup Low 80 100 80 EXECSYNC 6 0 0 0 NonCoreAppGroup Low 80 100 80 MEMORY_ALLOCATION_EXT 1082 1 0 0 NonCoreAppGroup Low 80 100 80 RESERVED_MEMORY_ALLOCATION_EXT 65 0 0 0 NonCoreAppGroup Low 80 100 80 SESSION_WAIT_STATS_CHILDREN 100 640 56 56 Run #3:

Query 1 (priority medium, max CPU 90%, CPU CAP 100), Query 2 (priority medium, max CPU 80%, CPU CAP 50)

Q1 run time: 12 minutes 45 seconds RG Name Importance Max CPU % Cap CPU % Max Mem % Wait type Waiting Tasks Wait Time ms Max Wait Time ms Signal Wait Time ms DbaGroup High 100 100 90 LATCH_EX 32 6 0 0 DbaGroup High 100 100 90 SOS_SCHEDULER_YIELD 3326202 10503422 210 10502503 DbaGroup High 100 100 90 CMEMTHREAD 82 2 0 1 DbaGroup High 100 100 90 CXPACKET 544 632650 48158 30 DbaGroup High 100 100 90 EXECSYNC 1 0 0 0 DbaGroup High 100 100 90 RESOURCE_GOVERNOR_IDLE 635 116 9 0 DbaGroup High 100 100 90 MEMORY_ALLOCATION_EXT 281 0 0 0 DbaGroup High 100 100 90 RESERVED_MEMORY_ALLOCATION_EXT 65 0 0 0 DbaGroup High 100 100 90 SESSION_WAIT_STATS_CHILDREN 112 1332 64 84 Q2 run time: 14 minutes 49 seconds RG Name Importance Max CPU % Cap CPU % Max Mem % Wait type Waiting Tasks Wait Time ms Max Wait Time ms Signal Wait Time ms NonCoreAppGroup Low 80 50 80 LATCH_EX 46 5 0 0 NonCoreAppGroup Low 80 50 80 PAGELATCH_UP 1 0 0 0 NonCoreAppGroup Low 80 50 80 SOS_SCHEDULER_YIELD 3211854 14564768 4270 14563902 NonCoreAppGroup Low 80 50 80 CMEMTHREAD 56 1 0 0 NonCoreAppGroup Low 80 50 80 CXPACKET 590 1011755 66165 7 NonCoreAppGroup Low 80 50 80 EXECSYNC 4 0 0 0 NonCoreAppGroup Low 80 50 80 RESOURCE_GOVERNOR_IDLE 128305 1182424 77 0 NonCoreAppGroup Low 80 50 80 MEMORY_ALLOCATION_EXT 264 0 0 0 NonCoreAppGroup Low 80 50 80 RESERVED_MEMORY_ALLOCATION_EXT 65 0 0 0 NonCoreAppGroup Low 80 50 80 SESSION_WAIT_STATS_CHILDREN 88 422 40 40 Conclusions

CPU_CAP_PERCENT is now in it’s third major version and was rewritten for the current version. The problem isn’t fixed though. It still affects performance of the uncapped queries as much as it did previously. It just shows up as different waits.

Not fixed. Does not run faster.

本文数据库(mssql)相关术语:熊片数据库 mssql数据库 oracle数据库 pubmed数据库 access数据库 万方数据库

主题: CPUSQLTIRYWindowsCSSWindows ServWindows ServerDUSQL Server
分页:12
转载请注明
本文标题:T-SQL Tuesday #83: Resource Governor CAP_CPU_PERCENT: It just runs faster?
本站链接:http://www.codesec.net/view/482555.html
分享请点击:


1.凡CodeSecTeam转载的文章,均出自其它媒体或其他官网介绍,目的在于传递更多的信息,并不代表本站赞同其观点和其真实性负责;
2.转载的文章仅代表原创作者观点,与本站无关。其原创性以及文中陈述文字和内容未经本站证实,本站对该文以及其中全部或者部分内容、文字的真实性、完整性、及时性,不作出任何保证或承若;
3.如本站转载稿涉及版权等问题,请作者及时联系本站,我们会及时处理。
登录后可拥有收藏文章、关注作者等权限...
技术大类 技术大类 | 数据库(mssql) | 评论(0) | 阅读(38)