New Utility Can Double AMD Threadripper 2990WX Performance

[Category: System (Windows) | Publisher: 店小二05 | Date: 2019 | Author: 红领巾]

AMD’s 32-core Threadripper 2990WX CPU has always been a bit of an uncertain proposition. While undeniably fast in certain scenarios, the chip shows marked performance regressions in other tests, and doesn’t always outperform the 16-core Threadripper 2950X. Now there’s a utility, CorePrio, that can be used to restore much of the 2990WX’s missing performance under Windows 10.

When the 2990WX shipped, the explanation for its occasional performance drops focused on its memory access system and controller configuration. The thinking was that having 32 CPU cores connected to memory across just four memory channels caused intrinsic bandwidth congestion, starving some cores of memory access. But there have been signs of scheduler problems as well: it’s been known for some months that the 2990WX performs better under Linux than under Windows, and that’s a definite sign of an underlying OS issue as opposed to a hardware problem.

Memory access on the 2990WX.
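
The bandwidth-congestion argument can be made concrete with some rough arithmetic. The sketch below is illustrative only: the DDR4 speeds (2933 MT/s for Threadripper, 2666 MT/s for Epyc) and the 32-core count for the Epyc 7551 are assumptions based on published specifications, not figures from this article, and real workloads never see theoretical peak bandwidth.

```python
# Rough peak-bandwidth arithmetic. The memory speeds are ASSUMED values,
# not numbers taken from the article or from Level1's testing.
BYTES_PER_TRANSFER = 8  # one 64-bit DDR4 channel moves 8 bytes per transfer


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (10^9 bytes per second)."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


def per_core_share_gbs(cores: int, channels: int, mega_transfers_per_s: int) -> float:
    """Even split of the peak bandwidth across all cores, in GB/s per core."""
    return peak_bandwidth_gbs(channels, mega_transfers_per_s) / cores


# name -> (cores, memory channels, assumed DDR4 speed in MT/s)
CHIPS = {
    "Threadripper 2990WX": (32, 4, 2933),
    "Threadripper 2950X": (16, 4, 2933),
    "Epyc 7551": (32, 8, 2666),
}

if __name__ == "__main__":
    for name, (cores, channels, speed) in CHIPS.items():
        share = per_core_share_gbs(cores, channels, speed)
        print(f"{name}: {share:.2f} GB/s per core")
```

Under these assumptions the 2990WX has roughly half the per-core bandwidth of either the 2950X or the Epyc 7551, which is why bandwidth starvation looked like a plausible first explanation.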

Level1Techs has published an extensive report into their investigation of performance on the 2990WX. The initial assumption that memory bandwidth congestion is responsible for lower overall performance, while not wrong in all cases, has been proven incomplete. Level1 found that the same performance regressions were present in an Epyc 7551 they tested, which had eight memory channels instead of Threadripper’s four. Again, performance under Linux was fine, but performance in Windows was impacted. But Level1 also found strange behavior associated with changing Windows CPU affinities, and how this impacted overall performance testing.


Data and chart by Level1.
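
For readers unfamiliar with CPU affinities: an affinity is a bitmask with one bit per logical processor, and a thread may only run on processors whose bit is set. The sketch below builds such a mask for one NUMA node. It assumes a 2990WX-like layout of four NUMA nodes with 16 logical processors each, numbered contiguously node by node; that numbering is an assumption for illustration, and the helper names are hypothetical. The code only computes masks and does not call any operating system API.

```python
from typing import List


def node_affinity_mask(node: int, logical_cpus_per_node: int) -> int:
    """Bitmask with one bit set for each logical CPU belonging to `node`.

    ASSUMPTION: logical CPUs are numbered contiguously, node by node.
    """
    if node < 0 or logical_cpus_per_node <= 0:
        raise ValueError("node must be >= 0 and logical_cpus_per_node must be > 0")
    node_bits = (1 << logical_cpus_per_node) - 1  # e.g. 16 ones for 16 CPUs
    return node_bits << (node * logical_cpus_per_node)


def cpus_in_mask(mask: int) -> List[int]:
    """Indices of the logical CPUs whose bit is set in `mask`."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


if __name__ == "__main__":
    # Hypothetical 2990WX-like layout: 4 nodes x 16 logical CPUs.
    for node in range(4):
        print(f"node {node}: mask = {node_affinity_mask(node, 16):#018x}")
```

Restricting a process to one node’s mask keeps its threads on that die, while a mask with all 64 bits set lets the scheduler place them anywhere.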

What their investigation ultimately revealed is problems with how certain applications move workloads between cores in NUMA-enabled CPUs with more than one NUMA node. Level1 writes: “When only one NUMA node is recommended via the ‘ideal CPU’ the windows kernel seems to spend half the available CPU time just shuffling threads between cores.”

They continue:

Here’s an interesting twist: If you only have one OTHER NUMA node windows seems to fall back to allowing the threads to establish themselves on the second NUMA node… This is most likely related to a bugfix from Microsoft for 1 or 2 socket Extreme Core Count (XCC) Xeons wherein a physical Xeon CPU has two numa nodes. In the past (with Xeon V4 and maybe V3), one of these NUMA nodes has no access to I/O devices (but does have access to memory through the ring bus). If that’s true, then that work-around to make sure this type of process stays on the “ideal CPU” in the same socket has no idea what to do when there is more than one other NUMA node in the same package to “fail over” to.

The solution to this is a utility named CorePrio.

CorePrio solves this problem and allows threads to be scheduled evenly across the CPUs rather than Windows spending all of its time trying to shuffle them across the die. It looks as though the sharp performance regressions with the 2990WX were caused at least in part by Windows spending far more time moving workloads from CPU to CPU than actually executing them. Obviously, this won’t boost Threadripper’s performance in applications where it already scaled well, but it should fix the performance regressions in multiple applications.
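
The “double” in the headline follows from Level1’s observation that the kernel can spend about half the available CPU time shuffling threads. A minimal sketch of that arithmetic, assuming an idealized model in which all the lost time is recovered and nothing else becomes a bottleneck (the function name is hypothetical):

```python
def speedup_from_removing_overhead(overhead_fraction: float) -> float:
    """Throughput gain if CPU time lost to overhead is fully recovered.

    ASSUMPTION: idealized model; useful work scales linearly with the
    CPU time freed up, and no other bottleneck appears.
    """
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return 1.0 / (1.0 - overhead_fraction)


if __name__ == "__main__":
    # Half of the CPU time spent on thread shuffling -> 2.0x when it is removed.
    print(speedup_from_removing_overhead(0.5))
```

An application that loses no time to thread shuffling has an overhead fraction of zero and a speedup of 1.0, which matches the point that well-scaling workloads will not benefit.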

It’s not yet clear whether the memory subsystem is still implicated. If threads are being allocated to the wrong NUMA node, it’s possible that memory accesses are being run mostly or entirely through a single memory controller. This would explain why an eight-channel Epyc in NUMA mode gives the same performance (with allowance for clock speed) as a four-channel Threadripper. And there may well be applications that don’t scale well in the 2990WX’s NUMA configuration for reasons unrelated to any shortcomings in the Windows 10 scheduler.

The full scope of the bug and its potential fixes hasn’t been fleshed out yet, if the “fixes unknown Windows perf issue” wording wasn’t clue enough. Microsoft and AMD have not yet issued formal responses, and it’s not clear what the timeline is for fixing this problem via OS update. But if you’re a 2990WX owner or were interested in becoming one, this could change the calculus on whether the chip is worth investing in, provided you’re a very particular kind of customer in the first place. Average and even not-so-average gamers need not apply, as chips like the 2990WX play in a very rarefied space to start with.

Link on this site: https://www.codesec.net/view/628502.html