Date: October 10, 2018
Author(s): Rob Williams
It sometimes feels as though no launch can take place nowadays without some major controversy cropping up. The latest case is Intel’s 9th-gen Core series launch. At the event, and on its website, Intel quoted performance numbers that were gathered through questionable testing practices. Let’s explore what’s going on here.
After Intel’s unveiling of its 9th-gen Core series processors (and the 28-core enthusiast-bound Xeon), major controversy broke out surrounding the gaming performance data the company provided. I didn’t find out about the issue until I woke up at the hotel the next morning, and since I was already late to the story, I wanted to wait for an interview with the source to be published by Gamers Nexus before putting thoughts to paper.
In the wee hours of the morning, Steve went ahead and published the completely uncut ~35 minute interview with Principled Technologies’ co-founder Bill Catchings (who also posted an official response (140KB PDF) to the controversy). PT was solely behind the gaming performance metrics provided by Intel at the event and in its official press release, and according to many enthusiasts who follow the space, the testing methodologies were flawed to the point of giving the impression that the AMD Ryzen 7 2700X machine was deliberately gimped. I don’t believe that’s actually what happened, but it’s what most enthusiasts who actually know game benchmarking will believe.
I’ll also preface this by saying that I’ve respected PT for a while, as I’ve found the company to hold values similar to Techgage’s, in that accurate testing is imperative. I’ve used the company’s MobileXPRT and WebXPRT benchmarks a number of times in the past, and have never had any suspicion that the tests were poorly designed in any way. What does seem clear to me, though, based on this latest report, is that the company’s experience with game testing isn’t as refined as its experience with mobile and application testing. There is a big difference; certain things that wouldn’t matter for one set of tests could greatly matter for another.
It’s not as though PT’s report (320KB PDF) had just a couple of flaws; if that were the case, the response would have been more about “Why did Intel let someone publish performance data before embargoed reviewers?”. This topic has been covered to death already in a mere day-and-a-half, so I won’t get into great detail, but there are some points I have to tackle.
At the forefront, the Ryzen 7 2700X machine was tested with its stock cooler, whereas the Intel machines were tested with a beefier Noctua cooler. In PT’s post-release announcement, it was clarified that the Threadripper CPUs used Noctua’s TR4-compatible NH-U14S TR4-SP3 coolers, bringing parity with the Intel HEDT systems. For whatever reason, the 2700X’s stock Wraith Prism was the only cooler left different. Since AMD’s Ryzen 7 processors use temperature as an input to their boost clock behavior, it’s possible that the 2700X’s results were slightly lower than expected, because the stock cooler is merely adequate rather than best-in-class.
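The cooling sensitivity can be sketched with a toy model. To be clear, this is not AMD’s actual boost algorithm (Precision Boost/XFR is proprietary); the taper temperatures and the linear falloff are my assumptions for illustration, with only the 2700X’s base and max boost clocks taken from its spec sheet.

```python
# Toy model (NOT AMD's real algorithm): shows how any boost scheme that
# takes temperature as an input rewards a better cooler.
BASE_CLOCK_MHZ = 3700    # Ryzen 7 2700X base clock (spec)
MAX_BOOST_MHZ = 4300     # Ryzen 7 2700X max boost clock (spec)
TAPER_START_C = 60       # assumed: full boost below this temperature
TAPER_END_C = 85         # assumed: boost headroom exhausted here

def boost_clock(temp_c):
    """Illustrative boost clock: full boost when cool, tapering
    linearly down to the base clock as temperature rises."""
    if temp_c <= TAPER_START_C:
        return MAX_BOOST_MHZ
    if temp_c >= TAPER_END_C:
        return BASE_CLOCK_MHZ
    headroom = (TAPER_END_C - temp_c) / (TAPER_END_C - TAPER_START_C)
    return BASE_CLOCK_MHZ + headroom * (MAX_BOOST_MHZ - BASE_CLOCK_MHZ)

# A beefy tower cooler holding 62 °C sustains more boost than a stock
# cooler sitting at 78 °C under the same load:
print(boost_clock(62))  # near max boost
print(boost_clock(78))  # noticeably lower
```

The exact numbers don’t matter; the point is that with a temperature-sensitive boost, cooler parity across systems is part of leveling the playing field.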
One of the stranger decisions on PT’s part was to enable Game Mode on both the Ryzen Threadripper 2950X and its little brother, the Ryzen 7 2700X. On the 2950X, Game Mode is probably the more appropriate choice when testing only a single mode, since it can improve gaming performance in many cases due to the chip’s architectural design adding increased latency. On the 2700X, which doesn’t have the same design, enabling Game Mode can hurt performance in any title that can utilize more than four cores (taking into account as well that the remainder of the system will also be snagging some of those precious resources). This is because Game Mode disables one of the CCX units inside the CPU, effectively cutting its core count in half. In response to feedback from other tech publications, PT will be redoing some of its AMD CPU testing in Creator mode.
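A minimal sketch of why the same toggle cuts both ways, assuming Game Mode simply halves the active core count (one die on a Threadripper, one CCX on a Ryzen 7) – the function names and the six-core example game are hypothetical:

```python
# Illustrative sketch (not real tuning software): why Game Mode can help
# a 2950X yet hurt a 2700X in games that scale past four cores.
def active_cores(cpu_cores, game_mode):
    """Game Mode disables half the cores: one die on Threadripper,
    one CCX on a Ryzen 7 like the 2700X."""
    return cpu_cores // 2 if game_mode else cpu_cores

def game_cores_used(active, game_scales_to):
    """A game can only load as many cores as it scales to."""
    return min(active, game_scales_to)

# Ryzen 7 2700X: 8 cores. A hypothetical game that scales to 6 cores
# loses two of them with Game Mode enabled:
print(game_cores_used(active_cores(8, game_mode=True), 6))   # 4
print(game_cores_used(active_cores(8, game_mode=False), 6))  # 6
```

On the 2950X, sacrificing cores buys lower memory latency, which many games prefer; on the 2700X there is no latency penalty to remove, so only the downside remains.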
Another decision I find fault with is the choice of 64GB of memory across the board. For a gaming-specific report, that’s strange. The interview painted a picture of disconnect when this topic was brought up, as Catchings said the platforms haven’t been around long enough to gauge the typical RAM amount properly. Personally, if I were producing a report like this one (and I have made ones like it in the past), I’d look at trends. Most gamers don’t shell out $600+ for memory.
Another disconnect was seen throughout the video, when Catchings repeatedly said that his company was trying to “level the playing field”, when in reality, it was doing the opposite in some cases. It could be that the company was trying to be too thorough, over-analyzing the setups, ultimately doing more harm than good.
My guesstimation is that fewer than a tenth of a percent of gamers on a mainstream platform would opt for a memory solution that costs more than double what the processor does. The only exception I could think of for that right now would be those taking advantage of Intel QuickSync for encoding purposes; if you’re a Premiere Pro user, you know well that 64GB of memory is helpful, and with QS currently delivering market-leading encode performance, that’d be one justification for so much memory. But Premiere Pro isn’t a game – 64GB is workstation territory.
It’s worth noting that while some of those performance metrics, such as the Premiere Pro one, seem a little far-fetched, I do believe them. I saw a 94% improvement in Premiere Pro in a certain project when I tested the 2990WX against the i9-7980XE. Most people don’t use PP for straight encodes – the one area where AMD does alright – so the result seems fair enough.
Ultimately, my thoughts haven’t changed on anything about this report even after watching the entire Gamers Nexus interview. As I’ve said before, I respect Principled Technologies for what I’ve known of the company up to now, but I feel like gaming is not the area it should be tackling until it refines its own methodologies. As someone who’s benchmarked gaming GPUs for well over a decade, and with attention to detail that actually annoys me, I know how much nuance there is around proper game testing. Grand Theft Auto V had no graphics details listed in the initial report, but after being questioned by Gamers Nexus, PT did release the full settings used, clarifying that it was aware of settings changing between systems and monitored them accordingly.
Something I take issue with that most haven’t seemed to comment on is the fact that all of this in-depth testing was completed in a single day. The report specifically states: “On October 4, 2018, we finalized the hardware and software configurations we tested. Updates for current and recently released hardware and software appear often, so unavoidably these configurations may not represent the latest versions available when this report appears. We concluded hands-on testing on October 4, 2018.”
That statement raises an alarm for me, if I’m honest – based entirely on the fact that I’ve been professionally benchmarking for over a decade, and I know what kind of time it takes to produce results with a high level of confidence behind them. To test absolutely everything in a single day seems unbelievably rushed. That’s compounded by the fact that there were 16 systems under test: 8 configurations, each with a duplicate to compare against (meaning six runs were performed per game per configuration, three on each system).
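Some back-of-envelope arithmetic puts that single day in perspective. The per-system run counts come from the report; the number of games and the minutes per run below are purely my assumptions for illustration.

```python
# Back-of-envelope math on the one-day testing window.
systems = 16                   # from the report: 8 configs, each duplicated
runs_per_game_per_system = 3   # from the report: three runs per system
assumed_games = 6              # assumption: a handful of titles
assumed_minutes_per_run = 10   # assumption: includes setup and loading

total_runs = systems * runs_per_game_per_system * assumed_games
total_hours = total_runs * assumed_minutes_per_run / 60
print(total_runs, total_hours)  # 288 runs, 48.0 machine-hours
```

Even spread across 16 parallel benches, that’s hundreds of individual runs to start, monitor, and validate inside one working day – before counting the application tests.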
Perhaps the strangest thing about all of this is that Intel didn’t need this report at all. Its internal testing would have been enough, because on paper, we know that the Core i9-9900K is going to become the best gaming CPU available. It’ll have a sweet-spot number of cores and the highest Turbo frequencies we’ve ever seen on a retail product. That’s not just notable, it’s exciting. And yet, this report likely took more away from the launch than what it provided. This entire debacle could have been avoided, and personally, I believe it’s an oversight on Intel’s part to have even run with the reported information.
As someone who’s attended a number of Intel events in the past, I do want to step aside from the issue for a second and commend the company for a great event overall. I personally found that the logistics behind the entire thing were the best I’ve ever seen from the company. Its event coordinators knew what they were doing, and overall, everything went off without a hitch as far as I could tell. I would call it the best-run Intel event I’ve ever been to, if I were to set aside this gaming-results fiasco for a moment.
I think, or at least hope, that there are going to be some good things to come out of this. Companies should now realize more than ever that this kind of behavior and shadiness does not go unnoticed. Intel hasn’t been the only one in people’s crosshairs; AMD, NVIDIA, and others have felt the wrath of eagle-eyed observers before. This time, though, things were pushed just a bit too far, and no one who cares about the space has been able to stay quiet.
I would like to give Bill Catchings some major credit for doing the interview, as well. While many questions were left unanswered, it couldn’t have been easy to sit down with someone who had, that very day, published a scathing video about your company’s methodologies. I still believe that PT’s core principles remain intact; I just don’t think that this is a shining example of the company’s work. I am the first person to admit I am not great at all sorts of testing. Some analysts and testing firms might need to learn that as well.
This is also the kind of thing that can help us as reviewers and benchmarkers make sure that our own methodologies are as rock-solid and faultless as possible. As Catchings rightly says in his interview, there’s absolutely no way to perfect testing. I believe there’s such a thing as coming as close as possible, though. When you don’t actually level the playing field, you’re moving further away from that goal.
Nonetheless, reviews for the i9-9900K are due some time next week, so it’s only a matter of time before we all have a much more informed opinion about Intel’s latest and greatest. I (ironically?) haven’t focused much on gaming on the CPU side for a little while, because my focus has shifted toward workstation testing, but this forthcoming review will have a few gaming tests added to paint at least a modest picture of what’s going on. It goes without saying that Gamers Nexus and others are going to batter this chip with many more games than I will, but I’ll match it with applications. This is one launch that will be well covered.
Copyright © 2005-2019 Techgage Networks Inc. - All Rights Reserved.