Date: June 21, 2011
Author(s): Rob Williams
In a rare move, AMD has publicly denounced a leading industry benchmark, BAPCo’s SYSmark 2012, released mere weeks ago. The company believes both that the benchmark is unfair in its weighting of scores and that it’s irrelevant for the average consumer. We delve deep into these claims, and offer up our own opinions.
Earlier this month, BAPCo announced the latest version of its SYSmark benchmarking suite, 2012. As with previous releases, we’ve been waiting for our copy to arrive so that we can take it for a spin and then decide whether or not it’s suitable for integrating into some of our content, such as our motherboard and processor reviews.
After learning that AMD has pulled out of its BAPCo membership, and has even gone as far as to make a blog post stating the reasons behind it, we’re not so sure at this point that SM2012 stands much of a chance of being used in our testing. We won’t know for sure until we receive our copy and can put it to a personal test, but AMD does raise some important issues.
The sole purpose of a benchmark is to provide useful metrics to consumers in order for them to make well-informed purchasing decisions, and where that’s concerned, AMD’s CMO Nigel Dessau has stated that “AMD does not believe SM2012 achieves this objective.”
While AMD’s press release on the matter doesn’t go into great detail, Nigel explores the reasons in an executive blog post. There, he stresses the need for “open” benchmarks that give consumers the information they need to better understand how one product compares to another, so that in the end they can feel more confident in their purchase. As we at Techgage have shared a similar mindset since our site’s inception, we couldn’t agree more.
So what is it about SYSmark 2012 that has left AMD feeling sour? For as long as we’ve known about SYSmark, AMD has been right there, along with Intel, Dell, HP, Microsoft, Samsung, Seagate and others. Yet, this becomes the first large benchmark that I can recall AMD publicly denouncing. We’ve never seen it happen with a Futuremark benchmark, or a SPEC one. What’s the deal?
AMD’s largest complaint is that SM2012 doesn’t represent the market well enough, employing high-end workloads that the regular consumer doesn’t care about, and some that even favor its leading competitor. Nigel states that, prior to the launch, AMD had tried to get BAPCo to correct course and release an offering that was much more ‘transparent and processor-neutral’, but the attempt proved unsuccessful. During the debacle, BAPCo allegedly threatened the termination of AMD’s membership (BAPCo denies this; full response added to the end of the article).
Before going further, let’s take a look at all of the applications included with SYSmark 2012.
For a benchmark that has the goal of being a PC-wide test suite, the selection here isn’t too bad, though there is an obvious focus on workstation-esque applications. Of all the applications listed here, Autodesk’s 3ds Max is the only one we run as a stand-alone benchmark in our own testing (recent example).
Non-workstation applications include Adobe’s Acrobat, Dreamweaver, Flash Player and Photoshop; Microsoft’s Internet Explorer 8 and Office 2010; Mozilla’s Firefox installer and the browser itself; and also Corel’s WinZip Pro. It’s interesting to note the introduction of ABBYY’s FineReader, an OCR (optical character recognition) application – a piece of software that could hardly be aimed less at the ordinary consumer.
It’s not the choice of applications alone that has AMD concerned, but rather the weight of some of them. In the same blog post, Nigel goes on, “a relatively large proportion of the SM2012 score is based on system performance rated during optical character recognition (OCR) and file compression activities – things an average user will rarely if ever do.”
I might disagree to some extent with the file compression note, since compression as a whole is a big part of computing, even if it doesn’t directly involve using an application such as WinZip. Application and game installers make use of compression and decompression, for example, but that’s of course something difficult to monitor, and an application such as WinZip is much easier to quantify.
Further, AMD claims that while SM2012 includes 18 different applications and 390 separate measurements, a mere 7 applications and 10% of the total number of measurements determine the final score. We’ve contacted BAPCo for its input on this, but didn’t receive a response prior to publishing. Also, since there is no reviewer’s guide or whitepaper available on its website, we’ve been unable to investigate these claims.
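To put AMD’s figures in perspective, here is a quick back-of-the-envelope sketch in Python. It uses only the numbers AMD quoted (18 applications, 390 measurements, 7 applications, 10%); the function name `scoring_footprint` is our own invention, and since no whitepaper is available, nothing here reflects how SM2012 actually computes its score.

```python
# Figures as quoted in AMD's claim about SYSmark 2012 (not independently verified).
TOTAL_APPS = 18
TOTAL_MEASUREMENTS = 390
SCORING_APPS = 7
SCORING_MEASUREMENT_SHARE = 0.10


def scoring_footprint(total_apps, total_measurements, scoring_apps, measurement_share):
    """Return (number of measurements, share of applications) said to feed the score."""
    scoring_measurements = round(total_measurements * measurement_share)
    app_share = scoring_apps / total_apps
    return scoring_measurements, app_share


measurements, app_share = scoring_footprint(
    TOTAL_APPS, TOTAL_MEASUREMENTS, SCORING_APPS, SCORING_MEASUREMENT_SHARE
)
print(f"{measurements} of {TOTAL_MEASUREMENTS} measurements would determine the score")
print(f"{SCORING_APPS} of {TOTAL_APPS} applications ({app_share:.0%}) would contribute")
```

If AMD’s numbers hold, roughly 39 measurements drawn from well under half of the bundled applications would decide the final score.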
Above all, a major beef AMD has with BAPCo is that SM2012 has little concern regarding heterogeneous computing – that is, making use of both the CPU and the GPU. This might be the fairest point AMD makes here, as it’s something backed up by Intel (it has pushed its QuickSync rather hard, after all) and NVIDIA (which is arguably responsible for most people first hearing about GPGPU).
This is an area where I once again feel inclined to agree. While most consumers are still not taking advantage of things like GPU video encoding, the GPU has been used in other areas where most people are affected, such as with video acceleration and UI rendering. At the same time, software that can make use of our GPU (Adobe Photoshop, as an example) continues to grow in numbers. Even our Web browsers can render pages with the help of our GPU.
Given the application list above, we can’t jump to conclusions and state that SM2012 does not make use of the GPU, because some of the applications listed can. We won’t be able to prove it to ourselves until we receive our copy and can monitor the GPU’s usage. On the assumption that AMD wouldn’t have one of its executives lie in public, we’ll take the company at its word for the time being.
When all is said and done, does AMD make a compelling enough argument about SM2012? Should we ignore the prospect of integrating it into our testing? That remains to be seen, but I do admit that I’m inclined to withdraw from using it, unless there’s a specific scenario where using it still makes sense.
The truth of the matter is, we haven’t used SYSmark 2007 Preview a great deal in any of our testing, for the simple fact that we’ve found its methodologies and metrics to be borderline useless. As far back as 2008, we’ve had discussions with both AMD and Intel regarding the issues we foresaw, and while AMD didn’t have much to comment on, Intel has to this day welcomed its use.
Similar to Futuremark’s PCMark suite of tests, SYSmark outputs a singular score where the higher the number, the better. The problem, though, is that in most cases, this number didn’t tell us a thing about the product being looked at. In fact, there were occasions where we’d compare a fast dual-core to a modest quad-core and both would appear equal – or the dual-core would win, due to its faster single-threaded performance. Of course, things aren’t quite so simple, and unless someone has super-specific needs, a quad-core is going to be the preferable choice.
When Intel launched its first Nehalem-based processor, the Core i7-965 Extreme Edition, we expressed concerns to the technical folk there about how it compared in SYSmark to its previous high-end champ, the Core 2 QX9770. From a technical standpoint, the i7-965 was the far superior chip. It brought a much improved architecture, the re-introduction of HyperThreading, a triple-channel memory controller, a faster bus, an L3 cache, and for an added boost, Turbo.
Clock for clock, Nehalem was expected to perform about 10~25% faster than its predecessor in single-threaded tasks, and 20~100% faster if multi-threading was involved. But according to SM2007, Intel’s latest and greatest (at the time) was in fact slower. Note that for these results, Turbo mode was disabled on the i7-965 to make the clock-for-clock comparison more relevant:
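A small Python sketch shows how a single composite number can hide this kind of trade-off. Everything in it is an assumption made for illustration: the per-scenario scores are invented, the scenario names are hypothetical, and the geometric mean is simply a common way to combine benchmark sub-scores, not a formula we can confirm SYSmark uses.

```python
from math import prod


def geometric_mean(values):
    """Combine sub-scores into one composite number (assumed formula, see above)."""
    values = list(values)
    return prod(values) ** (1 / len(values))


# Invented sub-scores relative to a reference system (100 = parity).
# These are NOT measured results; they only illustrate the masking effect.
dual_core = {"office": 125, "web": 125, "media": 80, "rendering": 80}
quad_core = {"office": 100, "web": 100, "media": 100, "rendering": 100}

dual_score = geometric_mean(dual_core.values())
quad_score = geometric_mean(quad_core.values())
print(f"Dual-core composite: {dual_score:.1f}")
print(f"Quad-core composite: {quad_score:.1f}")

# The per-scenario view tells a different story than the composite does.
for scenario in dual_core:
    ratio = quad_core[scenario] / dual_core[scenario]
    print(f"{scenario:>10}: quad-core at {ratio:.2f}x the dual-core")
```

With these made-up inputs the two composites come out level, even though the quad-core is 25% ahead in the multi-threaded scenarios and 20% behind in the lightly threaded ones. A reader looking only at the overall score would never know.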
Table: SYSmark 2007 Preview scores for the Intel Core 2 QX9770 and the Intel Core i7-965
QX9770: Quad-Core, 3.20GHz, 2x6MB L2 Cache, 1600MHz FSB
i7-965: Quad-Core, 3.20GHz, 8MB L3 Cache, 3200MHz QPI (Turbo Off)
Overall, the faster i7-965 scored one point lower in SYSmark’s tests, and if not for the major boost in 3D rendering performance Intel packed into the Nehalem microarchitecture, the overall score would have been lower still. How did this compare to real-world performance at the time? Please note that Turbo was enabled in this case, adding at most 8% to the overall clock in stressful workloads.
Table: real-world benchmark results for the Intel Core 2 QX9770 and the Intel Core i7-965 in Autodesk 3ds Max, Adobe Lightroom 2, TMPGEnc Xpress 4.5, ProShow Gold 3.2 and 3DMark Vantage CPU
QX9770: Quad-Core, 3.20GHz, 2x6MB L2 Cache, 1600MHz FSB
i7-965: Quad-Core, 3.20GHz, 8MB L3 Cache, 3200MHz QPI (Turbo On)
While the SYSmark results above show these two processors as being equals, our real-world testing showed something quite different. The Core i7-965 didn’t just inch past the QX9770; in most cases it obliterated it. Please note that none of these benchmarks were “optimized” for Nehalem in particular. They were the same benchmarks used when the QX9770 first launched.
Is a benchmark, real-world or not, useful if it doesn’t manage to give a realistic perspective of a product’s performance, or a proper comparison of multiple products? We’re inclined to say no. While advanced, SYSmark simplifies things to such a large degree that its results just aren’t that useful or telling.
After reading through Nigel’s blog, I’m having a hard time finding something specific to disagree with. As I’ve not been gung-ho about SYSmark and its methodologies for a while, not to mention the fact that we strive for real-world benchmarks here at Techgage, it’s nice to know someone else in the industry agrees.
It must be pointed out, however, that it’s a bit strange that AMD didn’t voice these opinions with the 2007 Preview launch nearly four years ago, since the exact same arguments could have been used there. From our recollection, AMD has on multiple occasions left and rejoined its post at BAPCo, though for what reasons, I’m unsure.
Intel doesn’t comment on issues regarding its competitors, but it told us that it values companies such as BAPCo and continues to participate with many of them, as it’s important to have an industry-wide collaboration to ensure that meaningful performance benchmarks are being developed. With SM2012, Intel and AMD clearly don’t see eye-to-eye. At the same time, BAPCo is composed of many other leading companies, such as Dell and HP. Are we going to see others come out to agree with AMD, or is AMD going to be the odd man out?
As always, we welcome your input regarding both AMD’s thoughts on the matters discussed here and also your own thoughts on SYSmark in general. Whereas PCMark 7 gives us similar simple overall scores for tests, its suites are a little more realistic for the average user. But does that make it a better benchmark? As evidenced in our content, our opinion is that nothing can beat real-world benchmarks, but we’re not opposed to using additional synthetics for the sake of completeness and interest from our readers – if we find value is indeed added.
June 21 (5:44PM EST) Addendum:
BAPCo has sent out an official statement:
Business Applications Performance Corporation (BAPCo) is a non-profit consortium made up of many of the leaders in the high tech field, including Dell, Hewlett-Packard, Hitachi, Intel, Lenovo, Microsoft, Samsung, Seagate, Sony, Toshiba and ARCintuition. For nearly 20 years BAPCo has provided real world application based benchmarks which are used by organizations worldwide. SYSmark 2012 is the latest release of the premiere application based performance benchmark. Applications used in SYSmark 2012 were selected based on market research and include Microsoft Office, Adobe Creative Suite, Adobe Acrobat, WinZip, Autodesk AutoCAD and 3ds Max, and others.
Advanced Micro Devices (AMD) was, until recently, a long standing member of BAPCo. We welcomed AMD’s full participation in the two year development cycle of SYSmark 2012, AMD’s leadership role in creating the development process that BAPCo uses today and in providing expert resources for developing the workload contents. Each member in BAPCo gets one vote on any proposals made by member companies. AMD voted in support of over 80% of the SYSmark 2012 development milestones, and were supported by BAPCo in 100% of the SYSmark 2012 proposals they put forward to the consortium.
BAPCo also notes for the record that, contrary to the false assertion by AMD, BAPCo never threatened AMD with expulsion from the consortium, despite previous violations of its obligations to BAPCo under the consortium member agreement.
BAPCo is disappointed that a former member of the consortium has chosen once more to violate the confidentiality agreement they signed, in an attempt to dissuade customers from using SYSmark to assess the performance of their systems. BAPCo believes the performance measured in each of the six scenarios in SYSmark 2012, which is based on the research of its membership, fairly reflects the performance that users will see when fully utilizing the included applications.