Date: December 20, 2010
Author(s): Rob Williams
Not wanting to end 2010 without the last word, AMD unveiled its Radeon HD 6900 cards last week. These cards bring a number of interesting features, including a revamped architecture, improved power handling, dual BIOS support, EQAA anti-aliasing and more. With NVIDIA’s GTX 570 just launched, let’s see where AMD stands with its new cards.
Like most others in recent years, 2010 was an interesting one where graphics cards were concerned. AMD dominated the scene in terms of performance/$ to kick things off, and then NVIDIA delivered the most powerful single-GPU solution in the springtime with its GeForce GTX 400 series. Fast-forward to the late fall, and NVIDIA followed up its big launch with GTX 500 cards. The GTX 580 in particular helped reaffirm NVIDIA as the performance leader, so what’s an AMD to do?
Close up 2010 with an interesting launch, of course. I’d give some belated long introduction here, but there’s not much need. AMD’s HD 6900 Radeon series (Cayman) launch hasn’t been the best-kept secret, and there are few who wouldn’t have seen it coming. What matters is whether or not it can help put AMD back on top in terms of raw performance, rather than win on pricing.
Has it? That’s what we’re going to find out here, but even if not, AMD has created some interesting options. The company has stated that the HD 6970 best competes with NVIDIA’s GeForce GTX 570, so right away we can get an idea of what to expect. The HD 6950? Well, that’s a card in a class of its own – literally. NVIDIA doesn’t have an offering at the current time to compete from a pricing standpoint, which isn’t a situation we run into all too often.
The Radeon HD 6900 GPUs were originally destined to be built on a 32nm process, but due to technical problems within AMD’s fabricator, TSMC, it was forced to stick with 40nm. As a result, we’re not going to see incredible performance gains here, but we do see gains nonetheless, along with other enhancements.
Because AMD was forced to stick with 40nm, it decided to focus more on feature and architecture enhancements than on adding more SIMD engines and cores. Much of what was introduced with the HD 6800 is brought back here, but with additional upgrades which we’ll tackle on the following page. Priced at $299 for the HD 6950 and $369 for the HD 6970, AMD’s highest-end single-GPU card could still be considered affordable – at least, much more so than the $500 GeForce GTX 580 from NVIDIA.
|Radeon HD 6970|
|Radeon HD 6950|
|Radeon HD 6870|
|Radeon HD 6850|
|Radeon HD 5970|
1600 x 2
|Radeon HD 5870 Eyefinity 6|
|Radeon HD 5870|
|Radeon HD 5850|
|Radeon HD 5830|
|Radeon HD 5770|
|Radeon HD 5750|
512MB – 1GB
Compared to the HD 6870, the HD 6970 has a lower clock speed, but it increases the core count to 1,536, from 1,120 – a 37% increase. At the same time, GDDR5 clocks have been boosted to 1,350MHz, from 1,050MHz. Overall, nice increases, but it’s all made even better by the fact that AMD has supplied both of its HD 6900 cards with 2GB of memory, up from 1GB. This should be welcomed by those who run both high resolutions and high detail settings. It can be assumed that NVIDIA’s next-gen GPUs will also use 2GB of GDDR.
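For those who like to check the math, here is a minimal Python sketch of the arithmetic behind those increases. It uses only the core counts and memory clocks quoted above; the helper name `percent_increase` is ours, purely for illustration.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


# Figures quoted in the text (HD 6870 -> HD 6970).
cores_gain = percent_increase(1120, 1536)
memory_clock_gain = percent_increase(1050, 1350)

print(f"Core count:   +{cores_gain:.1f}%")
print(f"Memory clock: +{memory_clock_gain:.1f}%")
```

The core count works out to the roughly 37% quoted above, while the memory clock bump comes to a little under 29%.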
Since they have proven sufficient before, we’re once again opting to use AMD’s stock images, as they are spot-on with the actual product. The HD 6970 and HD 6950 are identical to look at aside from the name pasted on. Both weigh about the same, are just as long, feature the same video outputs, and so forth. The differences boil down to the chip and the BIOS.
AMD introduced 5x display output support with the HD 6800 series, and that returns as expected with this series. The included outputs are 2x DVI, 1x HDMI, and 2x mini-DP (DisplayPort). The DisplayPort ports can be used in conjunction with an expander hub to support up to six displays off of the two ports. This is helpful for those looking to take that route, but unlike the “Eyefinity 6” edition we saw with the last generation of cards, the requirement of a hub here isn’t desirable, as they run around $100.
Both cards require a PSU of around 550W, and for the sake of CrossFireX, about 200W could be added on top of that. The HD 6970 requires an 8-pin and 6-pin PCIe power connector, while the HD 6950 can get by with two 6-pin PCIe connectors.
On the next page, we’ll talk a bit about what AMD has brought to the table with its HD 6900 series, and then we’ll get straight into a look at the performance.
As mentioned at the outset, AMD has built its Radeon HD 6900 GPUs on a 40nm process, which is the same one used for the entire HD 5000 series and also the mid-range HD 4770 card. As nice as it would have been for AMD to deliver 32nm releases here, it decided to not mope around the idea and instead pursued ways to improve the architecture further, and also amp up performance where possible.
To sum up the HD 6900 series from a technical standpoint, AMD classifies it by saying, “The time for high performance DirectX 11 is now!”, and it can say this as the performance with DirectX 11 in particular has increased quite a bit since its Cypress launch. As a result, tessellation performance sees a nice increase, although NVIDIA remains ahead in overall performance.
With the HD 6900, AMD’s design goals were to create a more efficient product, one that has massive geometry performance, improved image quality features and of course, is more power efficient. It accomplishes all this in a couple of different ways.
First, a VLIW4 arrangement has been implemented. What’s that stand for, and what’s it mean? It stands for “Very Long Instruction Word” (thanks Charlie), and it refers to the overall organization of the components in the GPU. In short, VLIW4 is more efficient than VLIW5, and was a necessary change thanks to the fact that AMD was working with a similar die size, but still trying to fit in more performance and features.
Here’s a pretty die diagram that we’ve all come to expect from AMD:
Moving to a VLIW4 configuration from VLIW5 in itself doesn’t automatically deliver a performance increase, but it has allowed AMD to make more efficient use of the space it’s given.
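To put some rough numbers on that, the sketch below divides the core counts quoted earlier by the VLIW width to get the number of VLIW units on each card. It assumes the HD 6870 uses the older VLIW5 arrangement. The utilization half is only an illustration: `HYPOTHETICAL_AVG_FILLED` is a made-up average of slots the compiler manages to fill per instruction word, not a figure from AMD or from our testing.

```python
def vliw_units(stream_processors: int, width: int) -> int:
    """Number of VLIW units, given total stream processors and slots per unit."""
    if stream_processors % width:
        raise ValueError("stream processor count must be a multiple of the width")
    return stream_processors // width


def slot_utilization(avg_filled_slots: float, width: int) -> float:
    """Fraction of slots doing useful work for a given average fill per word."""
    return min(avg_filled_slots, width) / width


hd6870_units = vliw_units(1120, 5)  # assumed VLIW5
hd6970_units = vliw_units(1536, 4)  # VLIW4

HYPOTHETICAL_AVG_FILLED = 3.5  # illustrative only, not a measured value

print(f"HD 6870: {hd6870_units} VLIW5 units, "
      f"{slot_utilization(HYPOTHETICAL_AVG_FILLED, 5):.1%} of slots busy")
print(f"HD 6970: {hd6970_units} VLIW4 units, "
      f"{slot_utilization(HYPOTHETICAL_AVG_FILLED, 4):.1%} of slots busy")
```

The point of the toy model is simply that a narrower unit wastes fewer slots when the workload can’t fill a wide one, which is the sense in which VLIW4 makes better use of the die.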
In addition to this change, back-end render engines have also been improved, with up to 4x performance in some calculation processes, and not surprisingly, GPU compute performance has also been tackled. AMD has introduced asynchronous dispatch to allow each compute kernel its own command queue in a protected virtual address domain, which is in essence improved threading, and should prove quite beneficial overall.
One of the bigger changes is the move to dual graphics engines, allowing two primitives to be processed per clock, which is one of the main reasons tessellation performance has been improved, by up to ~3x. You can see this illustrated in this slide:
You might notice that instead of comparing the HD 6970 to the GTX 480 or 580 for tessellation performance, AMD has compared it to the HD 5870 instead. This could be considered proof that NVIDIA’s tessellation performance is still quite a bit ahead, as it should be given the amount of development time NVIDIA has sunk into Fermi’s computational performance.
With the HD 6800 series, AMD introduced a new anti-aliasing mode called Morphological AA, which differed from typical AA modes in that it acts as a post-processor. With the HD 6900 series, the company has once again improved on its anti-aliasing options by offering “EQAA”, a mode that doubles the number of coverage samples from MSAA, as seen here:
We haven’t had the chance to delve deeper into the real improvements to be gained with EQAA, but it’s something we’d like to tackle soon, in addition to Morphological AA, as we haven’t dealt too deeply with that up to this point either.
What might be the most significant change to the architecture of the HD 6900 series is the introduction of “PowerTune”, a power-related technology that ensures the GPU in question doesn’t exceed its set max wattage (TDP) value. In the case of the HD 6970, which features a TDP of 250W, PowerTune keeps that value in check during use, rather than letting the card exceed it, as has become so common with GPUs today.
PowerTune works by employing an integrated control processor to monitor the current (no pun) power draw of the card, and when things are getting a bit too heated, it will decrease the clocks in order to keep the card from going over its TDP. Due to an efficient design, AMD states that this throttling shouldn’t result in a noticeable performance drop, and in our tests, we didn’t observe one.
Because overclockers tend to push TDP levels far past recommended values, AMD includes a feature in its OverDrive utility that will allow you to either decrease the max TDP by 20%, or increase it by 20%. Why would you ever lower it? For HTPC use, and things like that, when you want things to be as low power as possible.
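To make the idea a bit more concrete, here is a rough Python sketch of a power cap with an OverDrive-style ±20% adjustment and a simple throttling loop. This is not AMD’s algorithm, which isn’t detailed here. The 250W TDP and the ±20% range come from the text; the function names, the 10MHz step, the clock limits and the linear power model are all assumptions made up for illustration.

```python
TDP_WATTS = 250.0    # HD 6970 TDP quoted in the text
ADJUST_LIMIT = 0.20  # OverDrive range quoted in the text: -20% to +20%


def power_cap(tdp_watts: float, adjust: float = 0.0) -> float:
    """Return the power cap after an OverDrive-style adjustment."""
    if not -ADJUST_LIMIT <= adjust <= ADJUST_LIMIT:
        raise ValueError("adjustment must be between -0.20 and +0.20")
    return tdp_watts * (1.0 + adjust)


def next_clock(clock_mhz, draw_watts, cap_watts, max_clock_mhz,
               step_mhz=10.0, min_clock_mhz=500.0):
    """One control step: back off when over the cap, otherwise recover."""
    if draw_watts > cap_watts:
        return max(min_clock_mhz, clock_mhz - step_mhz)
    return min(max_clock_mhz, clock_mhz + step_mhz)


def simulate(loads, cap_watts, max_clock_mhz=880.0, watts_per_mhz=0.3):
    """Run the controller over a sequence of load factors (0.0 to 1.0).

    The linear power model (watts_per_mhz * clock * load) is a toy
    assumption chosen only to make the loop observable.
    """
    clock = max_clock_mhz
    history = []
    for load in loads:
        draw = watts_per_mhz * clock * load
        clock = next_clock(clock, draw, cap_watts, max_clock_mhz)
        history.append((draw, clock))
    return history


if __name__ == "__main__":
    for adjust in (-0.20, 0.0, 0.20):
        print(f"{adjust:+.0%} -> cap of {power_cap(TDP_WATTS, adjust):.0f}W")
    for draw, clock in simulate([1.0] * 8, power_cap(TDP_WATTS)):
        print(f"draw {draw:6.1f}W -> next clock {clock:.0f}MHz")
```

With a 250W TDP, the ±20% slider gives a window of 200W to 300W. In the toy simulation, a sustained full load nudges the clock down a step at a time until the modeled draw falls back under the cap, while a light load leaves the clock at its maximum.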
This tech is similar to that introduced by NVIDIA with the GeForce GTX 580. It’s good in theory, but makes it more difficult for reviewers to acquire “top” wattage draw values. Programs like OCCT and Furmark are rendered useless, so we’re in the process of replacing our power-stress test with something that should still deliver good results. We’re evaluating 3DMark 11, as it makes heavy use of multiple things that push a GPU hard.
We’ve been seeing “Dual BIOSes” on select motherboards for some time, because for overclockers, they act as a great failsafe. But overclockers tackle GPUs also, so what’s the deal with the lack of a dual BIOS there? Well, there isn’t a lack any longer. Note a brand-new switch:
The mechanics of this are simple. For normal use, the BIOS can be in the default position, and for flashing or insane overclocks, the secondary BIOS can be used. If something goes awry, you can simply switch to the safe BIOS in order to fix the second.
There are other minor additions to the HD 6900 series, but for the sake of time I can’t tackle them all. The last notable thing I’ll mention is that this series introduces the 5th generation “vapor chamber” GPU cooler design, although it’s not clear what changes have been made. The cooler is rather large as a result of the card being large, however, so overall, we should be able to expect some decent performance there.
We’ll talk a bit about our testing system and methodologies on the following page, and then will get to a look at the performance.
At Techgage, we strive to make sure our results are as accurate as possible. Our testing is rigorous and time-consuming, but we feel the effort is worth it. In an attempt to leave no question unanswered, this page contains not only our testbed specifications, but also a detailed look at how we conduct our testing.
The below table lists our testing machine’s hardware, which remains unchanged throughout all GPU testing, minus the graphics card. Each card used for comparison is also listed here, along with the driver version used. Each one of the URLs in this table can be clicked to view the respective category on our site for that product.
Intel Core i7-975 Extreme Edition – Quad-Core @ 4.05GHz – 1.40v
Gigabyte GA-EX58-EXTREME – F13j BIOS (08/02/2010)
Corsair DOMINATOR – 12GB DDR3-1333 7-7-7-24-1T, 1.60v
|ATI Graphics|| Radeon HD 6970 2GB (Reference) – Catalyst 10.12 Beta|
Radeon HD 6950 2GB (Reference) – Catalyst 10.12 Beta
Radeon HD 6870 1GB (Reference) – Catalyst Oct 5, 2010 Beta
Radeon HD 6850 1GB (Sapphire Toxic) – Catalyst 10.11
Radeon HD 6850 1GB (Reference) – Catalyst Oct 5, 2010 Beta
Radeon HD 5870 1GB (Sapphire) – Catalyst 10.8
Radeon HD 5850 1GB (ASUS) – Catalyst 10.8
Radeon HD 5830 1GB (Reference) – Catalyst 10.8
Radeon HD 5770 1GB (Sapphire FleX) – Catalyst 10.9
Radeon HD 5770 1GB (Reference) – Catalyst 10.8
Radeon HD 5750 1GB (Sapphire) – Catalyst 10.8
|NVIDIA Graphics|| GeForce GTX 480 1536MB (Reference) – GeForce 260.63|
GeForce GTX 470 1280MB (EVGA) – GeForce 260.63
GeForce GTX 460 1GB (EVGA) – GeForce 260.63
GeForce GTX 450 1GB (ASUS) – GeForce 260.63
Gateway XHD3000 30″
When preparing our testbeds for any type of performance testing, we follow these guidelines:
To aid with the goal of keeping results accurate and repeatable, we prevent certain services in Windows 7 from starting up at boot. This is due to the fact that these services have the tendency to start up in the background without notice, potentially causing inaccurate test results. For example, disabling “Windows Search” turns off the OS’ indexing, which can at times utilize the hard drive and memory more than we’d like.
The most important services we disable are:
The full list of Windows services we ensure are disabled is large, but for those interested in perusing it, please look here. Most of the services we disable are mild, but we go to such an extent to have the PC as highly optimized as possible.
At this time, we benchmark with three resolutions that represent three popular monitor sizes available today: 20″ (1680×1050), 24″ (1920×1080) and 30″ (2560×1600). Each of these resolutions offers enough of a variance in raw pixel output to warrant testing with it, and each properly represents a different market segment: mainstream, mid-range and high-end.
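For a sense of how much that raw pixel output varies, here is a quick Python sketch that multiplies out the three resolutions listed above and compares each to the smallest. The names are ours, for illustration only.

```python
RESOLUTIONS = {
    '20" (1680x1050)': (1680, 1050),
    '24" (1920x1080)': (1920, 1080),
    '30" (2560x1600)': (2560, 1600),
}


def pixel_count(width: int, height: int) -> int:
    """Total pixels the GPU has to render per frame."""
    return width * height


baseline = pixel_count(1680, 1050)
for label, (width, height) in RESOLUTIONS.items():
    pixels = pixel_count(width, height)
    print(f"{label}: {pixels:,} pixels ({pixels / baseline:.2f}x the smallest)")
```

The jump from 1680×1050 to 1920×1080 is modest, but 2560×1600 asks the card to push well over twice the pixels of our lowest resolution.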
Because we value results generated by real-world testing, we don’t utilize timedemos. The possible exceptions might be Futuremark’s 3DMark Vantage and Unigine’s Heaven 2.1. Though neither of these is a game, both act as robust timedemos. We choose to use them as they’re a standard where GPU reviews are concerned.
All of our results are captured with the help of Beepa’s FRAPS 3.2.3, while stress-testing and temperature-monitoring are handled by OCCT 3.1.0 and GPU-Z, respectively.
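As a rough idea of how average and minimum FPS figures can be derived from a capture, here is a small Python sketch. It assumes the input is a list of cumulative frame timestamps in milliseconds, which is our assumption about the log layout, and it treats “minimum FPS” as the fewest frames completed in any full one-second window. That is one common definition, not necessarily the exact method FRAPS uses internally. The function name `fps_summary` is hypothetical.

```python
from collections import Counter


def fps_summary(timestamps_ms):
    """Return (average_fps, minimum_fps) for one benchmark run.

    timestamps_ms: cumulative time of each rendered frame in milliseconds,
    in ascending order.
    """
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two frames")
    start = timestamps_ms[0]
    duration_s = (timestamps_ms[-1] - start) / 1000.0
    if duration_s <= 0:
        raise ValueError("timestamps must span a positive duration")

    # Average: frames rendered after the first timestamp, over elapsed time.
    average = (len(timestamps_ms) - 1) / duration_s

    # Minimum: fewest frames completed in any full one-second window.
    per_second = Counter(int((t - start) // 1000) for t in timestamps_ms[1:])
    full_seconds = int(duration_s)
    if full_seconds == 0:
        return average, average
    minimum = min(per_second.get(s, 0) for s in range(full_seconds))
    return average, float(minimum)


# Synthetic run: one second at a 20ms frame time, then one second at 40ms.
run = list(range(0, 1000, 20)) + list(range(1000, 2001, 40))
average, minimum = fps_summary(run)
print(f"average {average:.1f} FPS, minimum {minimum:.0f} FPS")
```

The synthetic run shows why both numbers matter: the average smooths over the slow second, while the minimum exposes it.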
For those interested in the exact settings we use for each game, direct screenshots can be seen below:
It’s not that often that faithful PC gamers get a proper racing game for their platform of choice, but Dirt 2 is one of those. While it is a “console port”, there’s virtually nothing in the game that will make that point stand out. The game as a whole takes good advantage of our PC’s hardware, and it’s as challenging as it is good-looking.
Manual Run-through: The race we chose to use in Dirt 2 is the first one available in the game, as it’s easily accessible and features a lot of GPU-pounding effects that the game has become known for, such as realistic dust and water effects, a large on-looking crowd of people and fine details on and off the track. Each run-through lasts the entire two laps, which comes out to about 2.5 minutes.
I admit that I expected a bit more out of both HD 6900 cards here. The HD 5870 managed to out-pace the HD 6950, and likewise, NVIDIA’s GTX 570 proved about 10% faster than the HD 6970 at 1920×1080. Let’s see if this trend continues.
Just Cause 2 might not belong to a well-established series of games, but with its launch, it looks like that might not be the case for long. The game offers not only superb graphics, but an enormous world to explore, and for people like me, a countless number of hidden items to find around it. During the game, you’ll be scaling skyscrapers, racing through jungles and fighting atop snow-drenched mountains. What’s not to like?
Manual Run-through: The level chosen here is part of the second mission in the game, “Casino Bust”. Our runthrough begins at the second-half of the level, which requires us to situate ourselves on top of a car and have our driver, Karl Blaine, speed us through part of the island to safety. This is a great mission for benchmarking as we get to see a lot of the landmass, even if some of it is at a distance.
Both Dirt 2 and Just Cause 2 tend to favor AMD’s cards over NVIDIA’s, but while AMD fell slightly behind where we expected it to be with Dirt 2, it came ahead in Just Cause 2. Here, rather than NVIDIA’s GTX 570 coming ahead of AMD’s HD 6970, the latter came ahead of NVIDIA’s flagship, the GTX 580.
For fans of the original Mafia game, having to wait an incredible eight years for a sequel must’ve been tough. But as we found out in our review, the wait might be forgotten as the game is quite good. It doesn’t feature nearly as much depth as, say, Grand Theft Auto IV, but it does a masterful job of bringing you back to the 1940s and letting you experience the Mafia lifestyle.
Manual Run-through: Because this game doesn’t allow us to save a game in the middle of a level, we chose to use chapter 7, “In Loving Memory…”, to do our runthrough. That chapter begins us on a street corner with many people around, and from there, we run to our garage, get in our car, and speed out to the street. Our path ultimately leads us to the park, and takes close to two minutes to accomplish.
Being that Mafia II was built with NVIDIA cards in mind, it’s of little surprise to see those cards out-perform AMD’s all-around. In order for AMD to compete, dual GPUs need to be used. Performance is still good all-around, though, from both teams. A single HD 6950 has even managed to out-perform the HD 6850 in CrossFireX.
One of the more popular Internet memes for the past couple of years has been, “Can it run Crysis?”, but as soon as Metro 2033 launched, that’s a meme that should have died. Metro 2033 is without question one of the beefiest games on the market, and though it supports DirectX 11, it’s almost a feature worth ignoring, because the extent you’ll need to go to in order to see playable framerates isn’t likely going to be worth it.
Manual Run-through: The level we use for testing is part of chapter 4, called “Child”, where we must follow a linear path through multiple corridors until we reach our end point, which takes a total of about 90 seconds. Please note that due to the reason mentioned above, we test this game in DX10 mode, as DX11 simply isn’t that realistic from a performance standpoint.
AMD’s offerings strike back here, which again is to be expected when the trend has been that Radeons have excelled in this particular game since its release. In the HD 6970 vs. GTX 580 argument, both cards perform just about the same. That’s no joke… both cards had the same minimum FPS rating, and were 0.019 apart for the averages at 1920×1080. That’s what I call close.
Of all the games we test, it might be this one that needs no introduction. Back in 1998, Blizzard unleashed what was soon to be one of the most successful RTS titles on the planet, and even as of today, the original is still heavily played all around the world – even in actual competitions. StarCraft II of course had a lot of hype to live up to, and it did, thanks to its intense gameplay and superb graphics.
Manual Run-through: The portion of the game we use for testing is part of the Zero Hour mission, which has us holding fort until we’re able to evacuate. Our saved game starts us in the middle of the mission, and from the get-go, we build a couple of buildings and concurrently move our main units up and around the map. Total playtime lasts about two minutes.
StarCraft II also seems to be an NVIDIA-bound game, and that’s evidenced by the results here. But at 2560×1600, AMD’s cards did manage to pull ahead, possibly due to their large 2GB buffer.
Although we generally shun automated gaming benchmarks, we do like to run at least one to see how our GPUs scale when used in a ‘timedemo’-type scenario. Futuremark’s 3DMark 11 is without question the best such test on the market, and it’s a joy to use, and watch. The folks at Futuremark are experts in what they do, and they really know how to push that hardware of yours to its limit.
Similar to a real game, 3DMark 11 offers many configuration options, although many (including us) prefer to stick to the profiles which include Performance, and Extreme. Depending on which one you choose, the graphic options are tweaked accordingly, as well as the resolution. As you’d expect, the better the profile, the more intensive the test. The benchmark doesn’t natively support 2560×1600, so to benchmark with that, we choose the Extreme profile and simply change the resolution.
Although the GTX 580 and GTX 570 either out-performed or matched the performance of the HD 6970 in most of our tests, 3DMark 11 shows that the HD 6970 performs almost as well as the GTX 580. This is quite interesting, since it suggests that AMD’s cards are capable of much more. The potential problem? Drivers that aren’t quite refined enough, perhaps.
While Futuremark is a well-established name where PC benchmarking is concerned, Unigine is just beginning to become exposed to people. The company’s main focus isn’t benchmarks, but rather its cross-platform game engine which it licenses out to other developers, and also its own games, such as a gorgeous post-apocalyptic oil strategy game. The company’s benchmarks are simply a by-product of its game engine.
The biggest reason that the company’s “Heaven” benchmark grew in popularity rather quickly is that both AMD and NVIDIA promoted it for its heavy use of tessellation, a key DirectX 11 feature. Like 3DMark Vantage, the benchmark here is overkill by design, so results here aren’t going to directly correlate with real gameplay. Rather, they showcase which card models can better handle both DX11 and its GPU-bogging features.
In comparing the HD 6900 series to the HD 6800 series, it’s clear that there have been some major improvements made where geometry performance is concerned. Comparing just the GTX 570 and HD 6970, which are priced almost the same, the performance is near-identical.
To test our graphics cards for both temperatures and power consumption, we utilize OCCT for the stress-testing, GPU-Z for the temperature monitoring, and a Kill-a-Watt for power monitoring. The Kill-a-Watt is plugged into its own socket, with only the PC connected to it.
As per our guidelines when benchmarking with Windows, when the room temperature is stable (and reasonable), the test machine is booted up and left to sit at the desktop until things are completely idle. Because we are running such a highly optimized PC, this normally takes one or two minutes. Once things are good to go, the idle wattage is noted, GPU-Z is started up to begin monitoring card temperatures, and OCCT is set up to begin stress-testing.
To push the cards we test to their absolute limit, we use OCCT in full-screen 2560×1600 mode, and allow it to run for 15 minutes, which includes a one minute lull at the start, and a four minute lull at the end. After about 5 minutes, we begin to monitor our Kill-a-Watt to record the max wattage.
Note: Due to power-related changes AMD has made to its HD 6900 series, and NVIDIA to its GTX 500 series, we cannot run OCCT for the sake of stress-testing. As a result, we have opted to use 3DMark Vantage’s Test 2 (space flight) to get some metrics until we’re able to re-test the entire suite with the updated method.
There isn’t much point in discussing the power chart until we’re able to re-benchmark all of our cards going the non-OCCT route, but for thermals, AMD’s latest cards performed quite well. The HD 6950, despite featuring the same cooler as the HD 6970, topped out at 71°C. Not bad at all!
AMD’s Radeon HD 6900 cards are interesting. It’s hard to use a word other than that, because long before we received our samples, I had an idea of what to expect performance-wise, and that doesn’t quite align with what we’ve seen throughout our tests. When AMD compared the HD 6970 to the GTX 570, I had expected to see it far out-perform it, but again, things changed from test to test.
These cards are not super high-end as I expected them to be, and NVIDIA’s GTX 580 for the most part is quite safe being right where it is. At the same time, what AMD did deliver isn’t bad at all, even if the cards didn’t deliver quite what I expected them to.
At $299, the HD 6950 is in the unique position where there is no direct competition, so all in all, it’s kind of a strange release, but one that still has reason to exist. It costs about $60 more than the HD 6870, and delivers a fair speed bump to warrant it. For those who are looking for a higher-end AMD offering, but don’t quite want to shell out $370 for the HD 6970, the HD 6950 deserves some consideration.
The HD 6970 is a little bit harder to conclude on, but for all intents and purposes, it performs about the same as NVIDIA’s GTX 570. The two trade places from test to test, and both cost about the same (AMD’s offering costs $20 more at the time of writing). So which to choose? That depends.
We seem to tackle the same pros and cons with each graphics card review, but it’s because both AMD and NVIDIA do offer unique feature-sets, so they must be taken into consideration. NVIDIA of course has better PhysX support than AMD, and also more attractive GPGPU performance. The same can be said for geometry performance. NVIDIA sank a lot of R&D into making Fermi beat out AMD in this regard, and it has paid off.
On the AMD side, the biggest draw to me is multi-display support. To go multi-display with NVIDIA, you need a minimum of two graphics cards, and given that even mid-range cards today pack in some serious performance, it’s unfortunate to be pushed into the dual-GPU route even if you feel like you don’t need it.
AMD on the other hand allows up to six monitors to be connected to a single card, and that’s hard to ignore if you plan on a 3×1 or larger setup. As we’ve seen before, a ~$300 card can handle the huge resolutions that a 3×1 setup makes possible with no problem at all, so to have the option of using just one card is nice.
Though we didn’t see major advantages of it in our tests, another plus of AMD’s offerings is that they both feature 2GB of GDDR5, which to me is rather significant. We’ve been nearing a time when having a much larger buffer can prove beneficial, so AMD’s cards seem to be better future-proofed in that regard for those gaming at huge resolutions (or using detail settings that require a lot of memory).
The thing that strikes me about both of AMD’s latest cards is that it feels like we’re dealing with early samples, and our 3DMark 11 run kind of backs up that impression. While the HD 6970 couldn’t consistently match the GTX 580 in our actual gaming tests, it came close to it in our 3DMark run. I think the drivers need some time to mature, because even though the HD 6900 series is based on previous architectures, a lot of changes have been made. It seems likely that AMD can improve the performance of these cards quite significantly within the next six months.
Let’s see if that happens, though. For now, both of AMD’s offerings here are interesting and well worth considering. The prices are right, and so are the feature-sets and performance. For those interested in a look at CrossFireX performance, you can expect to see that from us in the days ahead.