AMD Radeon ProRender: GPU, Multi-GPU & CPU+GPU Rendering Performance

by Rob Williams on October 27, 2018 in Graphics & Displays

It’s been some time since we’ve dug deep into the performance of AMD’s Radeon ProRender, so with the latest version now supporting heterogeneous rendering, the time is right to get to business. Included within are performance results for 15 GPU configurations, including multi-GPU, as well as CPU+GPU.

Things might be quiet on the gaming side of Radeon lately, but that’s hardly the case with the workstation side. There, the team seems hellbent not only on making sure the company’s Radeon Pro and ProRender solutions are as good as they can be, but also on making sure people know they exist to begin with, and know what they can do.

I give AMD props for its efforts here. I follow a few Radeon Pro folk on Twitter, and am regularly treated to some stellar examples of what can be done with ProRender. I don’t even need to seek this stuff out – it’s right there for the taking, because the people behind the technologies are so proud of what’s being done.

When I was at SIGGRAPH in August, I made it a point to ask every designer I came into contact with if they had ever used ProRender, and unfortunately, the number of people who told me that they’d never even heard of it was almost disappointing. Clearly, ProRender has become quite refined over the short amount of time it’s existed, and it’s well worth using, but this is a super-hard market to crack. With constant development, and an open-source nature, I do see things improving a lot as time goes on.

AMD Radeon ProRender in 3ds Max 2019

That said, I took an in-depth look at what ProRender could do in 3ds Max back in April. Not even one full month later, the company released a new version of the plugin that delivered notable performance boosts. Not long after that, yet another feature-packed version came out – this one supporting heterogeneous rendering (CPU+GPU). Having tested heterogeneous (what a fun word!) rendering in Chaos Group’s V-Ray, I can say that the gains can sometimes be (legitimately) jaw-dropping. As long as a beefy CPU is the complement to the GPU, at least. And assuming the renderer does its job well.

For this performance look, I’m going to be testing with five Radeon Pro and four Quadro cards for the workstation side, and for the gaming side, AMD has two present, while NVIDIA has three – the third being the brand-new Turing-based RTX 2080 Ti. The goal here isn’t to include only the proper workstation cards, but also the top gaming card available from each vendor to see how things fare on those. With NVIDIA, the TITAN Xp is used in addition to the 1080 Ti because it has additional workstation optimizations not found elsewhere in the GeForce stack.

Here’s an overall view of both companies’ current graphics lineups (the tested GPUs are listed in the test system table further below):

AMD’s Radeon Pro Workstation GPU Lineup
Model     Cores    Base MHz  Peak FP32    Memory     Bandwidth  TDP   Price
SSG       4096     1440      12.3 TFLOPS  16 GB [8]  484 GB/s   260W  $6999
WX 9100   4096     1200      12.3 TFLOPS  16 GB [8]  484 GB/s   230W  $1399
WX 8200   3584     1200      10.8 TFLOPS  8 GB [8]   512 GB/s   230W  $999
Frontier  4096     1382      13.1 TFLOPS  16 GB [4]  484 GB/s   300W  $499
Pro Duo   2304 x2  1243      5.7 TFLOPS   32 GB [3]  448 GB/s   250W  $449
WX 7100   2304     1188      5.73 TFLOPS  8 GB [3]   224 GB/s   130W  $549
WX 5100   1792     713       3.89 TFLOPS  8 GB [3]   160 GB/s   75W   $359
WX 4100   1024     1125      2.46 TFLOPS  4 GB [3]   96 GB/s    50W   $259
WX 3100   512      925       1.25 TFLOPS  4 GB [3]   96 GB/s    50W   $169
WX 2100   512      925       1.25 TFLOPS  2 GB [3]   56 GB/s    50W   $129
Notes: [1] GDDR6; [2] GDDR5X; [3] GDDR5; [4] HBM2; [5] GDDR6 (ECC); [6] GDDR5X (ECC); [7] GDDR5 (ECC); [8] HBM2 (ECC)
Architecture: WX 2100~7100 = Polaris; WX 8200, 9100 & SSG = Vega

AMD’s Radeon Gaming GPU Lineup
Model    Cores  Base MHz  Peak FP32    Memory    Bandwidth  TDP   Price
Vega 64  4096   1247      10.2 TFLOPS  8 GB [4]  484 GB/s   295W  $499
Vega 56  3584   1156      8.28 TFLOPS  8 GB [4]  410 GB/s   210W  $449
RX 580   2304   1257      5.79 TFLOPS  8 GB [3]  256 GB/s   185W  $229
RX 570   2048   1168      4.78 TFLOPS  8 GB [3]  224 GB/s   150W  $179
RX 560   896    1175      1.95 TFLOPS  4 GB [3]  112 GB/s   80W   $119
RX 550   640    1100      1.13 TFLOPS  2 GB [3]  112 GB/s   50W   $99
Notes: [1] GDDR6; [2] GDDR5X; [3] GDDR5; [4] HBM2
Architecture: RX 550~580 = Polaris; RX Vega 56 & 64 = Vega

NVIDIA’s Quadro Workstation GPU Lineup
Model     Cores  Base MHz  Peak FP32    Memory     Bandwidth  TDP    Price
GV100     5120   1200      14.9 TFLOPS  32 GB [8]  870 GB/s   185W   $8,999
RTX 8000  4608   1440      16.3 TFLOPS  48 GB [5]  624 GB/s   ???W   $10,000
RTX 6000  4608   1440      16.3 TFLOPS  24 GB [5]  624 GB/s   295W   $6,300
RTX 5000  3072   1350      11.2 TFLOPS  16 GB [5]  870 GB/s   265W   $2,300
TITAN V   5120   1200      14.9 TFLOPS  12 GB [4]  653 GB/s   250W   $2,999
P6000     3840   1417      11.8 TFLOPS  24 GB [6]  432 GB/s   250W   $4,999
P5000     2560   1607      8.9 TFLOPS   16 GB [6]  288 GB/s   180W   $1,999
P4000     1792   1227      5.3 TFLOPS   8 GB [3]   243 GB/s   105W   $799
P2000     1024   1370      3.0 TFLOPS   5 GB [3]   140 GB/s   75W    $399
Notes: [1] GDDR6; [2] GDDR5X; [3] GDDR5; [4] HBM2; [5] GDDR6 (ECC); [6] GDDR5X (ECC); [7] GDDR5 (ECC); [8] HBM2 (ECC)
Architecture: P = Pascal; V = Volta; RTX = Turing

NVIDIA’s GeForce Gaming GPU Lineup
Model        Cores  Base MHz  Peak FP32    Memory    Bandwidth  TDP   Price
RTX 2080 Ti  4352   1350      13.4 TFLOPS  11GB [1]  616 GB/s   250W  $999
RTX 2080     2944   1515      10.0 TFLOPS  8GB [1]   448 GB/s   215W  $699
RTX 2070     2304   1410      7.4 TFLOPS   8GB [1]   448 GB/s   175W  $499
TITAN Xp     3840   1480      12.1 TFLOPS  12GB [2]  548 GB/s   250W  $1,199
GTX 1080 Ti  3584   1480      11.3 TFLOPS  11GB [2]  484 GB/s   250W  $699
GTX 1080     2560   1607      8.8 TFLOPS   8GB [2]   320 GB/s   180W  $499
GTX 1070 Ti  2432   1607      8.1 TFLOPS   8GB [3]   256 GB/s   180W  $449
GTX 1070     1920   1506      6.4 TFLOPS   8GB [3]   256 GB/s   150W  $379
GTX 1060     1280   1700      4.3 TFLOPS   6GB [3]   192 GB/s   120W  $299
GTX 1050 Ti  768    1392      2.1 TFLOPS   4GB [3]   112 GB/s   75W   $139
GTX 1050     640    1455      1.8 TFLOPS   2GB [3]   112 GB/s   75W   $109
Notes: [1] GDDR6; [2] GDDR5X; [3] GDDR5; [4] HBM2
Architecture: GTX & TITAN = Pascal; RTX = Turing

And here’s the PC used for testing:

Techgage Workstation Test System
Processor: Intel Core i9-7980XE (18-core; 2.6GHz)
Motherboard: ASUS ROG STRIX X299-E GAMING
Memory: HyperX FURY (4x16GB; DDR4-2666 16-18-18)
Graphics:
  • AMD Radeon RX Vega 64 8GB (18.10.1)
  • AMD Radeon RX 580 8GB (Red Devil; 18.10.1)
  • AMD Radeon Pro WX 8200 8GB (18.Q3.1)
  • AMD Radeon Pro WX 7100 8GB (18.Q3.1)
  • AMD Radeon Pro WX 5100 8GB (18.Q3.1)
  • AMD Radeon Pro WX 4100 4GB (18.Q3.1)
  • AMD Radeon Pro WX 3100 4GB (18.Q3.1)
  • NVIDIA GeForce RTX 2080 Ti 11GB (NVIDIA FE; 416.34)
  • NVIDIA TITAN Xp 12GB (416.34)
  • NVIDIA GeForce GTX 1080 Ti 11GB (416.34)
  • NVIDIA Quadro P6000 24GB (416.30)
  • NVIDIA Quadro P5000 16GB (416.30)
  • NVIDIA Quadro P4000 8GB (416.30)
  • NVIDIA Quadro P2000 5GB (416.30)
Audio: Onboard
Storage: Kingston KC1000 960GB M.2 SSD
Power Supply: Corsair 80 Plus Gold AX1200
Chassis: Corsair Carbide 600C Inverted Full-Tower
Cooling: NZXT Kraken X62 AIO Liquid Cooler
Et cetera: Windows 10 Pro build 17763 (1809)
For an in-depth pictorial look at this build, head here.

There’s not too much else to note aside from the fact that the latest versions of both Autodesk’s 3ds Max 2019 and AMD’s Radeon ProRender renderer (2.3.379) were used. Here are (some) notes about our OS and test setup:

  • Windows and installed software are updated (and Windows Update is paused afterwards).
  • All testing is done using a 4K (3840×2160) desktop resolution and 1920×1080 render resolution.
  • G-SYNC, FreeSync, 3D, and screen timeouts are disabled.
  • Windows features disabled: Cortana, Firewall, Defender, Search, Event Log.
  • Most preinstalled Windows Store bloat is uninstalled, as is OneDrive.
  • Ultimate Performance power profile is used.
  • All non-essential startup items are disabled in Task Manager’s “Startup” tab.
  • Windows’ dark mode is enabled (now you know it exists!).
  • EFI configuration is default, as the memory defaults to XMP settings.

With all of that covered, let’s dive into the reason this article exists:

Radeon ProRender Performance Testing

AMD Radeon ProRender - Vehicle Scene
AMD Radeon ProRender GPU Performance

There’s a lot to dissect here, so let’s start from the top.

Clearly, NVIDIA has one hell of a good thing going with its Turing RTX GeForce graphics cards. The 2080 Ti dominates everything else in the chart – even the dual-GPU configuration. Technically, that shouldn’t be the case, and I feel like it wouldn’t be if ProRender used NVIDIA cards a little more efficiently. It could also be that NVIDIA’s own driver lacks ProRender optimizations; I’m not too sure. Either way, with NVIDIA’s placement in this chart, it’s not as though we’re being held back in comparison to the competition.

Even though TITAN Xp is a bit faster than a WX 9100 on the FP32 front, two of AMD’s top cards are likely to outperform two TITAN Xps, since ProRender is (unsurprisingly) optimized on Radeon Pro. The situation isn’t at all “bad” on the NVIDIA side, though. Back in April, dual TITAN Xps performed worse than a single TITAN Xp, so clearly, major strides have been made to improve that situation.

Ultimately, NVIDIA rules the roost here, hogging the top five spots. In sixth place is the Radeon RX Vega 64, which improves upon the WX 8200 a little bit. The new WX 8200 performs almost identically to how a Vega 56 would, and here, it doesn’t really fall too far behind the Vega 64.

For sanity’s sake, none of the Polaris-based WX cards, or the Quadro P2000, should be considered “ideal” for ProRender work. 644 seconds might not seem so bad for the GT-R render, but you must bear in mind that our tests are not done with production-level quality. If you want to get closer to that, you can multiply any one of these results by 25 (for 2500 iterations) to get a better idea of a realistic render time. That’d be ~268 minutes with a WX 7100, ~136 minutes with a Vega 64, or ~52 minutes with a 2080 Ti.
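
If you’d rather not do that multiplication by hand, it can be sketched in a few lines of Python. This is only an illustration: it assumes render time grows linearly with the iteration count, and that the test renders ran at 100 iterations (which is what a 25x factor for 2500 iterations implies). The function name is made up for this example.

```python
def production_minutes(test_seconds, test_iterations=100, target_iterations=2500):
    """Linearly scale a short test render up to a longer one, in minutes.

    Assumes render time is proportional to the iteration count.
    """
    factor = target_iterations / test_iterations  # 25x with the defaults
    return test_seconds * factor / 60


# 644 seconds is the GT-R result quoted above for the WX 7100.
print(f"{production_minutes(644):.0f} minutes")  # 268 minutes
```

Swap in any other result from the chart to get its rough production-quality estimate.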

I mentioned before that ProRender now supports heterogeneous rendering, so I couldn’t tackle performance and pretend it didn’t exist. When I first tested the feature a couple of months ago, I actually saw worse performance than I did when I only rendered using the GPU. Fast-forward to today, and the situation seems to have improved – at least a little bit.

AMD Radeon ProRender - Heterogeneous Rendering Performance

With the CPU introduced into the rendering process, some of the placements change up a little bit. The 2080 Ti drops from the first position to second, surpassed by the dual TITAN Xps. The Quadro P6000 managed to outperform the TITAN Xp, as well – another reversal from the original graph.

Some of the scaling changed after the CPU was added to the mix, and it’s hard to figure out why. Did the dual TITAN Xps somehow become more efficient with heterogeneous rendering? Why did a GPU like the 2080 Ti see vastly reduced performance in the GT-R render?

Here’s a direct comparison between GPU and CPU+GPU:

AMD Radeon ProRender - Heterogeneous Rendering vs GPU Rendering Performance

Here, some of the faults in testing can be seen, such as how the 2080 Ti managed to get worse performance once heterogeneous rendering was engaged. Meanwhile, TITAN Xp performance barely changed, but TITAN Xp x2 performance gained fairly significantly.

On the whole, ProRender doesn’t seem to love NVIDIA GPUs, but that’s not necessarily a fault of the renderer. It could be that NVIDIA itself could optimize performance in the driver, and I’m not sure the company is even concerned with ProRender performance right now. Not that the company has much to fuss over, since its GPUs are dominating these charts overall.

As we look at the chart from the bottom up, it becomes clear that the beefier the GPU, the smaller the advantage adding a CPU will give. That’s not the case with all renderers, and changed performance will of course depend on the choice of CPU. Our 18-core might exhibit gains in areas where an 8700K wouldn’t.
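
One way to see why a faster GPU gains less from the CPU is a back-of-the-envelope model. The sketch below assumes the best case, where CPU and GPU throughput simply add together with no overhead. That is my assumption, not a description of how ProRender actually schedules work, and the rates are hypothetical numbers in arbitrary units.

```python
def hetero_speedup(gpu_rate, cpu_rate):
    """Speedup of CPU+GPU over GPU alone if throughput were purely additive."""
    return (gpu_rate + cpu_rate) / gpu_rate


cpu = 1.0  # hypothetical CPU throughput (arbitrary units)
for gpu in (0.5, 2.0, 4.0):  # hypothetical GPUs, from slow to fast
    print(f"GPU rate {gpu}: {hetero_speedup(gpu, cpu):.2f}x")
```

In this idealized model the benefit shrinks as the GPU gets faster, but it can never drop below 1x. Since the 2080 Ti actually got slower with the CPU enabled, there is clearly some overhead in play that a simple additive model doesn’t capture.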

Conveniently, I did decide to test the 8700K with an RTX 2070, since that PC was close by.

GeForce RTX 2070 Render: 2m 52s
GeForce RTX 2070 + i7-8700K Render: 3m 25s

In this test, the RTX 2070 managed to render the project in 2m 52s, whereas the heterogeneous render took 3m 25s, or about 19% longer. Clearly, this kind of rendering is not optimized for NVIDIA, but it is for AMD. Again, I am not sure why it’s so poor for NVIDIA, but it’s likely to improve in the future.
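
For anyone who wants to check that percentage, or run the same comparison on their own render times, here is a small sketch. The helper names are made up for illustration; the two times are the ones quoted above.

```python
def to_seconds(minutes, seconds):
    """Convert a 'Xm Ys' render time to seconds."""
    return minutes * 60 + seconds


def percent_longer(baseline_s, other_s):
    """How much longer other_s is than baseline_s, as a percentage."""
    return (other_s - baseline_s) / baseline_s * 100


gpu_only = to_seconds(2, 52)      # 172 seconds
cpu_plus_gpu = to_seconds(3, 25)  # 205 seconds
print(f"{percent_longer(gpu_only, cpu_plus_gpu):.1f}% longer")  # 19.2% longer
```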

I hate to bring another renderer into this discussion, but to highlight the differences of performance scaling, I must. Here are the results from testing the same feature with V-Ray earlier this year (with the beta version, no less):

Chaos Group V-Ray 4 Heterogeneous Rendering Scaling

In the same system, the WX 3100 was 219% faster with the CPU+GPU render than it was with the GPU render. In V-Ray, the P2000 proved 448% faster with CPU+GPU over GPU only. Could this be the difference between a renderer which was developed from the start with CPU support, and one built from the start with only GPU support? Probably, but that’s not to discredit AMD’s efforts. I’ve already seen major improvement over the past six months, and I see nothing to suggest that things won’t keep improving.

Final Thoughts

When I sat down to figure out how I wanted to test ProRender in an updated test suite, I figured I’d give Autodesk’s Maya some love since I use 3ds Max in so many other places. I am not experienced with Max, but I am even less experienced with Maya, and unfortunately, headache after headache prevented me from testing with that.

For some reason, ProRender in Maya does not save most render settings, which means that I can’t very well automate the benchmarking unless I want to use the (very poor) default settings. I am also unable to render from the command line and have the render time print out on the outputted image, and when Maya takes many seconds to load, you can’t simply blanket record the time the entire process takes.

ProRender isn’t identical across all design suites, but it’s definitely familiar from one to another. 3ds Max has no problem saving ProRender settings inside the project file itself. In Maya, you must manually export the settings in order to import them again later. It’s clunky.

AMD Radeon Pro WX 8200

I didn’t decide to rant a little bit about this hassle because I spent more than a dozen hours on it, but because with all of the praise I’ve heaped on ProRender since it came out, it should be clear that it still has some niggles. With heterogeneous rendering, NVIDIA cards don’t see much benefit (despite the opposite being true in V-Ray, or even with the Radeon cards), and then there are odd issues like your settings not saving.

The focus of this article has revolved entirely around performance, and since this was a performance look, that’s probably fine. However, there’s a lot more to ProRender than just its performance. It’s worth checking out because it’s free, open-source, and physically based. This OpenCL renderer isn’t just for Windows, either, but also macOS and Linux. Currently supported suites include 3ds Max, Maya, SolidWorks, and even Blender. With Blender and ProRender, you’d have a 100% free and open-source PBR solution. That’s pretty awesome.

As I’ve said a few times, ProRender continues to get better over time. This is the third dedicated look Techgage has taken at ProRender this year, and with the constant improvements being made, it seems likely that another performance look is only a few months off.

Rob Williams

Rob founded Techgage in 2005 to be an 'Advocate of the consumer', focusing on fair reviews and keeping people apprised of news in the tech world. Catering to both enthusiasts and businesses alike; from desktop gaming to professional workstations, and all the supporting software.
