Blender 3.6 Performance Deep-dive: GPU Rendering & Viewport Performance

by Rob Williams on July 26, 2023 in Graphics & Displays

Blender 3.6 has recently launched, and from a performance perspective, it’s one of the biggest releases in a while. Both AMD and Intel gain ray tracing acceleration to render your scene faster, and other subtle improvements have been made as well, including the speed of viewport shader compiling. Let’s dig in!

Blender’s long-awaited 3.6 version has recently launched, and while this article is coming later than we planned (it’s been a very busy summer), we’ve been ridiculously eager to talk about some of the performance changes that have come along with it.

As with all new Blender versions, 3.6 introduces a wide range of new features and added polish, but above all, the feature that stands out to us most is ray tracing acceleration for AMD and Intel graphics cards.

When we talk of “ray tracing acceleration”, we’re talking about putting a GPU’s dedicated ray tracing hardware to work inside Cycles, so that a given GPU architecture can render samples faster. With NVIDIA, we saw OptiX added to Blender about four years ago, and from the get-go, we could see that the benefits dedicated RT cores could bring were immense. So, it’s good to finally see HIP-RT making it to Blender for AMD, and Embree GPU for Intel.

Blender 3.6 - Pet Projects Scene

We’re going to keep this intro short and sweet so that we can get to the performance results quicker, but for everything that’s new in Blender 3.6, definitely check out the official release page.

AMD, Intel & NVIDIA GPU Lineups, Our Test Methodologies

Before diving into our performance results, here’s a quick look at AMD’s, Intel’s, and NVIDIA’s current product lineups, as well as some test methodology housekeeping:

AMD Radeon Lineup (As Of RX 7600 Launch)
Intel Arc Lineup (As Of A770 Launch)
NVIDIA GeForce Lineup (As Of RTX 4060 Launch)

These lineup shots include both current- and last-gen GPUs from AMD and NVIDIA, but to keep the charts less busy, we’re focusing only on current products for this article. NVIDIA’s Ada Lovelace generation appears complete until SUPER or Ti cards come along. AMD’s current lineup still includes previous-gen cards, as most models have yet to be superseded, which is why several Radeon RX 6000-series cards appear in our testing.

Techgage Intel Core i9-13900K Workstation PC
Techgage Creator GPU Testing PC

Processor         Intel Core i9-13900K (3.0GHz, 24C/32T)
Motherboard       ASUS ROG STRIX Z690-E GAMING WIFI
                  CPUs tested with 2305 BIOS (March 10, 2023)
Memory            G.SKILL Trident Z5 RGB (F5-6000J3040F16G) 16GB x2
                  XMP-enabled w/ freq. set to DDR5-6000 (30-40-40-96, 1.35V)
AMD Graphics      AMD Radeon RX 7900 XTX (24GB; Adrenalin 23.7.1)
                  AMD Radeon RX 7900 XT (20GB; Adrenalin 23.7.1)
                  AMD Radeon RX 7600 (8GB; Adrenalin 23.7.1)
                  AMD Radeon RX 6950 XT (16GB; Adrenalin 23.7.1)
                  AMD Radeon RX 6800 XT (16GB; Adrenalin 23.7.1)
                  AMD Radeon RX 6800 (16GB; Adrenalin 23.7.1)
                  AMD Radeon RX 6750 XT (12GB; Adrenalin 23.7.1)
                  AMD Radeon RX 6500 XT (4GB; Adrenalin 23.7.1)
Intel Graphics    Intel Arc A770 (16GB; Arc 31.0.101.4514)
                  Intel Arc A750 (8GB; Arc 31.0.101.4514)
                  Intel Arc A380 (6GB; Arc 31.0.101.4514)
NVIDIA Graphics   NVIDIA GeForce RTX 4090 (24GB; Studio 536.40)
                  NVIDIA GeForce RTX 4080 (16GB; Studio 536.40)
                  NVIDIA GeForce RTX 4070 Ti (12GB; Studio 536.40)
                  NVIDIA GeForce RTX 4070 (12GB; Studio 536.40)
                  NVIDIA GeForce RTX 4060 Ti (8GB; Studio 536.40)
                  NVIDIA GeForce RTX 4060 (8GB; Studio 536.40)
Storage           WD Blue 3D NAND 1TB x3 (SATA 6Gbps)
Power Supply      Corsair RM1000x (1000W)
Chassis           Corsair 4000X Mid-tower
Cooling           Corsair H150i ELITE CAPELLIX (360mm)
Et cetera         Windows 11 Pro 22H2, Build 22621
                  Intel Chipset Driver: 10.1.19199.8340
                  Intel ME Driver: 2306.4.10.0
All product links in this table are affiliated, and help support our work.

All of the benchmarking conducted for this article was completed using an up-to-date Windows 11 (22H2), the latest Intel chipset driver, as well as the latest (as of the time of testing) graphics driver.

Here are some general guidelines we follow:

  • Disruptive services are disabled (e.g. Search, Cortana, User Account Control, Defender).
  • Overlays and / or other extras are not installed with the graphics driver.
  • Vsync is disabled at the driver level.
  • OSes are never transplanted from one machine to another.
  • We validate system configurations before kicking off any test run.
  • Testing doesn’t begin until the PC is idle (keeps a steady minimum wattage).
  • All tests are repeated until there is a high degree of confidence in the results.

Note that all of the rendering projects tested for this article can be downloaded straight from Blender’s own website. Default values for each project have been left alone, so you can set your render device, hit F12, and compare your render time in a given project to ours.
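If you’d rather script that comparison than press F12 by hand, here is a minimal Python sketch that builds a background-render command around Blender’s documented `-b`, `-f`, and `--cycles-device` command-line options and times it. This is not how our own numbers were gathered. The helper names and the `classroom.blend` file name are our own placeholders, and the wall-clock figure includes Blender startup and scene loading, so expect it to be a bit higher than the render time Blender itself reports.

```python
import subprocess
import time

# Device names Cycles accepts after the "--" separator on Blender's command line.
CYCLES_DEVICES = {"CPU", "CUDA", "OPTIX", "HIP", "ONEAPI", "METAL"}


def build_render_command(blend_file, device, frame=1, blender="blender"):
    """Return the argv list for a background render of one frame on `device`."""
    device = device.upper()
    if device not in CYCLES_DEVICES:
        raise ValueError(f"unknown Cycles device: {device}")
    # -b: run without the UI; -f: render a single frame (the image is written
    # to the output path saved in the project); arguments after "--" go to Cycles.
    return [blender, "-b", blend_file, "-f", str(frame),
            "--", "--cycles-device", device]


def time_render(blend_file, device, frame=1, blender="blender"):
    """Run the render and return wall-clock seconds (startup and load included)."""
    cmd = build_render_command(blend_file, device, frame, blender)
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


# Example (hypothetical file name; requires Blender on your PATH):
#   seconds = time_render("classroom.blend", "OPTIX")
#   print(f"{seconds:.1f} s")
```

Swapping `"OPTIX"` for `"HIP"` or `"ONEAPI"` gives a rough like-for-like comparison across vendors, as long as the project itself is left at its defaults.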

Viewport Shader Compile Times

In our Blender 3.5 performance deep-dive, we explored the performance of shader compiling when enabling the Material Preview mode in the viewport. The raw performance of a GPU doesn’t matter too much with this task. Instead, driver and software optimizations do.

What we found in our previous testing was that the entire shader cache wasn’t retained across reboots, so enabling Material Preview after a restart would take longer than expected. Well, this is a bug that Intel fixed, and as a bonus, the fix benefits all vendors. Let’s see the latest performance:

Controller                First    Repeat   After Reboot
AMD RX 7900 XTX (3.5)     9 s      4 s      9 s
AMD RX 7900 XTX (3.6)     9 s      4 s      4 s
Intel Arc A770 (3.5)      23 s     2 s      20 s
Intel Arc A770 (3.6)      23 s     2 s      2 s
NVIDIA RTX 4090 (3.5)     21 s     2 s      14 s
NVIDIA RTX 4090 (3.6)     21 s     2 s      2 s

Barbershop                First    Repeat   After Reboot
AMD RX 7900 XTX (3.5)     54 s     24 s     55 s
AMD RX 7900 XTX (3.6)     54 s     24 s     24 s
Intel Arc A770 (3.5)      234 s    12 s     186 s
Intel Arc A770 (3.6)      230 s    12 s     11 s
NVIDIA RTX 4090 (3.5)     137 s    3 s      101 s
NVIDIA RTX 4090 (3.6)     137 s    3 s      3 s

Classroom                 First    Repeat   After Reboot
AMD RX 7900 XTX (3.5)     21 s     7 s      21 s
AMD RX 7900 XTX (3.6)     21 s     7 s      7 s
Intel Arc A770 (3.5)      64 s     4 s      59 s
Intel Arc A770 (3.6)      64 s     4 s      4 s
NVIDIA RTX 4090 (3.5)     43 s     2 s      31 s
NVIDIA RTX 4090 (3.6)     44 s     2 s      2 s

It’s great to see that post-reboot shader compiles are now effectively just as fast as they would be if you had simply reopened Blender.
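To put a number on that, the short Python sketch below takes the times from the table above and divides each 3.5 “After Reboot” result by its 3.6 counterpart. The dictionary layout and the `reboot_speedup` helper are just our own illustration; the figures are the ones already listed.

```python
# (first, repeat, after_reboot) times in seconds, copied from the table above.
COMPILE_TIMES = {
    "Controller": {
        "AMD RX 7900 XTX": {"3.5": (9, 4, 9), "3.6": (9, 4, 4)},
        "Intel Arc A770": {"3.5": (23, 2, 20), "3.6": (23, 2, 2)},
        "NVIDIA RTX 4090": {"3.5": (21, 2, 14), "3.6": (21, 2, 2)},
    },
    "Barbershop": {
        "AMD RX 7900 XTX": {"3.5": (54, 24, 55), "3.6": (54, 24, 24)},
        "Intel Arc A770": {"3.5": (234, 12, 186), "3.6": (230, 12, 11)},
        "NVIDIA RTX 4090": {"3.5": (137, 3, 101), "3.6": (137, 3, 3)},
    },
    "Classroom": {
        "AMD RX 7900 XTX": {"3.5": (21, 7, 21), "3.6": (21, 7, 7)},
        "Intel Arc A770": {"3.5": (64, 4, 59), "3.6": (64, 4, 4)},
        "NVIDIA RTX 4090": {"3.5": (43, 2, 31), "3.6": (44, 2, 2)},
    },
}


def reboot_speedup(project, gpu):
    """How many times faster the post-reboot compile is in 3.6 than in 3.5."""
    old = COMPILE_TIMES[project][gpu]["3.5"][2]
    new = COMPILE_TIMES[project][gpu]["3.6"][2]
    return old / new


for project, gpus in COMPILE_TIMES.items():
    for gpu in gpus:
        print(f"{project:<11} {gpu:<16} {reboot_speedup(project, gpu):5.1f}x")
```

The biggest swing is Barbershop on the RTX 4090, where the post-reboot compile drops from 101 s to 3 s, or roughly 34x faster.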

This is one of those tests where the performance from a given vendor is going to depend a bit on the project. Overall, though, NVIDIA proves fastest with repeated shader compiles, while initial compiles are quickest with AMD GPUs. Intel’s initial compiles take longer than the other vendors’, but its repeated compiles are super quick.

Cycles GPU: AMD HIP, Intel oneAPI & NVIDIA OptiX

As mentioned above, Blender 3.6 introduces ray tracing acceleration for both AMD Radeon and Intel Arc cards. AMD gains it via HIP-RT, and Intel via Embree. Embree isn’t new to Blender, but the GPU support is.
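For those who script their render setups, here is a hedged Python sketch of how the new options could be switched on. It assumes the Cycles add-on preferences expose the HIP-RT and Embree GPU checkboxes as `use_hiprt` and `use_oneapirt`; we believe those are the property names in 3.6, but treat them as an assumption and verify against your own build. The `enable_gpu_ray_tracing` helper and `RT_TOGGLES` table are our own names, and the stand-in object at the bottom exists only so the snippet runs outside Blender.

```python
from types import SimpleNamespace

# Backend -> name of the preference that enables hardware ray tracing.
# The property names are assumptions (see above). OptiX has no separate
# toggle here, since the backend itself is the RT-accelerated path on NVIDIA.
RT_TOGGLES = {"HIP": "use_hiprt", "ONEAPI": "use_oneapirt", "OPTIX": None}


def enable_gpu_ray_tracing(cycles_prefs, backend):
    """Select a Cycles GPU backend and turn on its RT option, if it has one.

    Inside Blender, `cycles_prefs` would be
    bpy.context.preferences.addons["cycles"].preferences.
    Returns the name of the toggle that was set, or None.
    """
    backend = backend.upper()
    if backend not in RT_TOGGLES:
        raise ValueError(f"unsupported backend: {backend}")
    cycles_prefs.compute_device_type = backend
    toggle = RT_TOGGLES[backend]
    if toggle is not None:
        setattr(cycles_prefs, toggle, True)
    return toggle


# Dry run outside Blender, using a stand-in for the real preferences object.
fake_prefs = SimpleNamespace(
    compute_device_type="NONE", use_hiprt=False, use_oneapirt=False
)
enabled = enable_gpu_ray_tracing(fake_prefs, "hip")
print(fake_prefs.compute_device_type, enabled)
```

Inside Blender you would still need to enable the individual devices in the preferences and set the scene’s Cycles device to GPU; the sketch only covers the backend and RT switch.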

In all of our testing, Intel’s RT implementation is more stable than AMD’s. It feels stable enough to rely on Arc right now, but since some projects failed to render with HIP-RT, we can’t say the same about AMD at the moment. That said, we invite users to provide us feedback so we can better gauge this.

The graphs below show the performance difference between having RT on and off. We’re leaving the Secret Deer project for last, as it’s one that failed to render with HIP-RT. Where RT does work, the performance upticks are downright notable:

Blender - Cycles GPU Rendering Performance (Scanlands)
Blender - Cycles GPU Rendering Performance (White Lands)
Blender - Cycles GPU Rendering Performance (Secret Deer)

Not every project will magically show a massive rendering speed gain from RT acceleration, but when the stars align, the improvements are incredible. Even the lowbie AMD Radeon RX 6500 XT and Intel Arc A380 saw dramatic performance improvements in the Secret Deer and Scanlands projects.

Considering the fact that Blender gained RT acceleration for NVIDIA four years ago, it is really great to finally see the feature added for other vendors in Blender 3.6. We believe Intel’s take to be stable enough for real use, but for AMD, it might be best to wait for Blender 4.0 (due November).

Eevee GPU: AMD, Intel & NVIDIA

Before tackling Eevee performance, we should note that the RT acceleration noted above does not impact Eevee. That means each vendor has a unique opportunity to impress, since raw performance and driver optimizations will matter a lot. Let’s check it out:

Blender - Eevee GPU Rendering Performance (Red Autumn Forest)
Blender - Eevee GPU Rendering Performance (Splash Fox)
Blender - Eevee GPU Rendering Performance (Charge)

With Eevee, NVIDIA still proves the fastest overall, but depending on the project, AMD can perform really well, too. Unfortunately, the same can’t be said for Intel right now. We badly need either Blender or driver optimizations to lift the Arc cards from the bottom of these charts.

Viewport: Material Preview, Solid & Wireframe

Viewport performance is unchanged between Blender 3.5 and 3.6, so most of this data is carried over from the previous deep-dive. GPUs released since then have been added in, however:

Blender - 2160p Material Preview Performance (Controller)
Blender - 1080p Material Preview Performance (Controller)
Blender - 2160p Material Preview Performance (Barbershop)
Blender - 1080p Material Preview Performance (Barbershop)
Blender - 2160p Material Preview Performance (PartyTug)
Blender - 1080p Material Preview Performance (PartyTug)
Blender - 2160p Material Preview Performance (Charge)
Blender - 1080p Material Preview Performance (Charge)

As with most scalable tests, it’s hard to predict which vendor will come out on top, and that’s pretty much the case for viewport performance. While NVIDIA’s GeForce RTX 4090 is hard to topple, AMD beats it out on some occasions with its top-end 7900 XTX.

Viewport performance scales pretty well up and down any given GPU lineup, but for the smoothest possible performance, you’d want to aim for at least a mid-range card. Unfortunately, viewport performance is another area Intel needs to work on. Material Preview performance is okay, but the Solid and Wireframe modes are in need of optimization:

Blender - 2160p Solid Performance (Controller)
Blender - 2160p Wireframe Performance (Controller)

For vendors other than Intel, it’s not difficult at all to get great Wireframe and Solid viewport performance.

Wrapping Up

After poring over all of the data here, Blender 3.6 strikes us as one of the biggest updates to drop in a while, and that’s not to downplay how major each release feels. The addition of ray tracing acceleration for AMD and Intel can dramatically speed up render times, so we’re glad to see the feature implemented.

Also nice is the fix that was implemented for Material Preview shader compiles after reboots. With each new Blender release, more and more polish results in continual performance improvements.

Blender - Pet Project Scene

In November, Blender 4.0 is going to launch, and it’s set to bring an impressive number of improvements, including to performance. We’re still months out, so we’re not sure what all will make it in, but like always, we look forward to digging in!

Rob Williams

Rob founded Techgage in 2005 to be an 'Advocate of the consumer', focusing on fair reviews and keeping people apprised of news in the tech world. Catering to both enthusiasts and businesses alike; from desktop gaming to professional workstations, and all the supporting software.
