Updated performance with 2.80 final found here.
Tile sizes are not used in Blender’s upcoming Eevee renderer, but they remain as important as ever for Cycles. Many users will already know which values they should be going with, but to see the true effects of using different tile sizes across three different rendering modes, we decided to generate some numbers.
The tile size you choose simply dictates how big the rendering region is. On a small image, a big tile size will cover most of the image, whereas a smaller tile size will look like Cinebench’s tile rendering. It’s long been said that for CPU, 32×32 should be used, while 256×256 is fine for GPUs. Recently, someone at one of the GPU vendors told me that 512×512 is safe to use for modern GPUs, and lo and behold, there was actually an improvement to be seen over 256×256.
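To make the relationship between tile size and image size concrete, here is a minimal sketch that counts how many square tiles are needed to cover a frame. The 1920×1080 resolution is an assumption chosen for illustration, not the resolution of our test renders, and `tile_count` is a hypothetical helper rather than anything from Blender itself.

```python
import math


def tile_count(width, height, tile_size):
    """Return how many square tiles are needed to cover a width x height image.

    Partial tiles along the right and bottom edges still count as tiles,
    hence the ceiling on each axis.
    """
    if min(width, height, tile_size) <= 0:
        raise ValueError("width, height and tile_size must all be positive")
    return math.ceil(width / tile_size) * math.ceil(height / tile_size)


# Assumed 1080p frame, purely for illustration.
for size in (16, 32, 256, 512):
    print(f"{size}x{size}: {tile_count(1920, 1080, size)} tiles")
```

The smaller the tile, the more tiles there are to hand out, which is why small tiles suit a many-threaded CPU while a GPU is happier chewing through a few large ones.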
For CPUs, you never want to go above 32×32, as this table highlights. While you’ll be safe up until 128×128, you’re already losing performance; going higher is just asking for pain (which is why there are some dashes instead of numbers).
Blender 2.80 can take proper advantage of heterogeneous rendering, meaning that both the CPU and GPU can jump in on the action to get the job done quicker, which is well evidenced in the table above. Tile sizes for hybrid rendering should be treated the same as CPU tile sizes, so 16×16 or 32×32. Otherwise, the CPU will choke, while the GPU will be trying to work with it.
Tile size is fortunately something you don’t need to fuss over much: you just need to choose the right value for your chosen rendering device, and move on. I should note that only these three projects (BMW, Classroom, Pavilion) from the Blender demo files site would render without issue in all three modes.
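The guidance above boils down to a simple lookup, sketched below. The mapping and both helper names are hypothetical. We believe the `tile_x` and `tile_y` fields mirror the render settings found at `bpy.context.scene.render` in Blender 2.80, but treat that as an assumption and check it against your own build. A stand-in object is used so the sketch runs outside of Blender.

```python
from types import SimpleNamespace

# Tile sizes suggested in the text above. 512 also tested faster than 256
# on modern GPUs, so the GPU entry is a conservative choice.
RECOMMENDED_TILE = {"CPU": 32, "GPU": 256, "HYBRID": 32}


def recommended_tile_size(device):
    """Look up the suggested square tile size for a rendering device."""
    key = device.upper()
    if key not in RECOMMENDED_TILE:
        raise ValueError(f"unknown device: {device!r}")
    return RECOMMENDED_TILE[key]


def apply_tile_size(render_settings, device):
    """Set tile_x and tile_y on any object that exposes those attributes."""
    size = recommended_tile_size(device)
    render_settings.tile_x = size
    render_settings.tile_y = size
    return size


# Stand-in for Blender's render settings (assumption: inside Blender 2.80
# you would pass bpy.context.scene.render instead).
fake_render = SimpleNamespace(tile_x=64, tile_y=64)
apply_tile_size(fake_render, "hybrid")
print(fake_render.tile_x, fake_render.tile_y)
```

Hybrid rendering maps to the CPU value on purpose, since the CPU is the device that chokes on large tiles.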
GPU Rendering Performance
The BMW Blender project is almost iconic at this point. It’s not a complex scene, but it works great as a quick benchmark to see how different CPUs and GPUs scale. It simply renders quicker than the more complex Classroom and Pavilion scenes, which take even better advantage of ray tracing.
We can’t imagine that many people are going to be using multiple GPUs for Blender, but fortunately, if you were to, the Cycles renderer would see a major boost in performance. At the bottom end of the chart, some of the results are downright painful. Other GPUs down there really don’t cost too much, so a card like the Quadro P2000 or RX 570 should be considered bare minimum.
Consider the fact that all of these renders are single-frame, and it’s not even a high-resolution frame. The more detailed a scene, and the higher the resolution, the longer it’ll take to render. That fact can become painful when we’re talking about 4K resolution, and especially animation.
The Classroom project doesn’t change the scaling much, although some GPUs fare better in the BMW project than in this one, which is thanks to the fact that there’s a lot more going on in this scene. The multi-GPU config continues to perform amazingly well, while the bottom bunch of GPUs should almost be outright avoided for this type of work.
Whereas the P2000 seemed like a decent enough cut-off point with the BMW scene, its performance here doesn’t inspire a ton of confidence. Here, the RX 580 and cards around it seem to offer the best value. Faster cards will of course continue to shave time off, but it’ll be up to you to decide the best use of your hard-earned money.
The Pavilion scene, like the Classroom one, is very complex compared to the BMW project. That helps us get slightly different pictures of GPUs sometimes, since not all projects render the same way. Fortunately for NVIDIA, the company’s Turing-based GeForces rule the roost, though the last-gen TITAN Xp gets some props for placing so high in every single test as well.
It’s important to note that the CPU in Cycles counts for a lot, as you’d expect given it originated as a CPU renderer. We’re going to be taking a look at heterogeneous rendering performance shortly, because if your project can take proper advantage, it can change your perspective. But first, a quick look at Eevee:
Eevee is the “Extra Easy Virtual Environment Engine”, but also the name of a Pokémon, which makes Google searching for Blender-specific queries a little more complicated. While Eevee is designed from the ground up with the GPU in mind, it doesn’t currently support multiple GPUs, which is why the dual TITAN Xp entry is missing.
While it doesn’t take advantage of multiple GPUs, Eevee is a crazy fast renderer. So fast, in fact, that in order to generate some scaling, we had to boost the sample count on the chosen project. You don’t need 1,000 samples, but that count will probably serve well enough for a final render.
Eevee’s claim to fame is that it’s going to greatly accelerate animation rendering, which is one of the reasons it was built to be mind-blowingly fast. Still, the faster your GPU, the faster test renders are going to be with Eevee. We got our test project from here.
In our benchmarking, we couldn’t get the Eevee renderer to use the CPU, but research has told us that animation would make far better use of the CPU than a straight single-frame render. Unfortunately, we’re benchmarkers, not designers, so we don’t exactly have a capable project floating around. If you have one that you would like to see represented in our testing, please reach out.
Heterogeneous Rendering Performance
When we found out that Blender 2.80 would be supporting heterogeneous rendering, we couldn’t help but jump for joy. When you can render to both the CPU and GPU at the same time, the performance gains can be downright amazing. That’s doubly true if you are using both a super-fast CPU and GPU. At the same time, you probably don’t want the two to be too mismatched, but in our tests, the CPU can make a bigger difference to performance than the GPU.
As mentioned before, Eevee can use both the CPU and GPU, but in straight rendering, our CPU was not touched at all (at least in our chosen project). Now, swapping 30 GPUs or so isn’t too terribly complicated, but swapping that many CPUs definitely is. So, for our hybrid tests, we chose 9 CPUs to include, with AMD bringing us sweet scaling from bottom to top, and Intel chiming in to round things out.
In every single one of these tests, using hybrid rendering dramatically improves performance. That’s even true with the modest 2400G quad-core, a $140 CPU. That said, we would never suggest you should choose that kind of CPU for creative workloads; it’s just that if you had one that you were rendering to, it’s not going to hold as much performance back as you’d think, when used with hybrid rendering. On its own, it’s slow.
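As a rough way to reason about why even a slow CPU helps, consider an idealized model in which the frame’s work splits perfectly between the two devices in proportion to their speed. This is a back-of-the-envelope sketch, not how Cycles actually schedules tiles, and the 120-second and 60-second figures are made-up inputs rather than results from our charts.

```python
def ideal_hybrid_seconds(cpu_seconds, gpu_seconds):
    """Best-case combined render time under perfect load balancing.

    Each device contributes a rate of 1/time frames per second; the rates
    add, and the combined time is the reciprocal of the summed rate.
    Real renders will land above this figure because of scheduling
    overhead and uneven tile costs.
    """
    if cpu_seconds <= 0 or gpu_seconds <= 0:
        raise ValueError("render times must be positive")
    return 1.0 / (1.0 / cpu_seconds + 1.0 / gpu_seconds)


# Hypothetical standalone times: CPU alone 120 s, GPU alone 60 s.
print(round(ideal_hybrid_seconds(120, 60), 1))
```

Under this model, the combined time is always lower than either device on its own, which matches the direction of what we saw, even if real-world gains are smaller than the ideal.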
For Cycles, a better CPU is quite obviously a better buy over a faster GPU, but again, with Eevee’s huge GPU focus, your needs could change in time. If you have both a decent CPU and GPU, you will have little to worry about.
To give a better idea of just how important a CPU can be in Cycles, here’s a look at the 18-core Intel i9-7980XE combined with every single GPU tested:
As you can see, this 18-core CPU is so fast that the GPU doesn’t matter nearly as much. But, if you happen to have a fast GPU as well, then your overall renders are going to finish far quicker. And again, we feel compelled to emphasize the fact that this is just a single-frame render at a modest resolution. Complex scenes are going to take far more time to render, and animation will of course require 24+ frames for every second of footage.
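To put the animation point into numbers, here is a small sketch that scales a single-frame render time up to a full clip. The 60-seconds-per-frame and 10-second-clip inputs are hypothetical, and the helper assumes every frame costs the same to render, which real animations rarely do.

```python
def animation_render_hours(seconds_per_frame, clip_seconds, fps=24):
    """Estimate total render time in hours for an animated clip.

    Assumes a constant per-frame cost; fps defaults to 24, the minimum
    frame rate mentioned in the text.
    """
    if seconds_per_frame < 0 or clip_seconds < 0 or fps <= 0:
        raise ValueError("inputs must be non-negative and fps positive")
    total_frames = clip_seconds * fps
    return total_frames * seconds_per_frame / 3600


# Hypothetical: a frame that takes one minute, for ten seconds of footage.
print(animation_render_hours(60, 10))
```

A one-minute frame feels quick in isolation, but at 24 FPS every second of footage costs 24 of them, so shaving per-frame time pays off fast.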
We benchmark many creative applications at Techgage, and we of course have fielded many requests in the past. Interestingly, Blender 2.8 requests have been hitting us quite a bit lately. Even as this was being written, we were hit with a notification from someone else asking for this very content. It’s clear that people are excited about the coming final build!
To summarize, there are two distinct performance avenues with Blender: viewport and rendering. For some, rendering performance might not matter so much, because they can simply sleep through the night while their computer is rendering away. But viewport performance is something that’s impossible to ignore. If your viewport is slow, then you’re going to have a frustrating time. You might even get a real headache.
For Wireframe and Solid modes, you don’t need a considerable GPU to get reasonable performance, though you’ll want at least a GTX 1060 or GTX 1660 Ti to ensure 60 FPS in complex scenes. LookDev is the real performance killer: it takes an RTX 2070 to hit 60 FPS at the most modest of our three tested resolutions, 1080p. At 4K, you need a serious GPU to deliver a reasonable experience.
As it stands, NVIDIA’s Turing-based graphics cards perform extremely well in every conceivable Blender test we throw at them. We got the best frame rates out of them in the viewport tests, and the best rendering performance in both Cycles and Eevee. Fortunately, we benchmarked many cards, so hopefully the one you have your sights set on makes an appearance.
On the AMD side, for top-tier Blender performance, you’ll want either the Radeon VII, or Vega 64. For modest cards, NVIDIA’s GTX 1660 Ti really does seem to be unbeatable, but there are so many price points covered here, and thus a lot of options to choose from. You just don’t want to go too low-end when it comes to rendering, unless you get some sick pleasure out of exercising your patience.
As made obvious in every single one of the performance graphs here, performance has been tested with the beta version of Blender 2.80, and performance could change between now and the final release (the date of which is still not set in stone). Should that happen, we’ll retest. Maybe not to the same extent, since 30 GPUs is quite a bit, but we’ll have to see how this article is received before moving ahead.
If you have questions not tackled in this article, please leave a comment!