The excitement starts as we get a glimpse into NVIDIA’s next generation of GPUs, beginning in the professional graphics market with Turing-based Quadro RTX GPUs that feature brand-new dedicated ray tracing cores. There’s a lot to cover, and a huge number of threads to weave together, but it’s all starting to make sense.
A couple of weeks ago, we managed to shoot a few questions off to the creator of recursive ray tracing, J. Turner Whitted, as he recounted how he developed the techniques used as the foundation for all modern ray tracing engines, some 40 years ago at SIGGRAPH. Rendering in real time was out of the question back then, but now it’s an entirely different story.
Late last year we were introduced to the Volta architecture, the follow-up to the Pascal architecture used in gaming and professional cards. Volta was, and still is, mainly about deep learning, as the Tensor cores inside the GPUs are half-precision floating-point and integer accelerators used extensively for quick, low-precision math. They weren’t really meant for gaming, despite how powerful the Titan V was when it launched. However, Tensor cores still had a use in graphics outside of deep learning, mainly as denoising accelerators for 3D rendering.
The next successor to Pascal is now upon us in the form of Turing. The rumors of the name have been circulating for a while, but there was no product stack to back it up – that’s now about to change. The new Quadro RTX GPUs will be the first cards to officially make use of the new architecture, and they bring with them new processing technologies dedicated to ray tracing.
The name RTX comes from NVIDIA’s recent real-time ray tracing extension (RTX), which was co-announced with DirectX Raytracing. It is a means of adding ray tracing elements to rasterized graphics in video games, as shown in UL’s TimeSpy demo and in the now-famous Star Wars render made with Unreal Engine.
There are three new Quadros being announced: the Quadro RTX 5000, the 6000, and, for the first time, an 8000-series Quadro. These Quadros embody many brand-new technologies, some expected, others quite surprising. One thing that will become clear is how difficult it will be to define the different ‘cores’ that now make up the GPU.
| GPU | Memory | Memory with NVLink | Ray Tracing | CUDA Cores | Tensor Cores |
| --- | --- | --- | --- | --- | --- |
| Quadro RTX 8000 | 48GB | 96GB | 10 GigaRays/sec | 4,608 | 576 |
| Quadro RTX 6000 | 24GB | 48GB | 10 GigaRays/sec | 4,608 | 576 |
| Quadro RTX 5000 | 16GB | 32GB | 6 GigaRays/sec | 3,072 | 384 |
First of all are the new RT Cores, which enable real-time ray tracing of objects and are the namesake of the new GPUs. How and where these will be used is still a bit fuzzy at this time, but the focus will be on acceleration inside professional applications such as Maya, SolidWorks, and other software. Performance will be measured in GigaRays per second – there’s a new term for you to get used to. Gaming was mentioned, specifically in relation to Unreal Engine, but no individual games were named at this time. Oh, and NVIDIA pulled a fast one on everyone: they showed the audience the Star Wars scene, said it was rendered on a DGX Station (that’s 4x GV100 GPUs), and then admitted they had lied. It was all rendered on a single Quadro RTX GPU.
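To put the GigaRays figure in perspective, here is a rough back-of-the-envelope sketch of how many rays per pixel per frame the quoted rates would allow. The resolutions and frame rate are our own assumptions for illustration, and the function name is hypothetical; the GigaRays/sec values are the peak figures from the table above, so sustained real-world throughput is likely to be lower.

```python
def rays_per_pixel(gigarays_per_sec, width, height, fps):
    """Upper-bound ray budget per pixel per frame.

    Treats the quoted GigaRays/sec figure as a sustained rate,
    which is an optimistic assumption.
    """
    rays_per_frame = gigarays_per_sec * 1e9 / fps
    return rays_per_frame / (width * height)


if __name__ == "__main__":
    # Peak rates taken from the spec table; 60 fps is an assumed target.
    cards = {
        "Quadro RTX 8000": 10,
        "Quadro RTX 6000": 10,
        "Quadro RTX 5000": 6,
    }
    for name, rate in cards.items():
        hd = rays_per_pixel(rate, 1920, 1080, 60)
        uhd = rays_per_pixel(rate, 3840, 2160, 60)
        print(f"{name}: {hd:.1f} rays/pixel at 1080p, {uhd:.1f} at 4K")
```

Even at the peak figure, a 4K frame at 60 fps leaves only around 20 rays per pixel on the top cards, which helps explain why the Tensor-core denoising mentioned in this article matters so much for real-time use.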
Next are the Turing Tensor Cores, used for AI-enhanced rendering (the denoising), plus other software that makes use of Tensor cores. These will work in conjunction with the RT and CUDA cores when rendering scenes. The Turing CUDA cores are now mixed-mode, as they can perform floating-point and integer math at the same time.
The big surprise (or maybe not so big) is that the Quadro RTX cards will be the first GPUs equipped with Samsung’s GDDR6 memory, up to 48GB for the Quadro RTX 8000. Another first is the inclusion of the all-new (like seriously new, it was announced last month) USB Type-C powered VirtualLink interface, the all-in-one interface for VR headsets. Hidden away in the overview image was also real-time 8K video encoding – yes, encode, not decode.
NVLink also received a fair bit of focus. It can be seen as the successor to SLI, as it allows Quadros to work in tandem with shared resources, rather than each card holding a cloned copy as with SLI. This means two Quadro RTX 8000 GPUs, each with 48GB of VRAM, can be combined to create a single pool of 96GB, which is big enough for render farms to make use of.
This leads on to the Quadro RTX Servers, which are basically DGX Stations for render farms instead of deep learning. NVIDIA is pushing hard for these to be used in final rendering, rather than workstation work. Traditionally, and even now, the CPU is used for the final render, simply because the scenes are too complex for a GPU to handle, requiring too many assets to fit inside the framebuffer of a GPU. With the large framebuffer of 96GB and the dedicated RT cores, we might start to see a shift from CPU-based render racks to GPU-based ones, at least for some workloads. If a scene requires 500GB of memory, you’ll still need CPUs.
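The memory argument boils down to simple arithmetic, sketched below. The per-card capacities come from the spec table; the helper names and the example scene sizes (other than the 500GB case mentioned above) are hypothetical, and the sketch assumes the entire pooled capacity is available to the scene, which will not quite hold in practice.

```python
# Per-card memory in GB, from the spec table in this article.
CARD_MEMORY_GB = {
    "Quadro RTX 8000": 48,
    "Quadro RTX 6000": 24,
    "Quadro RTX 5000": 16,
}


def pooled_memory_gb(card, linked_gpus=2):
    """Memory pool when cards are joined over NVLink (two per pool here)."""
    return CARD_MEMORY_GB[card] * linked_gpus


def fits_on_gpu(scene_gb, card, linked_gpus=2):
    """True if the scene's assets fit in the pooled framebuffer.

    Simplification: ignores driver overhead and working buffers.
    """
    return scene_gb <= pooled_memory_gb(card, linked_gpus)


if __name__ == "__main__":
    for scene_gb in (40, 80, 500):  # 40 and 80 are made-up example sizes
        target = "GPU" if fits_on_gpu(scene_gb, "Quadro RTX 8000") else "CPU"
        print(f"{scene_gb}GB scene -> render on {target}")
```

Under these assumptions, a linked pair of RTX 8000s handles the 40GB and 80GB scenes, while the 500GB scene still falls back to CPU rendering.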
There’s a lot to take in here, and we won’t see the full impact for a couple of months, but if this is really the start of real-time ray tracing in video games, then photorealism won’t just be a fantasy anymore. For now, it is motion pictures that rely heavily on digital assets and rendering that will soon find themselves with an extravagant amount of power at their fingertips.