Date: July 25, 2016
Author(s): Jamie Fletcher
With SIGGRAPH 2016 in full swing, NVIDIA introduces its plans for GPU based ray-tracing, updates to Iray and mental ray, and extended support of SDKs and APIs. Virtual reality gets special treatment with ray-tracing and spherical composite video. And finally, NVIDIA begins scaling compute in the visual world with its Tesla P100 powered DGX-1 clusters, and introduces a monstrous 12 TFLOP Quadro, the P6000.
If coffee doesn’t do it for you first thing in the morning, maybe this will. Late last week, NVIDIA dropped the mic after announcing its latest GPU, the 11 TFLOP beast that is the TITAN X. Utilizing the Pascal architecture, the new TITAN X showed enormous gains over the previous generation, with a clear lead over even the other Pascal GPUs, the GTX 1080 and 1070.
However, it’s not gaming GPUs on the launch plate for NVIDIA today at SIGGRAPH, but workstation graphics cards from the Quadro line. If you thought the TITAN X was a powerhouse, then be prepared, as the latest generation Quadros are something special indeed.
In a very unusual turn of events, the top-end Quadro GPU, the P6000, looks to be more powerful than the Pascal powered TITAN X. We’re not talking a little more powerful either – while clock speeds are not known right now, we do know the P6000 has 256 more cores at its disposal, and twice the memory. While the TITAN X hit 11 TFLOPs, the P6000 will be 12 TFLOPs of FP32 – a truly staggering number considering the old Maxwell TITAN X was 6.1 TFLOPs.
The latest TITAN X was announced at an AI research meet-up at Stanford University, rather than the gaming-themed press event we’re used to. In fact, no one even suspected the launch until it happened. The reason is likely to do with the target audience. Surprising as it might seem, a lot of TITAN cards don’t find their way into gaming systems, but into workstations, servers, render farms and research stations. They’re a middle ground between a GTX and a Quadro, resulting in an expensive gaming card or a cheap workstation card, depending on needs.
However, the TITAN X lacks a few key features that come with the Quadro range, namely ECC memory and specialized software support licenses and APIs. This time around, there is an even bigger difference: the P6000’s extra cores, which help with faster viewport rendering and CUDA processing. Let’s have a quick look at how the latest P6000 and P5000 compare to the previous Maxwell based M6000 and M5000, as well as the new TITAN X.
|NVIDIA GeForce & Quadro Roundup|
|Model|Cores|Core MHz|Memory|Mem MHz|Mem Bus|TDP|
|TITAN X (Pascal)|3584|1531|12GB|10000|384-bit|250W|
|GeForce GTX 1080|2560|1607|8GB|10000|256-bit|180W|
Not a lot to go on at the moment, but those blanks will be filled in soon. The important thing is the positioning of the new P6000. From what we understand, the TITAN X isn’t using the full GP102 core, while the P6000 is; this could be due to yield issues, with the fully functioning chips being binned for the top-line Quadros. The P5000, on the other hand, looks to be the same chip as the GTX 1080, and sports a similar compute score of 8.9 TFLOPs to the 1080’s 9 TFLOPs (likely just rounding).
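As a rough sanity check on those TFLOP figures, peak FP32 throughput is usually estimated as two operations (one fused multiply-add) per core per clock. The sketch below applies that rule of thumb to the numbers above. The function names are ours, the 1733MHz GTX 1080 boost clock is not in the table and is our assumption, and the P6000 clock it prints is only what the 12 TFLOP claim would imply, not a confirmed spec.

```python
def fp32_tflops(cores: int, clock_mhz: float) -> float:
    """Peak FP32 TFLOPs, assuming one FMA (2 FLOPs) per core per clock."""
    return 2 * cores * clock_mhz * 1e6 / 1e12


def implied_clock_mhz(cores: int, tflops: float) -> float:
    """Clock speed (MHz) needed to reach a given peak FP32 figure."""
    return tflops * 1e12 / (2 * cores) / 1e6


# Figures taken from the table above.
titan_x = fp32_tflops(3584, 1531)
gtx_1080_listed = fp32_tflops(2560, 1607)

# Assumption: 1733 MHz boost clock for the GTX 1080 (not listed in the table).
GTX_1080_BOOST_MHZ = 1733
gtx_1080_boost = fp32_tflops(2560, GTX_1080_BOOST_MHZ)

# The P6000 is said to have 256 more cores than the TITAN X and 12 TFLOPs.
p6000_cores = 3584 + 256
p6000_clock = implied_clock_mhz(p6000_cores, 12.0)

print(f"TITAN X:             {titan_x:.2f} TFLOPs")
print(f"GTX 1080 (1607 MHz): {gtx_1080_listed:.2f} TFLOPs")
print(f"GTX 1080 (1733 MHz): {gtx_1080_boost:.2f} TFLOPs")
print(f"P6000 implied clock: {p6000_clock:.1f} MHz")
```

The TITAN X lands at roughly 11 TFLOPs with its listed clock, while the GTX 1080 only approaches the quoted 9 TFLOPs at the higher boost clock, which suggests the two rows in the table are not listing the same kind of clock.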
However, comparing the P6000 and P5000 to their respective predecessors, the M6000 and M5000, things become more interesting. The two Maxwell Quadros were released only a year ago, and they’ve already been superseded. Just going by our previous tests with the performance gains of the GTX 1080 over the GTX 980, we can safely estimate a direct 30-40% performance boost in most compute tasks, but that excludes additional architecture improvements, as well as software updates which we’ll get to a little later.
While it’s no surprise VR is big business in the gaming world, VR is being taken seriously in the professional market too. This ties into the architectural reshuffle of the display handling capabilities of Pascal.
Back with the original launch of Pascal, we covered some of the new features, and one of the unsung heroes was Simultaneous Multi-Projection (SMP). This had a two-fold impact on rendering performance, covering both multiple display handling and VR perspective correction. In VR mode, two views from two perspectives previously had to be rendered separately, then distorted to correct for the optics in head-mounted displays. SMP allows both perspectives to be rendered in a single pass, with the distortion already applied, which reduces the amount of geometry that needs to be rendered and avoids a second pass through a post-processor.
These two factors combined mean that Pascal is 50-70% faster than Maxwell when it comes to VR rendering. According to NVIDIA’s own internal VR benchmarks, the P6000, with its much faster hardware, is 80% faster than the M6000 for VR rendering, and the P5000 is 70% faster than the M5000 in the same test. With additional rendering techniques such as Perceptually-Based Foveated VR, there are some big gains to be made for real-time rendering.
SMP isn’t just for VR, though; it benefits multi-monitor setups as well. If your business handles very large, high resolution displays, the new Pascal Quadros can support up to four 5K displays simultaneously. When combined with a Quadro Sync 2 card, up to 8 GPUs can be synchronized for high density displays.
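It’s worth being clear about what "80% faster" means for frame times, since VR is all about hitting a fixed refresh rate. The sketch below assumes the percentages describe throughput (so 80% faster means 1.8x the frames in the same time), and uses a hypothetical scene that the older card renders at exactly 90 FPS; neither assumption comes from NVIDIA’s benchmarks.

```python
def frame_time_ms(baseline_ms: float, percent_faster: float) -> float:
    """Frame time after a throughput gain of `percent_faster` percent."""
    return baseline_ms / (1.0 + percent_faster / 100.0)


# Hypothetical baseline: the previous-generation card just manages 90 FPS.
BASELINE_MS = 1000.0 / 90.0

p6000_ms = frame_time_ms(BASELINE_MS, 80)  # claimed gain over the M6000
p5000_ms = frame_time_ms(BASELINE_MS, 70)  # claimed gain over the M5000

# Fraction of the frame budget freed up by the faster card.
p6000_saving = 1.0 - p6000_ms / BASELINE_MS

print(f"Baseline frame time: {BASELINE_MS:.2f} ms")
print(f"80% faster:          {p6000_ms:.2f} ms")
print(f"70% faster:          {p5000_ms:.2f} ms")
print(f"Frame time saved at 80% faster: {p6000_saving:.0%}")
```

Under that reading, an 80% throughput gain cuts the frame time by a little under half, not by 80%, which still leaves a lot of headroom for higher resolutions or heavier scenes.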
Speaking of real-time rendering, Iray now has support from 3ds Max, Maya, Rhino and Cinema 4D, with spherical panorama snapshots and stereoscopic VR coming soon. For those unfamiliar with Iray, it’s a physically based progressive rendering engine, originally built on top of mental ray, that can perform either real-time viewport renders or production renders using NVIDIA’s Material Definition Language (MDL).
It’s an impressive technology that can leverage the rendering power of any connected hardware, be it CPU or GPU, local or server farm. We took a quick look at Iray last year, but over the last six months it’s matured into a stand-alone plugin compatible with multiple rendering engines, and it is now gaining the new VR features listed above.
Taking Iray further is integration with Quadro VCA (Visual Computing Appliance) and the new DGX-1 clusters powered by 8x Tesla P100 compute units, which we mentioned a few months back. These DGX-1 clusters will house NVIDIA’s top-tier parts, Pascal powered Tesla cards built on GP100 cores complete with HBM2, similar to what was expected from the latest TITAN X GPU.
While DGX-1 was originally meant for compute, it’s quickly finding a home in render farms. Using the OptiX 4 API and NVLink, four of the DGX-1 Tesla cards can be pooled to create a 64GB interactive GPU memory pool for large data sets – perfect for database crunching and complex render scenes.
The VR capabilities of Iray open up a number of possibilities. Showroom-floor walkthrough renders of cars are an often-cited example (realistic physically based renders with changes being made to the vehicle on the spot), but films stand to benefit too.
While there’s more imagination and green screen used in modern films, it can be difficult to get a fresh perspective on the world you are immersing your audience in when a 30″ monitor in a studio is your only view into it. Iray VR will allow you to move around the scene and make changes in real-time, even finding new angles for the shot. Further to this, it’s possible to render full VR videos, allowing the viewer to move their head around a scene while guided through a story.
If Iray doesn’t meet your quality standards, or your workflow is built around a different rendering engine such as mental ray, then this next section is for you. mental ray will soon support GPU rendering, similar to that of Iray. It’s still currently in beta (and has been for a while), but will be hitting its major release this September.
If you are lucky enough to be attending SIGGRAPH this year, there is a live demo of Maya with mental ray, showing off its interactive viewport render with Pixar’s Monsters University, using the all-new GPU rendering capabilities. Some new rendering options are available too, including GI-Next for global illumination, which speeds up GI rendering by 2-4x on the CPU alone, with a further 4-5x speed-up per added GPU. Geometry is handled by the GPU where available, with full support for custom shaders and mental ray effects.
mental ray for Maya allows for final frame rendering within a dedicated viewport, similar to what Iray has done, which can provide an interactive progressive final render of the scene you are working on. Changes to lighting or object positions will all automatically update in the final render, so you can quickly see major changes and sort out the final lighting before sending the whole project off for render.
It’s worth noting as well that since Autodesk has spun off all the rendering engines into their own plugins, NVIDIA now has full control over the update cycle of mental ray as of Maya 2017. This means new features and updates can be rolled out faster, without having to wait for Maya to implement them.
While we’ve already talked about Iray, it’s worth mentioning a couple more additions, namely its SDK that allows GPU accelerated ray tracing in Dassault CATIA and Siemens NX. This will be bundled with the Iray SDK 2016.2 update. There will also be support for X-Rite’s AxF 1.3 format.
MDL support is growing, and for the first time, will have its own SDK rather than just plugin support for packaged rendering engines. This means other applications can start to integrate MDL support, including the likes of Chaos Group (V-Ray) and Adobe. The extent of the integration isn’t known just yet, but it might mean that Photoshop or After Effects could allow artists to change materials and preview them directly without re-rendering the scene in their 3D package. This is something we’ll be keeping an eye on.
VRWorks will be getting 360 Video SDK extensions, simplifying the creation and editing of VR content from non-VR sources. The kit allows multiple video feeds (up to 32 streams, depending on resolution) to be stitched together into a single spherical video feed, either in real-time or for offline post-processing. GPU acceleration also helps with encoding, calibration and equalization of the different feeds, and it all works with GPUDirect for low-latency video.
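To give a feel for what a "single spherical video feed" involves, here is a minimal sketch of one common way spherical video is stored: the equirectangular layout, where longitude maps to the horizontal axis of the frame and latitude to the vertical. This is purely illustrative and is not the VRWorks 360 Video SDK API; the function name, the axis convention (+z forward, +x right, +y up) and the frame size are our own assumptions.

```python
import math


def direction_to_equirect(x: float, y: float, z: float,
                          width: int, height: int) -> tuple[float, float]:
    """Map a view direction to (u, v) pixel coordinates in an
    equirectangular frame. Assumed convention: +z forward, +x right,
    +y up, with the forward direction at the centre of the image."""
    length = math.sqrt(x * x + y * y + z * z)
    if length == 0.0:
        raise ValueError("direction must be non-zero")
    lon = math.atan2(x, z)         # -pi..pi, 0 means straight ahead
    lat = math.asin(y / length)    # -pi/2..pi/2, 0 means the horizon
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v


# Hypothetical 4096x2048 output frame.
forward = direction_to_equirect(0, 0, 1, 4096, 2048)
right = direction_to_equirect(1, 0, 0, 4096, 2048)
up = direction_to_equirect(0, 1, 0, 4096, 2048)

print("forward ->", forward)
print("right   ->", right)
print("up      ->", up)
```

A stitcher effectively runs this kind of mapping in reverse for every output pixel, working out which camera (or blend of cameras) saw that direction, which is why calibration between the feeds matters so much.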
Video capture support for 8K video will be introduced, along with Vulkan API support, so it’ll soon be possible to record or stream ultra high-resolution content rendered through either DirectX or Vulkan. The video codec engine will be updated to handle 8K x 8K video feeds, 10-bit 4:4:4 H.265 (HEVC) encoding and VP9 decoding.
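Some quick arithmetic shows why hardware encoding matters at these sizes. The sketch below estimates the uncompressed size of a 10-bit 4:4:4 frame, where every pixel carries three full-resolution samples. We’re assuming "8K x 8K" means 8192 x 8192 pixels and picking 30 FPS as a hypothetical frame rate; neither figure is confirmed by NVIDIA.

```python
def raw_frame_bytes(width: int, height: int,
                    bits_per_sample: int = 10,
                    samples_per_pixel: int = 3) -> int:
    """Uncompressed frame size in bytes. 4:4:4 means no chroma
    subsampling, so each pixel has three full-resolution samples."""
    return width * height * samples_per_pixel * bits_per_sample // 8


def raw_rate_gbps(width: int, height: int, fps: float) -> float:
    """Uncompressed 10-bit 4:4:4 data rate in gigabits per second."""
    return raw_frame_bytes(width, height) * 8 * fps / 1e9


# Assumptions: 8K x 8K taken as 8192 x 8192, at a hypothetical 30 FPS.
frame_bytes = raw_frame_bytes(8192, 8192)
rate = raw_rate_gbps(8192, 8192, 30)

print(f"One raw frame: {frame_bytes / 2**20:.0f} MiB")
print(f"Raw stream at 30 FPS: {rate:.1f} Gbit/s")
```

At roughly 240 MiB per frame before compression, a feed like this is far beyond what can be streamed or stored raw, so everything hinges on the encoder keeping up.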
A new tool coming some time this autumn is GVDB, a GPU accelerated framework for integration with ray-tracing tools, similar to OpenVDB, for rendering volumetric data. This is effectively high-density particle effects such as water and dust, but treated in a physical space and thus affected by various ray-tracing effects, instead of being applied as a post-process filter. While OpenVDB is CPU-based, GVDB will allow such effects to be rendered on the GPU instead, resulting in 10x-30x speed improvements over CPU based rendering alone.
It’s hard to go through all this without wiping the sweat off your brow, as the number of updates, enhancements and new hardware coming to the professional market is staggering. While much is still in the early stages of development (VR in particular), the general feeling this year is that NVIDIA is really pushing hard for ray tracing on GPUs for final production.
It might seem obvious to use GPUs for ray-tracing, but it’s been CPU only pretty much from its conception. Over the years, GPU acceleration has been involved for rendering of the viewport, but generally, it hasn’t been used for final production as the end result was inferior to what a CPU could produce, regardless of the speed benefits. As such, GPU rendering has had to fight a long-standing stigma.
Over the last year though, GPU rendering has really taken off in the final rendering scene, with no small help from Iray. While viewport rendering is still the primary concern for workstations, GPUs in render farms are now a realistic option, helped along by the addition of GPU support to mental ray. Whether the entire render will be completed on the GPU is another matter, but in another year or two, we could start to see it as a distinct possibility.
The latest Pascal-based Quadros also add weight to the matter, as the sheer compute power of the cards and the flexibility of CUDA mean that new possibilities are being opened all the time. The P6000 is definitely a card worth looking into, as the expected gains over the previous M6000 should be enough to warrant the upgrade, especially considering the extra cores it has over the TITAN X; but pricing will be key here.
If we can get a pair of the new cards in to review, we’ll certainly put them through their paces with Iray rendering, as well as our usual tests. While the P6000 will be stealing headlines, the P5000 won’t be a slouch either. Considering that it’s a GTX 1080 with software enhancements, it should prove to be an extremely capable workstation card; not just for viewports, but for game development too.
From autumn onwards, it’s going to be a very interesting time indeed, as much of the hardware and software are brought together. Who knows, maybe within a couple of years, we’ll finally start to see fully ray-traced games… but for now, VR will continue to evolve into the next level of realism, through immersion.
Copyright © 2005-2019 Techgage Networks Inc. - All Rights Reserved.