NVIDIA’s TITAN series of graphics cards has been an interesting one since the launch of the original in 2013. That Kepler-based GTX TITAN model peaked at 4.5 TFLOPS single-precision (FP32), performance that was boosted to 5.1 TFLOPS with the release of the TITAN Black the following year.
Fast-forward to the present day, where we now have the TITAN RTX, boasting 16.3 TFLOPS of single-precision and 32.6 TFLOPS of half-precision (FP16) performance. Double-precision (FP64) used to be standard fare on the earlier TITANs, but today, you’ll need the Volta-based TITAN V for unlocked performance (6.1 TFLOPS), or AMD’s Radeon VII for partially unlocked performance (3.4 TFLOPS).
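If you’re ever curious where such numbers come from, peak throughput is simple arithmetic: two operations per core per clock (a fused multiply-add), multiplied by core count and clock speed. Here’s a quick sketch using the TITAN RTX’s 4,608 cores and its rated ~1,770 MHz boost clock (Turing runs FP16 at twice the FP32 rate):

```python
# Peak theoretical throughput = ops per core per clock x cores x clock (Hz).
def peak_tflops(cores: int, clock_mhz: float, ops_per_clock: int = 2) -> float:
    return cores * clock_mhz * 1e6 * ops_per_clock / 1e12

print(f"FP32: {peak_tflops(4608, 1770):.1f} TFLOPS")     # ~16.3
print(f"FP16: {peak_tflops(4608, 1770, 4):.1f} TFLOPS")  # ~32.6 (double rate)
```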
Lately, half-precision has garnered a lot of attention from the ProViz market, since it’s ideal for deep-learning and AI workloads, both of which are growing in popularity at a ridiculous pace. Add specifically tuned Tensor cores to the mix, and deep-learning performance on Turing becomes truly impressive.
Tensor cores are not the TITAN RTX’s only party trick. Like the rest of the RTX line (on both the gaming and pro sides), the TITAN RTX includes RT cores, useful for accelerating real-time ray tracing workloads. Developers need to explicitly target those cores, using APIs such as DXR and VKRay. While uptake of NVIDIA’s technology started off tepid, industry support has grown a lot since the original unveiling of RTX at SIGGRAPH last year.
At E3 in June, a handful of games had ray tracing-related announcements, including Watch_Dogs: Legion, Cyberpunk 2077, Call of Duty: Modern Warfare, and of course, Quake II RTX. On the design side, some developers have already released their RTX-accelerated solutions, while many more are in the works. NVIDIA has been talking a lot lately about the Adobes and Autodesks of the world helping to grow the list of RTX-infused software. We wouldn’t be surprised if more RTX goodness were revealed at SIGGRAPH yet again this year.
For deep-learning, the TITAN RTX’s strong FP16 performance is fast on its own, but there are a few perks onboard to help take things to the next level. The Tensor cores provide much of the acceleration, but the ability to use mixed precision is another big part. With it, accuracy-critical data (such as the master copy of the network’s weights) is kept in single-precision, while the bulk of the number-crunching happens in half-precision. Everything combined, this can boost training performance by up to 3x over single-precision alone.
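For a sense of what that looks like in practice, here’s a minimal mixed-precision training sketch. We’re assuming PyTorch’s automatic mixed precision (AMP) API purely for illustration; NVIDIA’s own Apex library works along the same lines:

```python
import torch

# A toy model and optimizer; the pattern is the same for real networks.
model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()  # scales the loss so tiny FP16 gradients don't underflow

for _ in range(100):
    inputs = torch.randn(64, 1024, device="cuda")
    targets = torch.randn(64, 1024, device="cuda")
    optimizer.zero_grad()
    # Inside autocast, matmuls and convolutions run in FP16 (hitting the
    # Tensor cores), while FP32 master weights preserve accuracy.
    with torch.cuda.amp.autocast():
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
    scaler.scale(loss).backward()  # backprop through the scaled loss
    scaler.step(optimizer)         # unscale gradients, then update in FP32
    scaler.update()
```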
Also notable for Turing is concurrent integer and floating-point execution, which allows games (or other software) to run INT and FP operations in parallel without tripping over each other in the pipeline. NVIDIA has noted in the past that in games like Shadow of the Tomb Raider, an average sample of 100 instructions included 62 FP and 38 INT, and that this concurrency directly improves performance as a result.
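To see why so many INT instructions crop up in the first place, consider the address math hiding inside even the simplest shading operation. This toy function is purely illustrative of the instruction mix, not of any real engine’s code:

```python
# Illustrative only: even a basic texture fetch mixes integer work (address
# calculation) with floating-point work (the shading math itself). On Turing,
# the two streams can be issued concurrently instead of serializing.
def sample_and_shade(texels, u, v, width, light):
    idx = v * width + u         # INT: compute the texel's linear address
    return texels[idx] * light  # FP: apply the lighting term
```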
Another important feature of the TITAN RTX is its support for NVLink, which essentially combines the memory pools of two cards, resulting in a single framebuffer that can be used for the biggest possible projects. Since GPUs generally scale very well in the sorts of workloads this card targets, it’s the true memory pooling that offers the greatest benefit here. Games that support multi-GPU would likewise see a benefit from two cards and this connector.
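Applications do need to take advantage of that pooling, of course. As a rough illustration (again assuming PyTorch, just for the example), here’s how you could verify that two cards can directly access each other’s memory, which is what NVLink enables at high bandwidth:

```python
import torch

# Memory pooling relies on peer-to-peer (P2P) access between GPUs; NVLink
# provides it at far higher bandwidth than PCIe.
assert torch.cuda.device_count() >= 2, "two GPUs required"
print(torch.cuda.can_device_access_peer(0, 1))  # True if GPU 0 can map GPU 1's memory

# With P2P available, inter-GPU transfers ride the bridge rather than PCIe.
a = torch.randn(1 << 24, device="cuda:0")  # 16M floats (64 MB) on card one
b = a.to("cuda:1")                         # copied directly to card two
```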
Because it’s a feature exclusive to these RTX GPUs right now, it’s worth mentioning that NVIDIA also includes a VirtualLink port at the back, letting you plug in an HMD for VR, or, failing that, use it as a full-powered USB-C port for data transfer or phone charging.
With all of that covered, let’s take a quick look at the overall current NVIDIA workstation stack:
NVIDIA’s Quadro & TITAN Workstation GPU Lineup

| Model | Cores | Base MHz | Peak FP32 | Memory | Bandwidth | TDP | Price |
|-----------|------|------|-------------|--------------|----------|------|--------|
| GV100 | 5120 | 1200 | 14.9 TFLOPS | 32 GB HBM2 | 870 GB/s | 185W | $8,999 |
| RTX 8000 | 4608 | 1440 | 16.3 TFLOPS | 48 GB GDDR6 | 624 GB/s | ???W | $5,500 |
| RTX 6000 | 4608 | 1440 | 16.3 TFLOPS | 24 GB GDDR6 | 624 GB/s | 295W | $4,000 |
| RTX 5000 | 3072 | 1350 | 11.2 TFLOPS | 16 GB GDDR6 | 448 GB/s | 265W | $2,300 |
| RTX 4000 | 2304 | 1005 | 7.1 TFLOPS | 8 GB GDDR6 | 416 GB/s | 160W | $900 |
| TITAN RTX | 4608 | 1350 | 16.3 TFLOPS | 24 GB GDDR6 | 672 GB/s | 280W | $2,499 |
| TITAN V | 5120 | 1200 | 14.9 TFLOPS | 12 GB HBM2 | 653 GB/s | 250W | $2,999 |
| P6000 | 3840 | 1417 | 11.8 TFLOPS | 24 GB GDDR5X | 432 GB/s | 250W | $4,999 |
| P5000 | 2560 | 1607 | 8.9 TFLOPS | 16 GB GDDR5X | 288 GB/s | 180W | $1,999 |
| P4000 | 1792 | 1227 | 5.3 TFLOPS | 8 GB GDDR5 | 243 GB/s | 105W | $799 |
| P2000 | 1024 | 1370 | 3.0 TFLOPS | 5 GB GDDR5 | 140 GB/s | 75W | $399 |
| P1000 | 640 | 1354 | 1.9 TFLOPS | 4 GB GDDR5 | 80 GB/s | 47W | $299 |
| P620 | 512 | 1354 | 1.4 TFLOPS | 2 GB GDDR5 | 80 GB/s | 40W | $199 |
| P600 | 384 | 1354 | 1.2 TFLOPS | 2 GB GDDR5 | 64 GB/s | 40W | $179 |
| P400 | 256 | 1070 | 0.6 TFLOPS | 2 GB GDDR5 | 32 GB/s | 30W | $139 |
The TITAN RTX matches the Quadro RTX 6000 and 8000 for the highest core count in the Turing lineup. NVIDIA says the TITAN RTX is about 3 TFLOPS faster in FP32 than the RTX 2080 Ti (16.3 TFLOPS vs. that card’s ~13.4 TFLOPS), and fortunately, we have results for both cards covering a wide range of tests to see how they compare.
What’s not seen in the specs table above is the actual performance of the ray tracing and deep-learning components. This next table helps clear some of that up:
NVIDIA’s Quadro & TITAN – RTX Performance

| Model | RT Cores | RTX-OPS | Rays Cast (Giga Rays/s) | FP16 (TFLOPS) | INT8 (TOPS) | Deep-learning (TFLOPS) |
|-----------|----|------|----|------|-------|-------|
| TITAN RTX | 72 | 84 T | 11 | 32.6 | 206.1 | 130.5 |
| RTX 8000 | 72 | 84 T | 10 | 32.6 | 206.1 | 130.5 |
| RTX 6000 | 72 | 84 T | 10 | 32.6 | 206.1 | 130.5 |
| RTX 5000 | 48 | 62 T | 8 | 22.3 | 178.4 | 89.2 |
| RTX 4000 | 36 | 43 T | 6 | 14.2 | 28.5 | 57 |
You’ll notice that the TITAN RTX has a higher “rays cast” spec than the top Quadros, which likely owes to its higher clocks. The other specs are identical across the top three GPUs, with the expected downgrades as we move down the stack. The Quadro RTX 4000 (roughly a GeForce RTX 2070 equivalent) is currently the lowest-end current-gen Quadro from NVIDIA. Again, SIGGRAPH is almost upon us, so NVIDIA may have a hardware surprise in store; perhaps an RTX 2060 Quadro equivalent.
With the RTX 2080 Ti already offering so much performance, who exactly is the TITAN RTX for? NVIDIA targets it largely at researchers, but it secondarily acts as one of the fastest ProViz cards on the market. It will appeal to those who want the fastest GPU solution going, not to mention a huge 24GB framebuffer. 24GB might be overkill for a lot of current visualization work, but for deep-learning, it provides a lot of breathing room.
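To put that breathing room in perspective, here’s some hypothetical back-of-the-envelope math for mixed-precision training, counting only the per-parameter state (activations and framework overhead push the real figure higher):

```python
# Rough per-parameter memory cost for mixed-precision training with Adam:
# FP16 working weights (2 B) + FP32 master weights (4 B) + two FP32 Adam
# moments (8 B) = 14 bytes per parameter, before activations.
def training_gib(params_millions: float) -> float:
    params = params_millions * 1e6
    return params * (2 + 4 + 8) / 2**30

print(f"{training_gib(340):.1f} GiB")  # a BERT-Large-sized model: ~4.4 GiB of state alone
```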
Despite all it offers, the TITAN RTX can’t be called an “ultimate” solution for ProViz, since it lacks some of the driver optimizations reserved for the Quadro name. That means in certain high-end design suites, like Siemens NX, a true Quadro might prove better. But if none of your workloads benefit from those specific enhancements, the TITAN RTX is going to be quite attractive given its feature set (and that framebuffer!). If you’re ever unsure about optimizations in your software of choice, please leave a comment!
A couple of years ago, NVIDIA decided to give the TITAN series some love with driver enhancements that bring a degree of parity between TITAN and Quadro. We can now say that the TITAN RTX enjoys the same kinds of performance boosts that the TITAN Xp received two years ago, something that will be reflected in some of the graphs ahead.
Test PC & What We Test
On the following pages, you’ll see the results of our workstation GPU test gauntlet. The chosen tests cover a wide range of scenarios, from rendering to compute, and include both synthetic benchmarks and real-world applications from the likes of Adobe and Autodesk.
Nineteen graphics cards have been tested for this article, with the list dominated by Quadro and Radeon Pro workstation cards. There’s a healthy sprinkling of gaming cards in there as well, however, to highlight any optimizations that may favor one side or the other.
Please note that the testing for this article was conducted a couple of months ago, before an onslaught of travel and product launches. Graphics card drivers released since then might improve performance in certain cases, but we wouldn’t expect any notable changes, having sanity-checked a number of our usual test applications on both AMD and NVIDIA GPUs. The previous version of Windows was used for this particular testing, but that likewise revealed no disadvantages when we sanity-checked in 1903.
In recent months, we’ve spent a lot of time polishing our test suites, as well as our internal testing scripts. We’re currently rebenchmarking a number of GPUs for an upcoming look at ProViz performance with cards from AMD’s Radeon RX 5700 and NVIDIA’s GeForce SUPER series. Fortunately, results from those cards don’t really encroach on a top-end card like the TITAN RTX, so our tardiness hasn’t foiled us this time.
The specs of our test rig are seen below:
Our benchmark results are categorized and spread across the following six pages. On page 2, we’re looking at some CUDA-based renderers, including V-Ray, Redshift, OctaneRender, and Arnold GPU. Some of these will add support for non-CUDA GPUs in time, and when that happens, we’ll introduce the tests to our Radeon benchmarking suite. Page 3 includes a number of more neutral renderers, like Blender, Radeon ProRender, and LuxMark.
On page 4, we’re tackling encoding with the help of Adobe’s Premiere Pro and MAGIX’s Vegas Pro, while page 5 is home to viewport performance, largely covered with the help of SPECviewperf. In total, 8 test results are featured here, covering important design suites like CATIA, SolidWorks, Siemens NX, and Creo, as well as Autodesk’s 3ds Max and Maya. Our own Blender viewport test wraps the page up. And speaking of wrapping things up, page 6 covers mathematical performance with the help of SiSoftware’s Sandra.
And with all of that covered, let’s get on with things: