NVIDIA Quadro RTX 4000 Review

by Rob Williams on March 16, 2019 in Graphics & Displays

NVIDIA’s Turing-infused Quadro RTX 4000 sets out to be a super-fast performer for its $900 price tag, but it also brings a couple of tricks. Those include RTX-specific features, like Tensor and RT cores, which come in addition to architectural enhancements that help the card leap far ahead of the older P4000.

Introduction & Testing References

At last August’s SIGGRAPH in Vancouver, it would have been difficult to walk around and not get a whiff of NVIDIA’s RTX. The company’s own booth was nearly impossible to miss, and others had RTX demos going on as well. That included HP, which showed off NVIDIA’s AI prowess with style transfers. With the promises of real-time ray tracing, it was understandably easy to get excited at the show.

With its Tensor and RT cores, Quadro RTX brings a lot of accelerated computing to the table. It’s suited for deep learning and AI, and its RT cores can accelerate real-time ray tracing in applications that support the latest version of NVIDIA’s OptiX engine.

Since the launch of the first three Quadro RTX cards, availability has seemed to be spotty. Multiple readers informed us that they had to wait longer than expected for their preorders to be fulfilled, and we’re honestly not sure at this point if availability has improved that greatly since then. The fact that there is now an RTX 4000 may answer that, but we know better than to jump to conclusions.

NVIDIA Quadro RTX 4000 Workstation Graphics Card

The RTX 4000, as this article might suggest, has become the first Quadro RTX to hit our doorstep. It packs a real punch in comparison to the previous-generation Pascal cards, and that’s before we get into the addition of Tensor and RT cores. The Tensors alone dramatically improve FP16 performance for deep-learning work. In this article, we’re going to see how the RTX 4000 compares to the outgoing Quadro P4000, which debuted a couple of years ago at the same general price point.

NVIDIA’s Quadro Workstation GPU Lineup
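| Card     | Cores | Base MHz | Peak FP32   | Memory    | Bandwidth | TDP  | Price   |
|----------|-------|----------|-------------|-----------|-----------|------|---------|
| GV100    | 5120  | 1200     | 14.9 TFLOPS | 32 GB [8] | 870 GB/s  | 185W | $8,999  |
| RTX 8000 | 4608  | 1440     | 16.3 TFLOPS | 48 GB [5] | 624 GB/s  | ???W | $10,000 |
| RTX 6000 | 4608  | 1440     | 16.3 TFLOPS | 24 GB [5] | 624 GB/s  | 295W | $6,300  |
| RTX 5000 | 3072  | 1350     | 11.2 TFLOPS | 16 GB [5] | 448 GB/s  | 265W | $2,300  |
| RTX 4000 | 2304  | ???      | 7.1 TFLOPS  | 8 GB [1]  | 416 GB/s  | 160W | $900    |
| TITAN V  | 5120  | 1200     | 14.9 TFLOPS | 12 GB [4] | 653 GB/s  | 250W | $2,999  |
| P6000    | 3840  | 1417     | 11.8 TFLOPS | 24 GB [6] | 432 GB/s  | 250W | $4,999  |
| P5000    | 2560  | 1607     | 8.9 TFLOPS  | 16 GB [6] | 288 GB/s  | 180W | $1,999  |
| P4000    | 1792  | 1227     | 5.3 TFLOPS  | 8 GB [3]  | 243 GB/s  | 105W | $799    |
| P2000    | 1024  | 1370     | 3.0 TFLOPS  | 5 GB [3]  | 140 GB/s  | 75W  | $399    |
| P1000    | 640   | 1354     | 1.9 TFLOPS  | 4 GB [3]  | 80 GB/s   | 47W  | $299    |
| P620     | 512   | 1354     | 1.4 TFLOPS  | 2 GB [3]  | 80 GB/s   | 40W  | $199    |
| P600     | 384   | 1354     | 1.2 TFLOPS  | 2 GB [3]  | 64 GB/s   | 40W  | $179    |
| P400     | 256   | 1070     | 0.6 TFLOPS  | 2 GB [3]  | 32 GB/s   | 30W  | $139    |

Notes: [1] GDDR6; [2] GDDR5X; [3] GDDR5; [4] HBM2; [5] GDDR6 (ECC); [6] GDDR5X (ECC); [7] GDDR5 (ECC); [8] HBM2 (ECC)
Architecture: P = Pascal; V = Volta; RTX = Turing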

We’re not sure of the exact base clock speed of the RTX 4000, but we do know it peaks at 7.1 TFLOPS FP32, which puts it in the same performance category as the GeForce RTX 2070 on the gaming side of the market. Based on our knowledge of that GPU, the RTX 4000 would offer great gameplay at both 1080p and 1440p, and with the right game, 4K could be possible, too.
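For the curious, the missing clock can be roughly estimated from the numbers we do have. The sketch below assumes the usual convention that peak FP32 throughput counts two floating-point operations (one fused multiply-add) per CUDA core per clock; the function name is ours, purely for illustration. Keep in mind that the result is the clock implied by the peak figure, which appears to correspond to a boost clock rather than the base clock listed in the table (the P4000 sanity check bears that out).

```python
def implied_clock_mhz(peak_tflops: float, cuda_cores: int,
                      flops_per_core_per_clock: int = 2) -> float:
    """Estimate the clock (in MHz) behind a quoted peak-FLOPS figure.

    Assumes peak = cores * flops_per_core_per_clock * clock, where the
    default of 2 represents one fused multiply-add per core per clock.
    """
    flops = peak_tflops * 1e12
    clock_hz = flops / (cuda_cores * flops_per_core_per_clock)
    return clock_hz / 1e6


# Core counts and peak FP32 figures are taken from the lineup table.
rtx4000_clock = implied_clock_mhz(7.1, 2304)
p4000_clock = implied_clock_mhz(5.3, 1792)

print(f"RTX 4000 implied clock: ~{rtx4000_clock:.0f} MHz")
print(f"P4000 implied clock:    ~{p4000_clock:.0f} MHz (table base clock: 1227 MHz)")
```

Since the P4000's implied clock lands well above its listed 1227MHz base, the RTX 4000 estimate should be read as a ballpark for its peak clock, not as the unknown base clock.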

The bigger Quadro RTX cards escalate both the performance and the price just the same, with the top-end $10,000 RTX 8000 offering a staggering 48GB of GDDR6 ECC memory. Should you have heavier memory requirements or the need for better-than-average performance, the RTX 5000 should be a consideration, unless budget constraints act as a roadblock.

The Quadro P4000 is a 5.3 TFLOPS card, so based on peak FP32 alone, the new RTX 4000 is 34% faster at roughly the same price point. That performance boost hasn’t come without the addition of some watts, but the 160W TDP still allows this 4000-series card to remain a single-slot solution. The card’s power connector is at the end, not the top, which should suit smaller form-factor PCs better.
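The arithmetic behind that comparison is simple enough to reproduce. This is a minimal sketch using only the paper specs from the lineup table; the dictionary keys and helper name are ours, and TDP is a rating rather than measured power draw, so treat the wattage figure as a rough guide.

```python
def percent_gain(new: float, old: float) -> float:
    """Return how much larger `new` is than `old`, as a percentage."""
    return (new / old - 1.0) * 100.0


# Figures copied from the lineup table (peak FP32, TDP, launch price).
p4000 = {"tflops": 5.3, "tdp_w": 105, "price_usd": 799}
rtx4000 = {"tflops": 7.1, "tdp_w": 160, "price_usd": 900}

fp32_gain = percent_gain(rtx4000["tflops"], p4000["tflops"])
tdp_gain = percent_gain(rtx4000["tdp_w"], p4000["tdp_w"])
price_gain = percent_gain(rtx4000["price_usd"], p4000["price_usd"])

print(f"Peak FP32: +{fp32_gain:.0f}%")
print(f"TDP:       +{tdp_gain:.0f}%")
print(f"Price:     +{price_gain:.0f}%")
```

Paper specs only tell part of the story, which is why the benchmarks on the following pages matter more than any of these ratios.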

Compared to previous generations, there’s more than immediately meets the eye with Quadro RTX. The architecture bump from Pascal to Turing in itself represents a big boost in performance (and efficiency), but the addition of Tensor and RT cores helps set RTX apart from the rest of the market. As covered above, Tensors will prove useful in deep-learning and AI, while the RT cores can be used to take advantage of real-time ray tracing in applications which support it.

In the table below, we highlight the performance differences between the four currently available Quadro RTX cards. Turing’s extra processors led NVIDIA to create the “RTX-OPS” performance metric to account for them; the higher the number, the more capable the card is overall.

NVIDIA’s Quadro RTX Performance
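| Card     | RT Cores | RTX-OPS | Rays Cast      | FP16        | INT8       | DL TFLOPS    |
|----------|----------|---------|----------------|-------------|------------|--------------|
| RTX 8000 | 72       | 84 T    | 10 Giga Rays/s | 32.6 TFLOPS | 206.1 TOPS | 130.5 TFLOPS |
| RTX 6000 | 72       | 84 T    | 10 Giga Rays/s | 32.6 TFLOPS | 206.1 TOPS | 130.5 TFLOPS |
| RTX 5000 | 48       | 62 T    | 8 Giga Rays/s  | 22.3 TFLOPS | 178.4 TOPS | 89.2 TFLOPS  |
| RTX 4000 | 36       | 43 T    | 6 Giga Rays/s  | 14.2 TFLOPS | 28.5 TOPS  | 57 TFLOPS    |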

On all Pascal-based cards aside from the GP100, both half- and double-precision compute were crippled, with the performance on offer being all but worthless to those who could have taken advantage of it. With Turing, double-precision is still restricted to the highest-end cards, but the leash has been taken off FP16, which gives the RTX 4000 14.2 TFLOPS to work with.
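To put the unlocked FP16 in perspective, the ratios can be pulled straight from the two tables above. This is just a quick sketch over the published figures (which are rounded, so the ratios are approximate); the variable names are ours.

```python
# (peak FP32, FP16, DL) in TFLOPS, copied from the two tables above.
cards = {
    "RTX 8000": (16.3, 32.6, 130.5),
    "RTX 6000": (16.3, 32.6, 130.5),
    "RTX 5000": (11.2, 22.3, 89.2),
    "RTX 4000": (7.1, 14.2, 57.0),
}

# FP16 throughput relative to FP32, and Tensor (DL) throughput relative to FP32.
fp16_ratio = {name: fp16 / fp32 for name, (fp32, fp16, _) in cards.items()}
dl_ratio = {name: dl / fp32 for name, (fp32, _, dl) in cards.items()}

for name in cards:
    print(f"{name}: FP16 = {fp16_ratio[name]:.2f}x FP32, "
          f"DL = {dl_ratio[name]:.2f}x FP32")
```

Across all four cards, the quoted FP16 rate works out to about double the FP32 rate, and the Tensor-backed DL figure to about eight times FP32.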

With its Vega-based GPUs, AMD has offered unlocked half-precision for a couple of years, but it will still have a difficult time competing in deep-learning thanks to its lack of Tensor cores, or something comparable. AMD has talked about its future GPUs that will include similar technologies, so for now, we wait to see what the red team gets up to.

At the moment, there are a number of suites out there that support NVIDIA RTX technologies, and we’ve only begun to explore some of them. We have benchmarks included in this article that take great advantage of Turing itself, but as for the Tensors and RT cores, further analysis on those will come later.

Test PC & What We Test

On the following pages, you’ll find the results of our WS GPU test gauntlet. The tests chosen cover a wide range of scenarios, from rendering to compute, and include both synthetic benchmarks and tests with real-world applications from the likes of Adobe and Autodesk.

Seven graphics cards in total have been tested for this article: the six seen in our Radeon VII review from a few weeks ago, plus the Quadro P4000, since it acts as a useful comparison to the RTX 4000 that replaces it. Another interesting comparison will be AMD’s Radeon Pro WX 8200, which was released last fall at around the same price point ($999).

Here are the specs of the test machine:

Techgage Workstation Test System
| Component    | Details                                     |
|--------------|---------------------------------------------|
| Processor    | Intel Core i9-7980XE (18-core; 2.6GHz)      |
| Motherboard  | ASUS ROG STRIX X299-E GAMING                |
| Memory       | HyperX FURY (4x16GB; DDR4-2666 16-18-18)    |
| Graphics     | AMD Radeon VII (16GB; Jan 22 Press Driver)  |
|              | AMD Radeon Pro WX 8200 (8GB; 18.Q4.1)       |
|              | NVIDIA GeForce RTX 2080 Ti (11GB; 417.71)   |
|              | NVIDIA TITAN Xp (12GB; 417.71)              |
|              | NVIDIA Quadro RTX 4000 (8GB; 412.16)        |
|              | NVIDIA Quadro P6000 (24GB; 412.16)          |
|              | NVIDIA Quadro P4000 (8GB; 412.16)           |
| Audio        | Onboard                                     |
| Storage      | Kingston KC1000 960GB M.2 SSD               |
| Power Supply | Corsair 80 Plus Gold AX1200                 |
| Chassis      | Corsair Carbide 600C Inverted Full-Tower    |
| Cooling      | NZXT Kraken X62 AIO Liquid Cooler           |
| Et cetera    | Windows 10 Pro build 17763 (1809)           |

For an in-depth pictorial look at this build, head here.

Benchmark results are categorized and spread across the next four pages. On page 2, Adobe’s Premiere Pro and MAGIX’s Vegas Pro lead our encoding tests, with both AVC and HEVC codecs taken care of. On the same page, Sandra’s financial and scientific performance can be seen, as well as cryptography.

On page 3, a few renderers are taken care of. These include the popular open-source design suite Blender, as well as LuxMark, and Radeon ProRender. For NVIDIA-specific renderers, Redshift, V-Ray, and OctaneRender also make an appearance.

Page 4 is home to viewport performance, covered with the help of SPEC and its SPECviewperf suite. In total, eight test results are featured here, covering important design suites like CATIA, SolidWorks, Siemens NX, and Creo, as well as Autodesk’s 3ds Max and Maya.

Without further ado, let’s get this train moving.

Rob Williams

Rob founded Techgage in 2005 to be an 'Advocate of the consumer', focusing on fair reviews and keeping people apprised of news in the tech world. The site caters to enthusiasts and businesses alike, from desktop gaming to professional workstations and all the supporting software.
