NVIDIA’s Fastest Graphics Card Ever: A Look At The Quadro P6000

by Rob Williams on February 14, 2017 in Graphics & Displays

NVIDIA’s latest and greatest-ever workstation graphics card has arrived: Quadro P6000. This top-tier card is built around NVIDIA’s Pascal architecture, which is produced on a 16nm FinFET process. The card boasts an impressive 3,840 CUDA cores, not to mention 24GB of super-fast GDDR5X. Let’s check it out.

Introduction

When NVIDIA released its Pascal GeForce series last spring and delivered downright impressive performance, we knew that the company’s Pascal Quadros were going to be something special. And well, the P6000 in particular does prove to be a very special card indeed, for a multitude of reasons.

Considering the fact that NVIDIA’s Maxwell-based Quadro M6000 shared similar specs with the first-gen GeForce TITAN X, it’s easy to jump to conclusions and assume that the P6000 is spec-comparable to the second-gen TITAN X. Well, the two cards are in fact similar, but NVIDIA managed to cram an additional 256 CUDA cores into the P6000, giving it a slight performance boost and securing its right to bear the title: “Fastest NVIDIA GPU Ever!”

As covered last week, NVIDIA has just fleshed out its entire Pascal-based Quadro lineup, now offering options to fit all budgets. The Quadro P6000 sits proud at the top, and like previous generation top-tier Quadros, the P6000 is priced at around $5,000 USD, with Newegg currently offering it for $5,400.

Despite being available for a couple of months now, the P6000 remains difficult to find at etail. Newegg seems to be an exception here; Amazon doesn’t offer a single Pascal Quadro at the moment. System builders like BOXX do, but with the warning of “extended lead time”. So while the P6000 is undeniably the fastest GPU NVIDIA has ever crafted, it might take a little bit of time to acquire.

Nonetheless, let’s take a harder look at what we’re dealing with:

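NVIDIA Pascal Quadro Roundup

| Card | Cores | Core MHz | Memory | Mem MHz | Mem Bus | TDP |
|---|---|---|---|---|---|---|
| Quadro GP100 | 3584 (FP32); 1792 (FP64) | TBD | 16GB [1] | TBD | TBD | TBD |
| Quadro P6000 | 3840 | 1417 | 24GB [2] | 9008 | 384-bit | 250W |
| Quadro P5000 | 2560 | 1607 | 16GB [2] | 9008 | 256-bit | 180W |
| Quadro P4000 | 1792 | TBD | 8GB [3] | TBD | TBD | TBD |
| Quadro P2000 | 1024 | TBD | 5GB [3] | TBD | TBD | TBD |
| Quadro P1000 | 640 | TBD | 4GB [3] | TBD | TBD | TBD |
| Quadro P600 | 384 | TBD | 2GB [3] | TBD | TBD | TBD |
| Quadro P400 | 256 | TBD | 2GB [3] | TBD | TBD | TBD |

[1] HBM2; [2] GDDR5X; [3] GDDR5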
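
For a rough sense of what the memory columns mean in practice, peak bandwidth can be estimated from the Mem MHz and Mem Bus values. This is only a back-of-the-envelope sketch: it assumes the Mem MHz figure is the effective per-pin data rate, and the function name is purely illustrative.

```python
def memory_bandwidth_gbs(effective_mhz: float, bus_width_bits: int) -> float:
    """Estimate peak memory bandwidth in GB/s (decimal gigabytes).

    Assumption: effective_mhz is the effective per-pin data rate,
    so each transfer moves bus_width_bits / 8 bytes.
    """
    bytes_per_transfer = bus_width_bits / 8
    return effective_mhz * 1e6 * bytes_per_transfer / 1e9


# Only the two cards with published memory clocks in the roundup table.
cards = {
    "Quadro P6000": (9008, 384),
    "Quadro P5000": (9008, 256),
}

for name, (mhz, bus_bits) in cards.items():
    print(f"{name}: {memory_bandwidth_gbs(mhz, bus_bits):.1f} GB/s")
```

Under that assumption, the P6000 works out to roughly 432 GB/s and the P5000 to roughly 288 GB/s, with the gap coming entirely from the wider bus.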

To address the elephant in the room, the Quadro GP100 is different from the P6000 in its focus (and price; I’d expect the GP100 to cost at least 25% more). The GP100 is unique in that it bundles in dedicated CUDA cores for ultra-fast double-precision floating-point performance. Whereas the P6000 musters ~375 GFLOPS of DP performance, the GP100 stomps that with its 5 TFLOPS.

It’s also worth noting that the GP100 is ideal for those seeking out fast half-precision performance, as it boasts the incredible promise of 20 TFLOPS FP16 (2xFP32). Since the P6000 is a GP102 chip, it doesn’t have the same FP16 scaling, and in fact, the half-precision performance is 1/64 of its FP32 rate, or roughly 187 GFLOPS – yes, half the performance of its FP64 rating.
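
The arithmetic behind those P6000 figures is easy to check. The sketch below starts from the 12 TFLOPS FP32 number quoted for the card in this article; the 1/32 FP64 ratio is my inference from the ~375 GFLOPS figure rather than something stated outright, and the helper name is made up for illustration.

```python
FP32_TFLOPS = 12.0  # P6000 single-precision figure quoted in this article


def derived_gflops(fp32_tflops: float, divisor: int) -> float:
    """Convert an FP32 rate in TFLOPS to a reduced-rate figure in GFLOPS."""
    return fp32_tflops * 1000 / divisor


# Assumption: FP64 runs at 1/32 of FP32 (inferred from ~375 GFLOPS).
fp64_gflops = derived_gflops(FP32_TFLOPS, 32)
# Stated in the article: FP16 runs at 1/64 of FP32.
fp16_gflops = derived_gflops(FP32_TFLOPS, 64)

print(f"FP64: {fp64_gflops} GFLOPS")
print(f"FP16: {fp16_gflops} GFLOPS")
print(f"FP16 / FP64: {fp16_gflops / fp64_gflops}")
```

The last ratio comes out to one half, which is why the half-precision rate ends up at half the double-precision rate on this chip.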

That all said, the GP100 is designed almost as a solution for those who require both a high-end Quadro and a high-end Tesla, where huge graphics performance is needed alongside market-leading compute.

The P6000 still does have one trick up its sleeve, though, and that’s 256 CUDA cores over the GP100. That means that for typical Quadro workloads, the P6000 is going to be faster overall. It’s when compute becomes an important requirement that the GP100 should be opted for instead.

The table below helps illustrate the improvements NVIDIA’s made to its top-end Quadro over the past couple of generations. Both the K6000 and M6000 included 12GB of VRAM at launch, although the second-gen M6000 bumped that to 24GB, preemptively matching the P6000. Both single- and double-precision performance have seen significant increases with each new generation, and the same applies to the chip’s complexity.

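NVIDIA Quadro Generational Improvements

| Card | Process | TDP | FP32 | FP64 | Memory | Transistors |
|---|---|---|---|---|---|---|
| Quadro P6000 | 16nm | 250W | 12 TFLOPS | 375 GFLOPS | 24GB | 12 Billion |
| Quadro M6000 | 28nm | 250W | 7 TFLOPS | 190 GFLOPS | 12GB | 8 Billion |
| Quadro K6000 | 28nm | 225W | 5.2 TFLOPS | 173 GFLOPS | 12GB | 7.1 Billion |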
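
To put those generational numbers in perspective, here is a quick sketch that works out the FP32 gain from one generation to the next, plus a simple FP32-per-watt figure. It uses only the TDP and FP32 values from the table; dividing peak throughput by TDP is my own rough efficiency proxy, not a measured result.

```python
# (name, TDP in watts, FP32 in TFLOPS), oldest to newest, from the table.
cards = [
    ("Quadro K6000", 225, 5.2),
    ("Quadro M6000", 250, 7.0),
    ("Quadro P6000", 250, 12.0),
]


def gflops_per_watt(fp32_tflops: float, tdp_watts: float) -> float:
    """Peak FP32 GFLOPS divided by TDP (a crude efficiency proxy)."""
    return fp32_tflops * 1000 / tdp_watts


for name, tdp, tflops in cards:
    print(f"{name}: {gflops_per_watt(tflops, tdp):.1f} GFLOPS/W")

# Generation-over-generation FP32 gain.
for (old_name, _, old_tflops), (new_name, _, new_tflops) in zip(cards, cards[1:]):
    gain = new_tflops / old_tflops
    print(f"{old_name} -> {new_name}: {gain:.2f}x FP32")
```

By this measure, the jump from Maxwell to Pascal is noticeably larger than the one from Kepler to Maxwell, and it arrives at the same 250W TDP.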

Like the Quadro M6000, the P6000 includes 4x DisplayPort connectors in addition to a single DVI-D connector. A single card can support: 8K @ 30Hz, 5K @ 60Hz, and 4K @ 60Hz. I am not sure if multiple 8K monitors can be used off of a single card, but NVIDIA does give explicit support for 5K and 4K x 4.

NVIDIA Quadro P6000 Package Contents

PNY’s Quadro P6000 includes 3x DP-to-DVI adapters, a stereo extension card, and in case your power supply doesn’t include an 8-pin connector, a dual 6-pin to 8-pin adapter.

Alongside the Quadro P6000 is an update to another piece of NVIDIA gear: Quadro Sync. With Quadro Sync II, users can combine the efforts of up to four GPUs to make certain that the frames outputted to their displays are in perfect sync. In the vast majority of usage cases where multiple displays (or even windows) are used, an absolute perfect sync might not matter, but there are other use cases – like broadcast – where it’s imperative.

NVIDIA Quadro Sync II Card

Quadro Sync used to be called “G-SYNC”, before that name was repurposed for a gaming technology in NVIDIA’s GeForce line. Whereas on the gaming side, monitors with G-SYNC technology baked-in are required (along with an NVIDIA graphics card), Quadro Sync II can synchronize frames regardless of the monitor model. The card calls the shots, not the monitors. Tying further into the broadcast example, the Sync II card can also be used to generate a house sync, saving you money if you don’t already own a sync generator (but need one).

Before moving into performance, there are a couple of other quick things to mention. The memory solution on the P6000, and also the P5000, is super-fast GDDR5X, much like it is with NVIDIA’s top-end Pascal gaming cards. On these Quadros, though, users are able to enable ECC mode if it’s needed (or simply desired).

While it hasn’t been covered up to this point, the VR push on the latest Quadros is in overdrive, with NVIDIA trying to prove that VR will be huge in the enterprise space – something I agree with. Over the past year, I’ve experienced a handful of VR demos, some revolving around Iray, and after spending just a few moments with each, it’s not hard to understand what kind of impact VR can have for product or video creation, or even architecture, for that matter. With NVIDIA’s annual GPU Technology Conference set to take place this May, I’m sure we’ll be finding many cool examples of this there.

Performance Testing The Quadro P6000

On the following pages, we’ll be putting NVIDIA’s latest top-end Quadro through a gauntlet of real-world and synthetic tests, utilizing apps from Autodesk, Adobe, SPEC, SiSoftware, and a handful of others (including light gaming tests for good measure).

All tests are run at least twice to produce an accurate result, and if for some reason an odd result crops up, we do a third run. In the case of this particular review, no tests had to go that route, as most of the benchmarks are very good at delivering similar results with each repeated run.
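
For the curious, the repeat policy described above could be sketched roughly like this. It is an illustration only: the 3% tolerance, the choice to average two agreeing runs, and the use of the median after a third run are all assumptions of this sketch, as are the function and parameter names. None of it reflects an actual script used for this review.

```python
from statistics import mean, median
from typing import Callable


def benchmark(run_once: Callable[[], float], tolerance: float = 0.03) -> float:
    """Run a test twice; add a third run only if the first two disagree.

    Assumptions (not from the article): results within `tolerance` of each
    other count as agreeing and are averaged; otherwise the median of three
    runs is reported so a single odd result is discarded.
    """
    first, second = run_once(), run_once()
    largest = max(abs(first), abs(second))
    if largest == 0 or abs(first - second) / largest <= tolerance:
        return mean([first, second])
    third = run_once()
    return median([first, second, third])


# Usage with canned scores standing in for a real benchmark run.
fake_scores = iter([100.0, 80.0, 99.0])
print(benchmark(lambda: next(fake_scores)))
```

In the canned example, the first two scores disagree by far more than the tolerance, so a third run is taken and the outlier is dropped.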

Our Windows 7 Ultimate x64 test OS has a couple of key Windows services disabled (Search, Defender, Firewall, and Update), as well as Aero. During all testing, the display is kept in 4K resolution, with two exceptions: SPECapc Maya 2012 and SPECviewperf are run with a 1080p resolution. Further, Vsync, G-SYNC, and FreeSync are disabled.

Our test system is as follows:

Techgage Workstation Test System

| Component | Details |
|---|---|
| Processor | Intel Core i7-5960X (8-core; 3GHz) |
| Motherboard | ASUS X99-DELUXE |
| Memory | Corsair Vengeance 32GB (8x4GB; DDR3-2133 11-12-11) |
| Graphics | NVIDIA GeForce GTX TITAN X 12GB (GeForce 353.30) |
| | NVIDIA Quadro P6000 24GB (Quadro 376.62) |
| | NVIDIA Quadro M6000 12GB (Quadro 352.86) |
| | NVIDIA Quadro M2000 4GB (Quadro 362.13) |
| | NVIDIA Quadro K5200 8GB (Quadro 353.30) |
| | NVIDIA Quadro K5000 4GB (Quadro 353.30) |
| | AMD Radeon Pro WX 5100 8GB (16.12.1) |
| | AMD Radeon Pro WX 4100 4GB (16.12.1) |
| | AMD FirePro W4300 4GB (FirePro 15.201) |
| Audio | Onboard |
| Storage | Kingston HyperX 3K 480GB SSD |
| Power Supply | Cooler Master Silent Pro Hybrid 1300W |
| Chassis | Cooler Master Storm Trooper |
| Cooling | Thermaltake WATER3.0 Extreme Liquid |
| Displays | Acer XB280HK 28″ 4K G-SYNC Monitor |
| Et cetera | Windows 7 Professional 64-bit |

With that all covered, it’s time to jump right into the test results.