NVIDIA’s Fastest Graphics Card Ever: A Look At The Quadro P6000

by Rob Williams on February 14, 2017 in Graphics & Displays

NVIDIA’s latest and greatest-ever workstation graphics card has arrived: the Quadro P6000. This top-tier card is built around NVIDIA’s Pascal architecture, which is produced on a 16nm FinFET process. The card boasts an impressive 3,840 CUDA cores, not to mention 24GB of super-fast GDDR5X. Let’s check it out.

Page 1 – Introduction

April 30, 2018 Addendum: Updated performance can be found here.

When NVIDIA released its Pascal GeForce series last spring and delivered downright impressive performance, we knew that the company’s Pascal Quadros were going to be something special. And well, the P6000 in particular does prove to be a very special card indeed, for a multitude of reasons.

Considering the fact that NVIDIA’s Maxwell-based Quadro M6000 shared similar specs with the first-gen GeForce TITAN X, it’s easy to jump to conclusions and assume that the P6000 is spec-comparable to the second-gen TITAN X. Well, the two cards are in fact similar, but NVIDIA managed to cram an additional 256 CUDA cores into the P6000, giving it a slight performance boost and securing its right to bear the title: “Fastest NVIDIA GPU Ever!”

As covered last week, NVIDIA has just fleshed out its entire Pascal-based Quadro lineup, now offering options to fit all budgets. The Quadro P6000 sits proud at the top, and like previous generation top-tier Quadros, the P6000 is priced at around $5,000 USD, with Newegg currently offering it for $5,400.

Despite being available for a couple of months now, the P6000 remains difficult to find at etail. Newegg seems to be an exception here; Amazon doesn’t offer a single Pascal Quadro at the moment. System builders like BOXX do, but with the warning of “extended lead time”. So while the P6000 is undeniably the fastest GPU NVIDIA has ever crafted, it might take a little bit of time to acquire.

Nonetheless, let’s take a harder look at what we’re dealing with:

	NVIDIA Pascal Quadro Roundup
	Cores	Core MHz	Memory	Mem MHz	Mem Bus	TDP
Quadro GP100	3584 (FP32) 1792 (FP64)	TBD	16GB ¹	TBD	TBD	TBD
Quadro P6000	3840	1417	24GB ²	9008	384-bit	250W
Quadro P5000	2560	1607	16GB ²	9008	256-bit	180W
Quadro P4000	1792	TBD	8GB ³	TBD	TBD	TBD
Quadro P2000	1024	TBD	5GB ³	TBD	TBD	TBD
Quadro P1000	640	TBD	4GB ³	TBD	TBD	TBD
Quadro P600	384	TBD	2GB ³	TBD	TBD	TBD
Quadro P400	256	TBD	2GB ³	TBD	TBD	TBD
¹ HBM2; ² GDDR5X; ³ GDDR5
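
For a rough sense of what the memory columns mean in practice, peak theoretical bandwidth can be estimated from the data rate and the bus width. The sketch below assumes the "Mem MHz" figure is the effective data rate in millions of transfers per second; the function and variable names are made up for illustration, and real-world throughput will land below these ceilings.

```python
def memory_bandwidth_gbs(effective_mhz: float, bus_width_bits: int) -> float:
    """Peak theoretical memory bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8          # bus width in bytes
    return effective_mhz * 1e6 * bytes_per_transfer / 1e9

# (effective data rate, bus width) taken from the roundup table above
cards = {
    "Quadro P6000": (9008, 384),
    "Quadro P5000": (9008, 256),
}

bandwidth = {name: memory_bandwidth_gbs(*spec) for name, spec in cards.items()}

for name, gbs in bandwidth.items():
    print(f"{name}: {gbs:.1f} GB/s")
# Quadro P6000: 432.4 GB/s
# Quadro P5000: 288.3 GB/s
```

With the same memory speed on both cards, the P6000's 50% wider bus accounts for its entire bandwidth advantage over the P5000.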

To address the elephant in the room, the Quadro GP100 is different from the P6000 in its focus (and price; I’d expect the GP100 to cost at least 25% more). The GP100 is unique in that it bundles in dedicated CUDA cores for ultra-fast double-precision floating-point performance. Whereas the P6000 musters ~375 GFLOPS of DP performance, the GP100 stomps that with its 5 TFLOPS.

It’s also worth noting that the GP100 is ideal for those seeking out fast half-precision performance, as it boasts the incredible promise of 20 TFLOPS FP16 (2xFP32). Since the P6000 is a GP102 chip, it doesn’t have the same FP16 scaling, and in fact, the half-precision performance is 1/64 of its FP32 rate, or roughly 187 GFLOPS – yes, half the performance of its FP64 rating.
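
The P6000 ratios quoted above are easy to sanity-check. The sketch below assumes the usual rule of thumb of two floating-point operations (one fused multiply-add) per CUDA core per clock, and takes the 12 TFLOPS FP32 rating from the generational table later on this page. The 1/32 FP64 ratio is simply what the ~375 GFLOPS figure implies, and the helper name is hypothetical.

```python
def peak_fp32_tflops(cuda_cores: int, clock_mhz: float) -> float:
    """Theoretical FP32 peak, assuming 2 FLOPs (one FMA) per core per clock."""
    return 2 * cuda_cores * clock_mhz * 1e6 / 1e12

P6000_CORES = 3840
P6000_TABLE_CLOCK_MHZ = 1417          # "Core MHz" from the roundup table
RATED_FP32_GFLOPS = 12_000            # 12 TFLOPS rating

# Peak at the table's listed clock; the rated figure is higher, which
# suggests (an assumption on my part) that it is quoted at boost clocks.
estimate_at_table_clock = peak_fp32_tflops(P6000_CORES, P6000_TABLE_CLOCK_MHZ)

# Clock that the 12 TFLOPS rating would imply under the same rule of thumb.
implied_clock_mhz = RATED_FP32_GFLOPS * 1e9 / (2 * P6000_CORES) / 1e6

fp64_gflops = RATED_FP32_GFLOPS / 32  # 1/32 of the FP32 rate
fp16_gflops = RATED_FP32_GFLOPS / 64  # 1/64 of the FP32 rate

print(f"FP32 at {P6000_TABLE_CLOCK_MHZ} MHz: {estimate_at_table_clock:.2f} TFLOPS")  # 10.88
print(f"Clock implied by 12 TFLOPS: {implied_clock_mhz:.1f} MHz")                   # 1562.5
print(f"FP64: {fp64_gflops:.1f} GFLOPS, FP16: {fp16_gflops:.1f} GFLOPS")            # 375.0, 187.5
```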

That all said, the GP100 is designed almost as a solution for those who require both a high-end Quadro and a high-end Tesla, where not just market-leading compute is needed, but huge graphics performance as well.

The P6000 still does have one trick up its sleeve, though, and that’s 256 more CUDA cores than the GP100. That means that for typical Quadro workloads, the P6000 is going to be faster overall. It’s when compute becomes an important requirement that the GP100 should be opted for instead.

The table below helps illustrate the improvements NVIDIA’s made to its top-end Quadro over the past couple of generations. Both the K6000 and M6000 included 12GB of VRAM at launch, although the second-gen M6000 bumped that to 24GB, preemptively matching the P6000. Both single- and double-precision performance have seen significant increases with each new generation, and the same applies to the chip’s complexity.

	NVIDIA Quadro Generational Improvements
	Process	TDP	FP32	FP64	Memory	Transistors
Quadro P6000	16nm	250W	12 TFLOPS	375 GFLOPS	24GB	12 Billion
Quadro M6000	28nm	250W	7 TFLOPS	190 GFLOPS	12GB	8 Billion
Quadro K6000	28nm	225W	5.2 TFLOPS	173 GFLOPS	12GB	7.1 Billion
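
To put those generational gains into perspective, the rated figures can be turned into uplift and efficiency ratios. This is only arithmetic on the table above (the dictionary layout and names are my own), and rated TDP is a stand-in for actual power draw, so treat the results as ballpark numbers.

```python
# Rated FP32 throughput and TDP from the generational table, oldest first
quadros = {
    "Quadro K6000": {"fp32_gflops": 5200, "tdp_w": 225},
    "Quadro M6000": {"fp32_gflops": 7000, "tdp_w": 250},
    "Quadro P6000": {"fp32_gflops": 12000, "tdp_w": 250},
}

# Rated FP32 GFLOPS per watt of TDP
efficiency = {
    name: spec["fp32_gflops"] / spec["tdp_w"] for name, spec in quadros.items()
}

# FP32 uplift of each card over its direct predecessor
names = list(quadros)
uplift = {
    newer: quadros[newer]["fp32_gflops"] / quadros[older]["fp32_gflops"]
    for older, newer in zip(names, names[1:])
}

for name in names:
    line = f"{name}: {efficiency[name]:.1f} GFLOPS/W"
    if name in uplift:
        line += f", {uplift[name]:.2f}x FP32 over predecessor"
    print(line)
# Quadro K6000: 23.1 GFLOPS/W
# Quadro M6000: 28.0 GFLOPS/W, 1.35x FP32 over predecessor
# Quadro P6000: 48.0 GFLOPS/W, 1.71x FP32 over predecessor
```

On paper, the move from 28nm to 16nm delivers a far bigger jump in both raw FP32 and performance-per-watt than the Kepler-to-Maxwell transition did.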

Like the Quadro M6000, the P6000 includes 4x DisplayPort connectors in addition to a single DVI-D connector. A single card can support: 8K @ 30Hz, 5K @ 60Hz, and 4K @ 60Hz. I am not sure if multiple 8K monitors can be used off of a single card, but NVIDIA does give explicit support for 5K and 4K x 4.

PNY’s Quadro P6000 includes 3x DP-to-DVI adapters, a stereo extension card, and in case your power supply doesn’t include an 8-pin connector, a dual 6-pin to 8-pin adapter.

Alongside the Quadro P6000 is an update to another piece of NVIDIA gear: Quadro Sync. With Quadro Sync II, users can combine the efforts of up to four GPUs to make certain that the frames output to their displays are in perfect sync. In the vast majority of use cases where multiple displays (or even windows) are used, absolutely perfect sync might not matter, but there are others – like broadcast – where it’s imperative.

Before “G-SYNC” became the name of a gaming technology in NVIDIA’s GeForce line, it was the name used for Quadro Sync. Whereas on the gaming side, monitors with G-SYNC technology baked-in are required (along with an NVIDIA graphics card), Quadro Sync II can synchronize frames regardless of the monitor model. The card calls the shots; not the monitors. Tying further into the broadcast example, the Sync II card can also be used to generate a house sync, saving you money if you don’t already own a sync generator (but need one).

Before moving into performance, there are a couple of other quick things to mention. The memory solution on the P6000, and also the P5000, is super-fast GDDR5X, much like it is with NVIDIA’s top-end Pascal gaming cards. On these Quadros, though, users are able to enable ECC mode if it’s needed (or simply desired).

While it hasn’t been covered up to this point, the VR push on the latest Quadros is in overdrive, with NVIDIA trying to prove that VR will be huge in the enterprise space – something I agree with. Over the past year, I’ve experienced a handful of VR demos, some revolving around Iray, and after spending just a few moments with each, it’s not hard to understand what kind of impact VR can have for product or video creation, or even architecture, for that matter. With NVIDIA’s annual GPU Technology Conference set to take place this May, I’m sure we’ll be finding many cool examples of this there.

Performance Testing The Quadro P6000

On the following pages, we’ll be putting NVIDIA’s latest top-end Quadro through a gauntlet of real-world and synthetic tests, utilizing apps from Autodesk, Adobe, SPEC, SiSoftware, and a handful of others (including light gaming tests for good measure).

All tests are run at least twice to produce an accurate result, and if for some reason an odd result creeps up, we do a third run. In the case of this particular review, no tests had to go that route, as most of the benchmarks are very good at delivering similar results with each repeated run.
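
That run-twice, maybe-thrice routine can be sketched in a few lines. To be clear, this is an illustration rather than our actual tooling: the 3% threshold for what counts as an "odd" result, the use of the median when a third run is needed, and every name in the snippet are assumptions made for the example.

```python
from statistics import median
from typing import Callable

def run_benchmark(run_once: Callable[[], float], tolerance: float = 0.03) -> float:
    """Run a benchmark twice; add a third run if the two results disagree.

    `tolerance` is a hypothetical cutoff: the relative gap between the
    first two scores beyond which a result is treated as odd.
    """
    first, second = run_once(), run_once()
    largest = max(first, second)
    spread = abs(first - second) / largest if largest > 0 else 0.0
    if spread <= tolerance:
        return (first + second) / 2          # runs agree: average them
    third = run_once()                       # odd result: do a third run
    return median([first, second, third])    # median discards the outlier

# Usage with canned scores standing in for a real benchmark
consistent = iter([100.0, 101.0])
print(run_benchmark(lambda: next(consistent)))   # 100.5

outlier = iter([100.0, 80.0, 99.0])
print(run_benchmark(lambda: next(outlier)))      # 99.0
```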

Our Windows 7 Ultimate x64 test OS has a couple of key Windows services disabled (Search, Defender, Firewall, and Update), as well as Aero. During all testing, the display is kept in 4K resolution, with two exceptions: SPECapc Maya 2012 and SPECviewperf are run with a 1080p resolution. Further, Vsync, G-SYNC, and FreeSync are disabled.

Our test system is as follows:

	Techgage Workstation Test System
Processor	Intel Core i7-5960X (8-core; 3GHz)
Motherboard	ASUS X99-DELUXE
Memory	Corsair Vengeance 32GB (8x4GB; DDR3-2133 11-12-11)
Graphics	NVIDIA GeForce GTX TITAN X 12GB (GeForce 353.30)
	NVIDIA Quadro P6000 24GB (Quadro 376.62)
	NVIDIA Quadro M6000 12GB (Quadro 352.86)
	NVIDIA Quadro M2000 4GB (Quadro 362.13)
	NVIDIA Quadro K5200 8GB (Quadro 353.30)
	NVIDIA Quadro K5000 4GB (Quadro 353.30)
	AMD Radeon Pro WX 5100 8GB (16.12.1)
	AMD Radeon Pro WX 4100 4GB (16.12.1)
	AMD FirePro W4300 4GB (FirePro 15.201)
Audio	Onboard
Storage	Kingston HyperX 3K 480GB SSD
Power Supply	Cooler Master Silent Pro Hybrid 1300W
Chassis	Cooler Master Storm Trooper
Cooling	Thermaltake WATER3.0 Extreme Liquid
Displays	Acer XB280HK 28″ 4K G-SYNC Monitor
Et cetera	Windows 7 Professional 64-bit

With that all covered, it’s time to jump right into the test results.

Page List:

1. Introduction
2. Rendering: Autodesk 3ds Max, OctaneBench, LuxMark & Cinebench
3. Encoding & CAD: Adobe Premiere Pro CC & Autodesk AutoCAD 2015
4. SPEC: SPECapc 3ds Max & Maya, SPECviewperf & SPECwpc
5. Sandra: Processing, Cryptography, Scientific, Financial & Bandwidth
6. Gaming: Futuremark 3DMark & Unigine Heaven
7. Power, Temperatures & Final Thoughts

Rob Williams

Rob founded Techgage in 2005 to be an 'Advocate of the consumer', focusing on fair reviews and keeping people apprised of news in the tech world. The site caters to enthusiasts and businesses alike, covering everything from desktop gaming to professional workstations, and all the supporting software.