April 30, 2018 Addendum: Updated performance can be found here.
When NVIDIA released its Pascal GeForce series last spring and delivered downright impressive performance, we knew that the company’s Pascal Quadros were going to be something special. And well, the P6000 in particular does prove to be a very special card indeed, for a multitude of reasons.
Considering the fact that NVIDIA’s Maxwell-based Quadro M6000 shared similar specs with the first-gen GeForce TITAN X, it’s easy to jump to conclusions and assume that the P6000 is spec-comparable to the second-gen TITAN X. Well, the two cards are in fact similar, but NVIDIA managed to cram an additional 256 CUDA cores into the P6000, giving it a slight performance boost and securing its right to bear the title: “Fastest NVIDIA GPU Ever!”
As covered last week, NVIDIA has just fleshed out its entire Pascal-based Quadro lineup, now offering options to fit all budgets. The Quadro P6000 sits proud at the top, and like previous generation top-tier Quadros, the P6000 is priced at around $5,000 USD, with Newegg currently offering it for $5,400.
Despite being available for a couple of months now, the P6000 remains difficult to find at etail. Newegg seems to be an exception here; Amazon doesn’t offer a single Pascal Quadro at the moment. System builders like BOXX do, but with the warning of “extended lead time”. So while the P6000 is undeniably the fastest GPU NVIDIA has ever crafted, it might take a little bit of time to acquire.
Nonetheless, let’s take a harder look at what we’re dealing with:
NVIDIA Pascal Quadro Roundup

| | Cores | Core MHz | Memory | Mem MHz | Mem Bus | TDP |
|---|---|---|---|---|---|---|
| Quadro GP100 | 3584 (FP32), 1792 (FP64) | TBD | 16GB [1] | TBD | TBD | TBD |
| Quadro P6000 | 3840 | 1417 | 24GB [2] | 9008 | 384-bit | 250W |
| Quadro P5000 | 2560 | 1607 | 16GB [2] | 9008 | 256-bit | 180W |
| Quadro P4000 | 1792 | TBD | 8GB [3] | TBD | TBD | TBD |
| Quadro P2000 | 1024 | TBD | 5GB [3] | TBD | TBD | TBD |
| Quadro P1000 | 640 | TBD | 4GB [3] | TBD | TBD | TBD |
| Quadro P600 | 384 | TBD | 2GB [3] | TBD | TBD | TBD |
| Quadro P400 | 256 | TBD | 2GB [3] | TBD | TBD | TBD |
To address the elephant in the room, the Quadro GP100 is different from the P6000 in its focus (and price; I’d expect the GP100 to cost at least 25% more). The GP100 is unique in that it bundles in dedicated CUDA cores for ultra-fast double-precision floating-point performance. Whereas the P6000 musters ~375 GFLOPS of DP performance, the GP100 stomps that with its 5 TFLOPS.
It’s also worth noting that the GP100 is ideal for those seeking out fast half-precision performance, as it boasts the incredible promise of 20 TFLOPS FP16 (2xFP32). Since the P6000 is a GP102 chip, it doesn’t have the same FP16 scaling, and in fact, the half-precision performance is 1/64 of its FP32 rate, or roughly 187 GFLOPS – yes, half the performance of its FP64 rating.
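For anyone who wants to check the arithmetic behind those P6000 figures, here is a minimal Python sketch. The 12 TFLOPS FP32 figure and the 1/64 FP16 ratio come from this article; the 1/32 FP64 ratio is an assumption inferred from dividing the quoted ~375 GFLOPS by 12 TFLOPS, and the function name is purely illustrative.

```python
# Derive the P6000's reduced-rate FP64 and FP16 throughput from its FP32 rate.
FP32_TFLOPS = 12.0   # P6000 FP32 figure quoted in this article
FP64_RATIO = 1 / 32  # assumption: inferred from ~375 GFLOPS / 12 TFLOPS
FP16_RATIO = 1 / 64  # ratio stated in this article


def derived_gflops(fp32_tflops: float, ratio: float) -> float:
    """Return GFLOPS for a precision that runs at `ratio` of the FP32 rate."""
    return fp32_tflops * 1000 * ratio


fp64 = derived_gflops(FP32_TFLOPS, FP64_RATIO)
fp16 = derived_gflops(FP32_TFLOPS, FP16_RATIO)

print(f"FP64: {fp64:.1f} GFLOPS")  # 375.0
print(f"FP16: {fp16:.1f} GFLOPS")  # 187.5
```

The result lines up with the numbers above: FP16 lands at exactly half of the FP64 rate.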
That all said, the GP100 is designed almost as a solution for those who require both a high-end Quadro and a high-end Tesla, where market-leading compute isn’t just needed, but also huge graphics performance.
The P6000 still does have one trick up its sleeve, though, and that’s 256 more CUDA cores than the GP100. That means that for typical Quadro workloads, the P6000 is going to be faster overall. It’s when compute becomes an important requirement that the GP100 should be opted for instead.
The table below helps illustrate the improvements NVIDIA’s made to its top-end Quadro over the past couple of generations. Both the K6000 and M6000 included 12GB of VRAM at launch, although the second-gen M6000 bumped that to 24GB, preemptively matching the P6000. Both single- and double-precision performance have seen significant increases with each new generation, and the same applies to the chip’s complexity.
NVIDIA Quadro Generational Improvements

| | Process | TDP | FP32 | FP64 | Memory | Transistors |
|---|---|---|---|---|---|---|
| Quadro P6000 | 16nm | 250W | 12 TFLOPS | 375 GFLOPS | 24GB | 12 Billion |
| Quadro M6000 | 28nm | 250W | 7 TFLOPS | 190 GFLOPS | 12GB | 8 Billion |
| Quadro K6000 | 28nm | 225W | 5.2 TFLOPS | 173 GFLOPS | 12GB | 7.1 Billion |
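To put the generational gains in rough perspective, the Python sketch below turns the FP32 and TDP figures from the table into a GFLOPS-per-watt number and a gen-over-gen FP32 ratio. Treating TDP as a stand-in for real power draw is an assumption on my part, so consider the efficiency figures ballpark at best; the names used are illustrative only.

```python
# FP32 and TDP figures taken from the generational table in this article.
specs = {
    "Quadro K6000": {"fp32_tflops": 5.2, "tdp_w": 225},
    "Quadro M6000": {"fp32_tflops": 7.0, "tdp_w": 250},
    "Quadro P6000": {"fp32_tflops": 12.0, "tdp_w": 250},
}


def gflops_per_watt(fp32_tflops: float, tdp_w: float) -> float:
    """Theoretical FP32 GFLOPS per watt, using TDP as a proxy for power draw."""
    return fp32_tflops * 1000 / tdp_w


previous_fp32 = None
for name, spec in specs.items():
    line = f"{name}: {gflops_per_watt(**spec):.1f} GFLOPS/W"
    if previous_fp32 is not None:
        # Ratio of this card's FP32 rate to the previous generation's.
        line += f", {spec['fp32_tflops'] / previous_fp32:.2f}x FP32 vs. previous gen"
    print(line)
    previous_fp32 = spec["fp32_tflops"]
```

By this rough measure, the jump from the M6000 to the P6000 is the larger of the two steps, and it comes at the same 250W TDP.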
Like the Quadro M6000, the P6000 includes 4x DisplayPort connectors in addition to a single DVI-D connector. A single card can support: 8K @ 30Hz, 5K @ 60Hz, and 4K @ 60Hz. I am not sure if multiple 8K monitors can be used off of a single card, but NVIDIA does give explicit support for 5K and 4K x 4.
PNY’s Quadro P6000 includes 3x DP-to-DVI adapters, a stereo extension card, and in case your power supply doesn’t include an 8-pin connector, a dual 6-pin to 8-pin adapter.
Alongside the Quadro P6000 is an update to another piece of NVIDIA gear: Quadro Sync. With Quadro Sync II, users can combine the efforts of up to four GPUs to make certain that the frames output to their displays are in perfect sync. In the vast majority of use cases where multiple displays (or even windows) are used, an absolute perfect sync might not matter, but there are others – like broadcast – where it’s imperative.
Before that name moved over to the gaming technology in NVIDIA’s GeForce line, Quadro Sync used to be called “G-SYNC”. On the gaming side, monitors with G-SYNC technology baked in are required (along with an NVIDIA graphics card), whereas Quadro Sync II can synchronize frames regardless of the monitor model. The card calls the shots, not the monitors. Tying further into the broadcast example, the Sync II card can also be used to generate a house sync, saving you money if you don’t already own a sync generator (but need one).
Before moving into performance, there are a couple of other quick things to mention. The memory solution on the P6000, and also the P5000, is super-fast GDDR5X, much like it is with NVIDIA’s top-end Pascal gaming cards. On these Quadros, though, users are able to enable ECC mode if it’s needed (or simply desired).
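As a rough illustration of what that memory setup means on paper, this Python sketch estimates peak bandwidth from the "Mem MHz" and "Mem Bus" figures listed for the P6000 and P5000 in the roundup table. It assumes the "Mem MHz" value is the effective data rate (one transfer per pin per cycle at that rate); the function name is illustrative.

```python
def memory_bandwidth_gbs(effective_mhz: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: transfers per second times bytes per transfer."""
    transfers_per_second = effective_mhz * 1e6
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# (effective memory MHz, bus width in bits) from the roundup table.
cards = {
    "Quadro P6000": (9008, 384),
    "Quadro P5000": (9008, 256),
}

for name, (mhz, bus) in cards.items():
    print(f"{name}: {memory_bandwidth_gbs(mhz, bus):.1f} GB/s")
```

With identical memory clocks, the P6000's theoretical advantage over the P5000 comes entirely from its wider 384-bit bus.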
While it hasn’t been covered up to this point, the VR push on the latest Quadros is in overdrive, with NVIDIA trying to prove that VR will be huge in the enterprise space – something I agree with. Over the past year, I’ve experienced a handful of VR demos, some revolving around Iray, and after spending just a few moments with each, it’s not hard to understand what kind of impact VR can have for product or video creation, or even architecture, for that matter. With NVIDIA’s annual GPU Technology Conference set to take place this May, I’m sure we’ll be finding many cool examples of this there.
Performance Testing The Quadro P6000
On the following pages, we’ll be putting NVIDIA’s latest top-end Quadro through a gauntlet of real-world and synthetic tests, utilizing apps from Autodesk, Adobe, SPEC, SiSoftware, and a handful of others (including light gaming tests for good measure).
All tests are run at least twice to produce an accurate result, and if for some reason an odd result creeps up, we do a third run. In the case of this particular review, no tests had to go that route, as most of the benchmarks are very good at delivering similar results with each repeated run.
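In script form, that run-twice-then-maybe-a-third approach could look something like the Python sketch below. This is not the harness used for this review: the 3% tolerance for what counts as an "odd" result, the use of the median as the reported value, and the `run_once` callable are all assumptions made for illustration.

```python
from statistics import median
from typing import Callable, List


def benchmark(run_once: Callable[[], float], tolerance: float = 0.03) -> float:
    """Run a test twice; add a third run if the first two disagree too much.

    `tolerance` is the allowed relative spread between the first two runs
    (an assumed threshold). The median of the collected runs is returned,
    which equals the average when only two runs were needed.
    """
    results: List[float] = [run_once(), run_once()]
    largest = max(abs(r) for r in results)
    spread = abs(results[0] - results[1]) / largest if largest else 0.0
    if spread > tolerance:
        results.append(run_once())  # odd result, so do a tie-breaking run
    return median(results)


# Usage with canned scores standing in for a real benchmark run.
consistent = iter([100.0, 101.0])
print(benchmark(lambda: next(consistent)))  # two runs agree, no third run

noisy = iter([100.0, 120.0, 101.0])
print(benchmark(lambda: next(noisy)))  # third run settles the outlier
```

Taking the median of three runs is one simple way to keep a single stray result from skewing the reported number.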
Our Windows 7 Ultimate x64 test OS has a couple of key Windows services disabled (Search, Defender, Firewall, and Update), as well as Aero. During all testing, the display is kept in 4K resolution, with two exceptions: SPECapc Maya 2012 and SPECviewperf are run with a 1080p resolution. Further, Vsync, G-SYNC, and FreeSync are disabled.
Our test system is as follows:
With that all covered, it’s time to jump right into the test results.