NVIDIA’s Fastest Graphics Card Ever: A Look At The Quadro P6000

by Rob Williams on February 14, 2017 in Graphics & Displays

NVIDIA’s latest and greatest-ever workstation graphics card has arrived: Quadro P6000. This top-tier card is built around NVIDIA’s Pascal architecture, which is produced on a 16nm FinFET process. The card boasts an impressive 3,840 CUDA cores, not to mention 24GB of super-fast GDDR5X. Let’s check it out.

Page 5 – Sandra: Processing, Cryptography, Scientific, Financial & Bandwidth

On the previous page, I mentioned that SPEC is an organization that crafts some of the best benchmarks going, and in a similar vein, I can compliment SiSoftware. This is a company that thrives on offering support for certain technologies before those technologies are even available to the consumer. In that regard, its Sandra benchmark might seem a little bleeding-edge, but at the same time, its tests are established, refined, and highly consistent across multiple runs.

For the purposes of a workstation graphics card review, we focus on four main tests: general GPU processing, cryptography, financial analysis, and scientific analysis. Some of these tests produce complex results, so those will be displayed in a table rather than a graph.

SiSoftware Sandra

GPU Processing

Sandra 2015 – GPU Processing

| Test | P6000 | M6000 | K5200 | M2000 |
| --- | --- | --- | --- | --- |
| CUDA: Single-Float | 17.38 GPix/s | 9.13 GPix/s | 4.16 GPix/s | 2.48 GPix/s |
| OpenCL: Single-Float | 15.4 GPix/s | 8.10 GPix/s | 3.37 GPix/s | 2.19 GPix/s |
| CUDA: Half-Float | 17.26 GPix/s | 9.05 GPix/s | 4.13 GPix/s | 2.47 GPix/s |
| OpenCL: Half-Float | 15.45 GPix/s | 8.2 GPix/s | 3.39 GPix/s | 2.19 GPix/s |
| CUDA: Double-Float | 646.59 MPix/s | 344.16 MPix/s | 272.68 MPix/s | 92.89 MPix/s |
| OpenCL: Double-Float | 646.76 MPix/s | 347.83 MPix/s | 268.22 MPix/s | 185.1 MPix/s |
| CUDA: Quad-Float | 27.24 MPix/s | 12.69 MPix/s | 11.54 MPix/s | 4 MPix/s |
| OpenCL: Quad-Float | 25.19 MPix/s | 13.59 MPix/s | 19.62 MPix/s | 8.37 MPix/s |

Results in pixels-per-second. 1 GPix = 1,000 MPix; 1 MPix = 1,000 kPix.

In some of the tests on the previous pages, the P6000 has struggled to shine, but Sandra is having none of that. In raw throughput, the P6000 delivers roughly double the performance of the M6000. In most tests it’s 85% to 91% faster, and in the quad-float CUDA test, the P6000 actually manages to be more than twice as fast (a 115% gain).
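For anyone who wants to check the math, here is a minimal sketch of how "percent faster" relates to "times as fast", using a few P6000 and M6000 values from the table above. The function names `speedup` and `percent_gain` are my own for illustration and are not part of Sandra.

```python
def speedup(new: float, old: float) -> float:
    """Return how many times as fast `new` is compared with `old`."""
    return new / old


def percent_gain(new: float, old: float) -> float:
    """Return the improvement of `new` over `old` as a percentage."""
    return (new / old - 1.0) * 100.0


# (P6000, M6000) pairs; both values in a row share the same unit,
# so no conversion is needed before dividing.
results = {
    "CUDA: Single-Float (GPix/s)": (17.38, 9.13),
    "CUDA: Double-Float (MPix/s)": (646.59, 344.16),
    "CUDA: Quad-Float (MPix/s)": (27.24, 12.69),
}

for name, (p6000, m6000) in results.items():
    print(f"{name}: {speedup(p6000, m6000):.2f}x, "
          f"{percent_gain(p6000, m6000):.1f}% faster")
```

A gain above 100% is the same thing as being more than twice as fast, which is why the quad-float CUDA result stands out from the rest.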

Cryptography

NVIDIA Quadro P6000 - Sandra - Cryptography (High)
NVIDIA Quadro P6000 - Sandra - Cryptography (Higher)

The awesome results keep coming for the Quadro P6000. Overall, it’s safe to say that the P6000 is twice as fast as the M6000 where encryption is concerned. I’m not sure of the exact reason for the gain, but CUDA hashing sees dramatic improvement on Pascal. Further testing showed that NVIDIA’s own driver improvements had some hand in these increases, but the architectural boost played the largest role.

Financial Analysis

Sandra 2015 – Financial Analysis (Single Precision)

| Test | P6000 | M6000 | K5200 | M2000 |
| --- | --- | --- | --- | --- |
| CUDA: Black-Scholes | 11.62 G/s | 8.14 G/s | 3.44 G/s | 2.12 G/s |
| OpenCL: Black-Scholes | 11.54 G/s | 8.10 G/s | 4.49 G/s | 1.58 G/s |
| CUDA: Binomial | 3 M/s | 1.58 M/s | 676.64 k/s | 445.48 k/s |
| OpenCL: Binomial | 3.15 M/s | 1.60 M/s | 645.42 k/s | 375.33 k/s |
| CUDA: Monte Carlo | 6.49 M/s | 3 M/s | 1.20 M/s | 883.6 k/s |
| OpenCL: Monte Carlo | 6.42 M/s | 2.81 M/s | 1.18 M/s | 756.45 k/s |

Results in options-per-second. 1 GOPS = 1,000 MOPS; 1 MOPS = 1,000 kOPS.

Sandra 2015 – Financial Analysis (Double Precision)

| Test | P6000 | M6000 | K5200 | M2000 |
| --- | --- | --- | --- | --- |
| CUDA: Black-Scholes | 1.33 G/s | 700 M/s | 541.32 M/s | 193.91 M/s |
| OpenCL: Black-Scholes | 1.3 G/s | 691.82 M/s | 533.91 M/s | 235.91 M/s |
| CUDA: Binomial | 131.83 k/s | 70.32 k/s | 52.55 k/s | 19 k/s |
| OpenCL: Binomial | 132 k/s | 71.45 k/s | 52.93 k/s | 15.79 k/s |
| CUDA: Monte Carlo | 272.54 k/s | 147.71 k/s | 112.53 k/s | 40 k/s |
| OpenCL: Monte Carlo | 272.62 k/s | 147.79 k/s | 112.43 k/s | 35.86 k/s |

Results in options-per-second. 1 GOPS = 1,000 MOPS; 1 MOPS = 1,000 kOPS.

The P6000 continues to impress here, with the size of the improvement varying from test to test, but with every improvement being substantial. The single-precision OpenCL Monte Carlo test, for example, exhibited a 128% performance boost on the P6000 versus the M6000 (which is still a seriously powerful GPU!).
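Because the financial results mix k/s, M/s and G/s within a single row, values need to be brought to a common unit before cards are compared. The sketch below shows one way to do that, assuming the suffixes follow the 1,000x steps given in the table notes. `parse_rate` and `percent_gain` are hypothetical helper names, not anything Sandra provides.

```python
# Multipliers for the unit prefixes used in the tables (options per second).
PREFIX = {"k": 1e3, "M": 1e6, "G": 1e9}


def parse_rate(text: str) -> float:
    """Convert a string such as '6.42 M/s' into options per second."""
    value, unit = text.split()
    return float(value) * PREFIX[unit[0]]


def percent_gain(new: float, old: float) -> float:
    """Return the improvement of `new` over `old` as a percentage."""
    return (new / old - 1.0) * 100.0


# OpenCL Monte Carlo, single precision: P6000 versus M6000.
p6000 = parse_rate("6.42 M/s")
m6000 = parse_rate("2.81 M/s")
print(f"P6000 vs. M6000: {percent_gain(p6000, m6000):.0f}% faster")

# CUDA Binomial, single precision: the P6000 is listed in M/s, the K5200 in k/s.
k5200 = parse_rate("676.64 k/s")
print(f"P6000 vs. K5200: {parse_rate('3 M/s') / k5200:.2f}x")
```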

Scientific Analysis

Sandra 2015 – Scientific Analysis (Single Precision)

| Test | P6000 | M6000 | K5200 | M2000 |
| --- | --- | --- | --- | --- |
| CUDA: GEMM | 5.53 TFLOPS | 3.2 TFLOPS | 1.1 TFLOPS | 951.73 GFLOPS |
| OpenCL: GEMM | 6.81 TFLOPS | 3.6 TFLOPS | 1 TFLOPS | 983.37 GFLOPS |
| CUDA: FFT | 261.88 GFLOPS | 204.3 GFLOPS | 80.8 GFLOPS | 54.77 GFLOPS |
| OpenCL: FFT | 268.44 GFLOPS | 220.7 GFLOPS | 97.0 GFLOPS | 65.24 GFLOPS |
| CUDA: NBDY | 5.78 TFLOPS | 2.9 TFLOPS | 1 TFLOPS | 915.53 GFLOPS |
| OpenCL: NBDY | 5 TFLOPS | 3 TFLOPS | 1 TFLOPS | 601.82 GFLOPS |

Results in floating-point operations-per-second. GEMM = General Matrix Multiply; FFT = Fast Fourier Transform; NBDY = N-Body Simulation.

Sandra 2015 – Scientific Analysis (Double Precision)

| Test | P6000 | M6000 | K5200 | M2000 |
| --- | --- | --- | --- | --- |
| CUDA: GEMM | 325 GFLOPS | 175.1 GFLOPS | 147.8 GFLOPS | 48.11 GFLOPS |
| OpenCL: GEMM | 325.11 GFLOPS | 174.6 GFLOPS | 148.0 GFLOPS | 49.64 GFLOPS |
| CUDA: FFT | 111.38 GFLOPS | 89.1 GFLOPS | 48.7 GFLOPS | 28 GFLOPS |
| OpenCL: FFT | 131.79 GFLOPS | 120.3 GFLOPS | 58.6 GFLOPS | 36.16 GFLOPS |
| CUDA: NBDY | 189.8 GFLOPS | 103.0 GFLOPS | 112.1 GFLOPS | 38.17 GFLOPS |
| OpenCL: NBDY | 190.25 GFLOPS | 103.6 GFLOPS | 111.9 GFLOPS | 51.18 GFLOPS |

Results in floating-point operations-per-second. GEMM = General Matrix Multiply; FFT = Fast Fourier Transform; NBDY = N-Body Simulation.

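To help wrap up our Sandra results, we have more proof that the Quadro P6000 is a seriously fast card. Among the CUDA tests, the smallest gain over the M6000 is 25% (double-precision FFT), and the largest is 99% (single-precision N-Body). The one outlier is double-precision FFT under OpenCL, where the gain shrinks to about 10%.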
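The spread of gains can be recomputed from the two scientific tables. The sketch below is a rough illustration that assumes the larger value in each row belongs to the P6000 and converts TFLOPS entries to GFLOPS by hand. The `rows` list and the `gain` helper are my own constructs, not Sandra output.

```python
# (test, P6000 GFLOPS, M6000 GFLOPS); TFLOPS values multiplied by 1,000.
rows = [
    ("CUDA GEMM (SP)", 5530.0, 3200.0),
    ("OpenCL GEMM (SP)", 6810.0, 3600.0),
    ("CUDA FFT (SP)", 261.88, 204.3),
    ("OpenCL FFT (SP)", 268.44, 220.7),
    ("CUDA NBDY (SP)", 5780.0, 2900.0),
    ("OpenCL NBDY (SP)", 5000.0, 3000.0),
    ("CUDA GEMM (DP)", 325.0, 175.1),
    ("OpenCL GEMM (DP)", 325.11, 174.6),
    ("CUDA FFT (DP)", 111.38, 89.1),
    ("OpenCL FFT (DP)", 131.79, 120.3),
    ("CUDA NBDY (DP)", 189.8, 103.0),
    ("OpenCL NBDY (DP)", 190.25, 103.6),
]


def gain(row):
    """Percentage improvement of the P6000 over the M6000 for one row."""
    _, p6000, m6000 = row
    return (p6000 / m6000 - 1.0) * 100.0


# Report the smallest and largest gain separately for each API.
for api in ("CUDA", "OpenCL"):
    subset = [r for r in rows if r[0].startswith(api)]
    worst = min(subset, key=gain)
    best = max(subset, key=gain)
    print(f"{api}: worst {worst[0]} ({gain(worst):.1f}%), "
          f"best {best[0]} ({gain(best):.1f}%)")
```

Splitting the rows by API makes it easier to see whether an unusually small or large gain comes from the hardware or from one particular code path.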


