
Maxwell Hits The Workstation: NVIDIA Quadro M6000 Graphics Card Review

Date: July 28, 2015
Author(s): Rob Williams

What NVIDIA’s GeForce TITAN X does for gaming, its Quadro M6000 does for workstations. As the company’s first Maxwell-based Quadro, the M6000 has a lot going for it: an impressive performance-per-watt rating, support for 4x 4K/60 displays, and, despite its 7 TFLOPs of performance, a need for just a single 8-pin power connector.



Introduction

NVIDIA’s latest and greatest workstation graphics card has arrived, and it is intriguing, to say the least. The Quadro M6000 is built around NVIDIA’s latest GPU microarchitecture, Maxwell, which was first seen on the gaming-oriented GeForce GTX 900 GPUs. With that comes myriad perks.

Some of those perks are to be expected. Versus Kepler, Maxwell delivers much-improved performance-per-watt, and in the particular K6000 vs. M6000 battle, the latter is about 35% faster for a gain of 25W on the TDP. At the same time, the card’s ECC memory has been made 29GB/s faster. Further, a single M6000 can support up to four monitors all running at 4K/60.
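To put the performance-per-watt claim into numbers, here is a minimal sketch that uses only the peak single-precision and TDP figures quoted in this review's spec tables. The helper names are made up for illustration, and rated TDP is only a stand-in for real power draw.

```python
# Peak single-precision throughput and rated TDP, as quoted in this review.
CARDS = {
    "Quadro M6000": {"sp_tflops": 7.0, "tdp_w": 250},
    "Quadro K6000": {"sp_tflops": 5.2, "tdp_w": 225},
}


def gflops_per_watt(card: dict) -> float:
    """Peak single-precision GFLOPs per watt of rated TDP."""
    return card["sp_tflops"] * 1000 / card["tdp_w"]


def percent_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new / old - 1) * 100


m6000, k6000 = CARDS["Quadro M6000"], CARDS["Quadro K6000"]
raw_gain = percent_gain(m6000["sp_tflops"], k6000["sp_tflops"])
efficiency_gain = percent_gain(gflops_per_watt(m6000), gflops_per_watt(k6000))

print(f"Raw SP throughput gain: {raw_gain:.1f}%")
print(f"GFLOPs-per-watt gain:   {efficiency_gain:.1f}%")
```

On paper, that works out to roughly 35% more throughput and roughly 21% more throughput per rated watt.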

The culmination of all its enhancements makes the Quadro M6000 a “beast” card; a proper ultra-high-end offering. It’s a no-compromise solution, offering 7 TFLOPs of single-precision performance and is optimized to take advantage of the latest graphics technologies (including NVIDIA’s own).

And, not that it will matter to most, it’s without question the best-looking Quadro to date.

NVIDIA Quadro M6000 - Overview

A major selling-point of NVIDIA’s Quadro M6000 is one shared with all new generations: it’s optimized for the tools people use. But, there are two things that have received a big focus this time around that I’ll be touching on a bit more on this page: optimization of the company’s iray renderer, and its Visual Computing Appliance.

Before we dive into those features and others, let’s take a moment to talk about the hardware.

Side-by-side: NVIDIA's Quadro M6000 & Quadro K5200
Video connectors: four DisplayPort & one DVI
The Quadro M6000 requires one 8-pin power connector
For the sake of improved cooling, the M6000 includes a backplate
A high-end quartet: GeForce TITAN X & GTX 980 Ti; Quadro M6000 & K5200

At the core, NVIDIA’s Quadro M6000 has similar hardware to its current top-end gaming card, the GeForce TITAN X. Outside of the firmware and driver, an important differentiator between the two is that the M6000 utilizes ECC memory (a feature also enjoyed by the K5200). Compared to the K6000, the M6000 has close to 7% more CUDA cores, an 88MHz gain on the clock, and as mentioned earlier, faster memory.

The M6000 comes equipped with 4x DP 1.2 ports as well as a lone DVI-I. This is a nice jump over the K6000, which offers just two DP 1.2 ports, and with it, users can take advantage of 4x 4K/60Hz displays.

Also worth noting is the fact that the M6000 manages to outdo the TITAN X by chopping off the 6-pin connector. All that’s needed here is a single 8-pin connector, allowing for even cleaner system builds. Like the K6000, there’s a stereo connector found at the top, and finally, like the TITAN X, the M6000 includes a backplate for the sake of increased cooling. If multiple M6000s are used, it’s recommended that the cover on the backplate be removed on all but the top GPU for optimal airflow.

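| NVIDIA Quadro | Cores | Core MHz | Memory | Mem MHz | Mem Bus | TDP | Price |
|---|---|---|---|---|---|---|---|
| Quadro M6000 | 3072 | 988 | 12288MB | 6612 | 384-bit | 250W | ~$5,000 |
| Quadro K6000 | 2880 | 900 | 12288MB | 6000 | 384-bit | 225W | ~$3,600 |
| Quadro K5200 | 2304 | 650 | 8192MB | 6000 | 256-bit | 150W | ~$1,800 |
| Quadro K4200 | 1344 | 780 | 4096MB | 5400 | 256-bit | 105W | ~$800 |
| Quadro K2200 | 640 | 1000 | 4096MB | 5000 | 128-bit | 60W | ~$430 |
| Quadro K1200 | 512 | 1058 | 4096MB | 5000 | 128-bit | 45W | ~$300 |
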
| M6000 vs. K6000 | Quadro M6000 | Quadro K6000 |
|---|---|---|
| Architecture | Maxwell | Kepler |
| SP Performance | 7.0 TFLOPs | 5.2 TFLOPs |
| Memory Bandwidth | 317 GB/s | 288 GB/s |
| ECC Memory | Yes | Yes |
| Power Connectors | 1x 8-pin | 2x 6-pin |
| Connectors | 4x DP 1.2, 1x DVI-I, 1x Stereo | 2x DP 1.2, 1x DVI-I, 1x DVI-D, 1x Stereo |
| 4K/60 Support | 4 Displays | 2 Displays |
| Quadro Sync | Yes (Up to 16 displays) | Yes (Up to 16 displays) |
| Max GPUs Per PC | 4 | 4 |
| GPU Direct for Video | Yes | Yes |
| Form Factor | 4.4″ x 10.5″ | 4.4″ x 10.5″ |
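
The memory bandwidth figures follow from the memory clock and bus width listed in the spec table. The sketch below assumes the "Mem MHz" column is the effective data rate, which the tables don't state outright; the function name is illustrative.

```python
def memory_bandwidth_gbs(effective_mhz: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: transfers per second times bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_mhz * 1e6 * bytes_per_transfer / 1e9


m6000_bw = memory_bandwidth_gbs(6612, 384)  # Quadro M6000
k6000_bw = memory_bandwidth_gbs(6000, 384)  # Quadro K6000

print(f"M6000: {m6000_bw:.1f} GB/s")
print(f"K6000: {k6000_bw:.1f} GB/s")
print(f"Difference: {m6000_bw - k6000_bw:.1f} GB/s")
```

Under that assumption, the results line up with the 317 GB/s and 288 GB/s quoted above, and with the 29GB/s gain mentioned in the introduction.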

PNY sells a single SKU of the Quadro M6000, which includes a stereo connector bracket, three DP to DVI-D SL adapters, a DVI to VGA adapter, as well as a dual 6-pin to 8-pin connector (in case the PSU used doesn’t offer an 8-pin connector).

What iray & VCA Can Do

(Some of this section was borrowed from our GTC 2015 recap article.)

NVIDIA introduced its first VCA (Visual Computing Appliance) model at 2014’s GTC, and with the Quadro M6000’s launch, it’s been given an update.

NVIDIA Quadro M6000-equipped VCA

The latest VCA includes 8x Quadro M6000s, dual Intel Xeon E5 10-core 2.8GHz processors (leading me to believe these are still v2, not v3), 256GB of system memory, 12GB of VRAM per GPU, 2TB worth of SSD storage, dual 1Gbps Ethernet ports, dual 10Gbps Ethernet ports, and one InfiniBand port. Pre-installed software includes CentOS 6.6, VCA Manager, Iray 2014 3.4+, V-Ray 3.0+, and OptiX 3.8+.

With each Quadro M6000 retailing for about $5,000, the latest VCA at $50,000 could be considered well-priced given all of the extra hardware it bundles in, and the package it’s in. Like the original VCA, the new ones can be stacked, and from what I saw at the previous GTC, stacks of 4 have been commonly used in the real world since the original launch. Even with K6000s at the helm, that’s an absurd amount of power – the type of power where a single heavily detailed ray traced scene could denoise itself to a great degree in mere seconds.

https://www.youtube.com/watch?v=8JItUtHwKiE

The above trailer is for an upcoming short film that’s rendered entirely using NVIDIA GPUs and Chaos V-Ray RT. I managed to catch a session at GTC to learn more, and I’m glad I did.

In 2014, director Kevin Margo’s real-time filming solution involved a BOXX PC equipped with a Quadro K6000 and dual Tesla K40s. Overall, the solution was quite good given the hardware, but the scenes rendered on the camera were hardly ideal given the amount of noise. Fast-forward to 2015, and Margo has performed the same filming duties while taking advantage of NVIDIA VCA cloud servers to dramatically improve the rendering time. Yup – 32 M6000s are quite a bit faster than Margo’s original tri-GPU setup.

You can check out the process with the following two videos, with the latter talking about the use of VCAs.

https://www.youtube.com/watch?v=nnaz8q6FLCk

https://www.youtube.com/watch?v=ihyRybQmmWc

After watching those, you should better understand just how much easier faster GPUs and VCAs can make the job of a CG filmmaker. In this scenario, they’ll have the option to either render a frame in real-time and view it on their camera before continuing filming, or run the recorded video in real-time before it’s rendered on a PC, pausing at any point to render that particular frame so that things like lighting can be double-checked.

NVIDIA’s iray renderer isn’t new with the Quadro M6000, but it has been vastly improved, both from a features and performance standpoint (for the latter, check out the 3ds Max results page). With it, this physically based renderer can produce some stunning results. One example is seen below, and I recommend checking out Lightworks’ gallery page for more great examples.

Autodesk 3ds Max Scene - NVIDIA Iray+

I should note that there are two versions of iray: a standard one, which ships with 3ds Max and is available separately for Maya, Revit, and others, and iray+, an advanced version. Lightworks is the exclusive reseller of iray and the developer of iray+; you can review a full list of differences in Lightworks’ technical overview, but there are two big ones to note: iray+ lets you render using NVIDIA VCAs, and it supports interactive rendering. Outside of 3ds Max, the standard iray plugin added to other software will also allow you to take advantage of rendering to VCAs.

Interactive rendering is, in effect, iterative live renders. There are multiple modes for this, including fast, direct, preview, and photoreal (each is detailed in the aforementioned guide). As an example of its use, with an ActiveShade window open in an Autodesk product, you’ll be able to preview a scene in real-time, one that will begin rendering as soon as you pause the view. This matters because it lets you get quick, basic results for a particular frame before you settle on it. You can manipulate the camera without lag to get the angle you want, let it run a few render iterations, and then decide whether or not further changes to the scene are needed.

Using Iray In An ActiveShade Window

Thanks to iray being a physically-based renderer, its use can be expanded upon even further. For example, if you want to create an advanced MAXScript, you’d be able to create a tool that lets you see how architecture is affected based on various real-world effects, like the sun. NVIDIA just so happens to have an example called “Death Ray” that highlights this capability.

Autodesk 3ds Max Scene - NVIDIA Iray+ - 20 Fenchurch Street

Designed by Uruguayan Rafael Viñoly, London’s “20 Fenchurch Street” sports quite an interesting design. Some have dubbed it the “Walkie Talkie” due to this design, and as humorous as that might be, there’s a darker consequence of its shape. What the building’s designer didn’t realize was that because the entire building was covered with glass and arced a bit inward, it would create a “Death Ray” if multiple factors aligned properly.

You might recall hearing about the Vdara hotel in Las Vegas sizzling folks in the pool when the sun hits the building at just the right angle, and if so, prepare to be surprised: Similar shape, same designer.

This is something a physically-based render can highlight before a building gets built. NVIDIA recreated London and the 20 Fenchurch Street building in 3ds Max, and developed a tool that would allow manipulation of the time of year, time of day, angle of the sun, and so forth. What you see in the below shot happened in real-life: The beam of light became so strong, that it began to melt the chassis of someone’s Jaguar.
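
To give a feel for the "time of year, time of day, angle of the sun" inputs such a tool exposes, here is a rough Python sketch of how the sun's elevation can be estimated. It is not NVIDIA's tool or a MAXScript. It uses a common textbook approximation for solar declination and ignores refraction and the equation of time, and the function name and the latitude used for London are assumptions for illustration.

```python
import math


def solar_elevation_deg(latitude_deg: float, day_of_year: int, solar_hour: float) -> float:
    """Approximate elevation of the sun above the horizon, in degrees.

    Textbook approximation only: no refraction, no equation of time.
    `solar_hour` is local solar time (12.0 = solar noon).
    """
    # Declination swings between about -23.44 and +23.44 degrees over the year.
    declination = -23.44 * math.cos(math.radians(360 / 365 * (day_of_year + 10)))
    hour_angle = 15.0 * (solar_hour - 12.0)  # degrees; 0 at solar noon
    lat, dec, ha = (math.radians(v) for v in (latitude_deg, declination, hour_angle))
    sin_elev = math.sin(lat) * math.sin(dec) + math.cos(lat) * math.cos(dec) * math.cos(ha)
    # Clamp to guard against tiny floating-point overshoot before asin().
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_elev))))


LONDON_LATITUDE = 51.5  # approximate, for illustration

for label, day in (("21 Mar", 80), ("21 Jun", 172), ("21 Sep", 264), ("21 Dec", 355)):
    elevation = solar_elevation_deg(LONDON_LATITUDE, day, 12.0)
    print(f"{label}: sun is about {elevation:.1f} degrees above the horizon at solar noon")
```

A physically based renderer does the hard part, tracing how light from that angle reflects off the modeled glass, but the sun's position is the input that changes with the date and time sliders.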

Autodesk 3ds Max Scene - NVIDIA Iray+ - Death Ray

Given the fact that both 20 Fenchurch Street and Vdara prove what can go wrong in building design, we’ll (hopefully) see physically based renderers like iray become more relied-upon in the future.

Performance Testing The Quadro M6000

On the following pages, we’ll be putting NVIDIA’s latest Quadro through a gauntlet of real-world and synthetic tests, utilizing apps from Autodesk, Adobe, SPEC, SiSoftware, and a handful of others (including light gaming tests for good measure).

All tests are run at least twice to produce an accurate result, and if for some reason an odd result creeps up, we do a third run. In the case of this particular review, few tests had to go that route, as most of the benchmarks are very good at delivering similar results with each repeated run.
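
As a rough illustration of that run-twice, rerun-if-odd policy, here is a small Python sketch. The 3% tolerance, the averaging of two agreeing runs, and the use of the median after a third run are all assumptions made for the example, not a description of our actual tooling.

```python
from statistics import median
from typing import Callable


def measure(run_benchmark: Callable[[], float], tolerance: float = 0.03) -> float:
    """Run a benchmark twice; if the results disagree by more than
    `tolerance` (relative), run it a third time and take the median."""
    first, second = run_benchmark(), run_benchmark()
    largest = max(abs(first), abs(second), 1e-12)  # avoid dividing by zero
    if abs(first - second) / largest <= tolerance:
        return (first + second) / 2
    third = run_benchmark()
    return median([first, second, third])


# Hypothetical stand-in for a real benchmark: replays canned scores.
canned_scores = iter([100.0, 112.0, 101.0])
result = measure(lambda: next(canned_scores))
print(result)  # the outlier (112.0) triggers a third run; the median is reported
```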

Our Windows 7 Ultimate x64 test OS has several key Windows services disabled (Search, Defender, Firewall, and Update), and Aero is turned off as well. During all testing, the display is kept at 4K resolution, with two exceptions: SPECapc Maya 2012 and SPECviewperf are run at 1080p. Further, Vsync and G-SYNC are disabled through the NVIDIA Control Panel.

Our test system is as follows:

Techgage Workstation Test System
Processor: Intel Core i7-5960X (8-core; 3GHz)
Motherboard: ASUS X99-DELUXE
Memory: Corsair Vengeance 32GB (8x4GB; DDR3-2133 11-12-11)
Graphics: NVIDIA GeForce GTX TITAN X 12GB (GeForce 353.30); NVIDIA Quadro M6000 12GB (Quadro 353.30); NVIDIA Quadro K5200 8GB (Quadro 353.30); NVIDIA Quadro K5000 4GB (Quadro 353.30)
Audio: Onboard
Storage: Kingston HyperX 3K 480GB SSD
Power Supply: Cooler Master Silent Pro Hybrid 1300W
Chassis: Cooler Master Storm Trooper
Cooling: Thermaltake WATER3.0 Extreme Liquid
Displays: Acer XB280HK 28″ 4K G-SYNC Monitor
Et cetera: Windows 7 Professional 64-bit

Without further ado, let’s get right to testing.

Rendering: Autodesk 3ds Max, OctaneBench, LuxMark & Cinebench

Autodesk 3ds Max 2016

Our 3ds Max testing utilizes the latest version of the suite, and with it, we render two complex scenes: a furnished room (2400×1200), and an Audi vehicle (2000×1500). Both renders make good use of NVIDIA’s iray renderer, so we chose to go with the iterative rendering option, limiting each scene to 2,500 iterations. For most scenes, that’s not going to result in production quality, but it’s more than suitable for the purposes of benchmarking.

NVIDIA Quadro M6000 - Autodesk 3ds Max 2016

Where rendering with iray is concerned, it’s clear that Quadros don’t offer a driver-level performance advantage over GeForces. In this particular test, TITAN X came ahead just a wee bit, thanks to its slightly higher clock speed. Versus the previous-gen K5200 and K5000 Quadros, it’s not hard to see the enormous performance advantages with a card like the M6000.

Special Test: iray Renderer Performance In 3ds Max 2015 & 2016

Three major factors decide the ultimate performance when it comes to rendering: the hardware, the drivers, and of course, the renderer itself. If one of these components remains stagnant, upgrades to the others could still deliver nice gains. But if all of the components get a good polishing, the performance increases can be immense.

Case in point: the iray renderer in 3ds Max 2016 performed so much better than the one in 2015 that, if you happen to take good advantage of iray, it would be just as worthwhile to update from 2015 to 2016 as it would be to upgrade to a card like the M6000.

Autodesk 3ds Max 2015 vs 2016 iray Performance

Based on our two scenes, the updated iray renderer in 3ds Max 2016 can improve rendering times by 45~80%. That means less waiting, and far improved power-efficiency. Making this reality even more impressive is the fact that these boosts are not just seen with the Maxwell-based M6000, but also the Kepler-based K5200. The K6000, K4200, et cetera, would enjoy similar gains.
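
A percentage cut in render time undersells the change in throughput. The sketch below assumes the 45~80% figure describes a reduction in render time, which is one reading of the chart rather than something stated explicitly, and converts it into a "times faster" number. The function name is illustrative.

```python
def speedup_from_time_reduction(reduction_pct: float) -> float:
    """If a render takes `reduction_pct` percent less time, return how many
    times faster it completes (e.g. 50% less time -> 2.0x)."""
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction must be at least 0 and below 100 percent")
    return 1 / (1 - reduction_pct / 100)


for pct in (45, 80):
    print(f"{pct}% less render time -> {speedup_from_time_reduction(pct):.2f}x as fast")
```

Under that reading, the low end of the range is a little under twice as fast, and the high end is about five times as fast.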

Synthetic: Cinebench, Octane & LuxMark

To compare our collection of WS GPUs across other renderers, we rely on Cinebench, Octane, and LuxMark. The latter is of particular interest as it renders using OpenCL. It also happens to be so good at what it does that we opt to use it for the sake of generating peak temperature and power information.

Cinebench
LuxMark
OctaneBench
NVIDIA Quadro M6000 - Cinebench & OctaneBench
NVIDIA Quadro M6000 - LuxMark

In both Octane and LuxMark, the M6000 and TITAN X are near-equals, but the Quadro creeps ahead in Cinebench. Once again, since there are no driver-level optimizations that help boost Quadro cards in either the Octane or LuxMark renderers, the TITAN X performs just a bit better due to its higher clocks.

Encoding & CAD: Adobe Premiere Pro CC & Autodesk AutoCAD 2015

Adobe Premiere Pro CC (2015)

To test the accelerated encoding perks of different GPUs, we make use of the de facto video editing tool Adobe Premiere Pro. In the past, we would have included After Effects results, thanks to its ability to tap into CUDA for accelerated rendering of ray traced elements, but recent versions of that app have failed to update support for Maxwell. Instead, Adobe is preferring to target the renderer bundled with PP, Cinema 4D “lite”.

The three projects are: encoding a 4K RED-shot video to 1080i (w/ MRQ), encoding a music video project to 1080p (w/ MRQ), and the resulting H.264 encode time with PPBM9.

Adobe Premiere Pro 2015
NVIDIA Quadro M6000 - Adobe Premiere Pro CC (2015)

It’s clear that not all video projects are going to see great benefit from a faster GPU; our music video project hit a ceiling, for example, but is notably slower on the K5000. PPBM saw continual improvements, and there were considerable gains with the 4K > 1080p RED encode.

Autodesk AutoCAD 2015

For CAD testing, we’re taking advantage of the excellent Cadalyst benchmark. As a 2016 version of the benchmark isn’t yet available, we’re sticking with the 2015 version.

NVIDIA Quadro M6000 - Cadalyst 2015

Yet again, both the M6000 and TITAN X are on equal footing, and both of those have great improvements in the 3D test – about 20% above the K5200.

SPEC: SPECapc 3ds Max & Maya, SPECviewperf & SPECwpc

When it comes to benchmarking hardware for serious use cases, there are no better people to turn to than those at SPEC. I like to call them the “masters of benchmarking”, as each one of their tools is meticulously crafted by professionals to deliver results as relevant and accurate as possible – a goal shared by us at Techgage.

For testing the performance of workstation cards, we take advantage of two SPECapc benchmarks – 3ds Max 2015 and Maya 2012 – as well as two that don’t require a standalone application: SPECviewperf and SPECwpc. While the Maya benchmark might be growing a little long in the tooth at this point, it still scales well with current GPUs.

SPECapc 3ds Max 2015

NVIDIA Quadro M6000 - SPECapc 3ds Max 2015
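| Test | M6000 | K5200 | K5000 | TITAN X |
|---|---|---|---|---|
| 4K 0xAA (CPU) | 5.87 | 5.86 | 5.87 | 5.87 |
| 4K 4xAA (CPU) | 5.87 | 5.84 | 5.84 | 5.87 |
| 4K 8xAA (CPU) | 5.85 | 5.81 | 5.63 | 5.88 |
| 1080p 0xAA (CPU) | 5.89 | 5.90 | 5.90 | 5.88 |
| 1080p 4xAA (CPU) | 5.88 | 5.87 | 5.82 | 5.90 |
| 1080p 8xAA (CPU) | 5.84 | 5.86 | 5.90 | 5.88 |
| 4K 0xAA (Large Model) | 4.50 | 4.44 | 3.98 | 4.47 |
| 4K 4xAA (Large Model) | 4.39 | 2.83 | 2.46 | 4.44 |
| 4K 8xAA (Large Model) | 3.95 | 2.19 | 1.80 | 4.01 |
| 1080p 0xAA (Large Model) | 4.52 | 4.53 | 4.44 | 4.55 |
| 1080p 4xAA (Large Model) | 4.53 | 3.65 | 3.07 | 4.55 |
| 1080p 8xAA (Large Model) | 4.48 | 2.90 | 2.41 | 4.55 |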

Thanks to the large 12GB framebuffer of the M6000 and TITAN X, higher resolutions with higher anti-aliasing options are possible. 4K 0xAA will offer great performance on either card, but 4xAA and especially 8xAA could prove to be a little too taxing in complex projects. In most scenarios, that wouldn’t be a memory limitation, but instead a performance one.
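
To see how much each card gives up as anti-aliasing is turned on, the sketch below takes the 4K "Large Model" scores from the table above and expresses each AA result as a percentage of the same card's 0xAA score. The scores are copied from the table; the helper name is made up for illustration.

```python
# SPECapc 3ds Max 2015 "Large Model" scores at 4K, copied from the table above.
LARGE_MODEL_4K = {
    "M6000":   {"0xAA": 4.50, "4xAA": 4.39, "8xAA": 3.95},
    "K5200":   {"0xAA": 4.44, "4xAA": 2.83, "8xAA": 2.19},
    "K5000":   {"0xAA": 3.98, "4xAA": 2.46, "8xAA": 1.80},
    "TITAN X": {"0xAA": 4.47, "4xAA": 4.44, "8xAA": 4.01},
}


def retained_pct(scores: dict, aa_mode: str) -> float:
    """Score at `aa_mode` as a percentage of the card's own 0xAA score."""
    return scores[aa_mode] / scores["0xAA"] * 100


for card, scores in LARGE_MODEL_4K.items():
    print(f"{card:8s} 4xAA: {retained_pct(scores, '4xAA'):5.1f}%   "
          f"8xAA: {retained_pct(scores, '8xAA'):5.1f}%")
```

The two 12GB cards hold on to most of their 0xAA score at 8xAA, while the K5200 and K5000 lose roughly half of theirs.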

SPECapc Maya 2012

NVIDIA Quadro M6000 - SPECapc Maya 2012
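| Test | M6000 | K5200 | K5000 | TITAN X |
|---|---|---|---|---|
| Shaded | 4.71 | 4.48 | 4.52 | 2.31 |
| Shaded HQ | 6.79 | 6.08 | 5.17 | 1.68 |
| Textured | 5.02 | 4.70 | 4.75 | 2.47 |
| Textured HQ | 7.69 | 6.85 | 5.81 | 1.86 |
| Wireframe | 4.01 | 3.81 | 3.89 | 2.45 |
| Selected | 5.46 | 5.15 | 5.18 | 2.54 |
| Highlighted | 5.29 | 4.97 | 5.01 | 2.69 |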

We saw with our 3ds Max testing that Quadros might not perform better than GeForce gaming cards in iray rendering, but there are considerable differences when it comes to viewport performance. Here, the M6000 dramatically outperforms the TITAN X. Even the (now) modest K5000 does.

One thing that’s interesting to note is that because the GeForce card isn’t optimized for Maya, it actually manages to slow down the overall CPU performance as well. Both the M6000 and K5200 are about equals in that regard.

SPECviewperf 12

Whereas both SPECapc benchmarks used above stress a variety of different components of their respective tools, SPECviewperf’s target is singular: viewport performance. One reason I like this test is because it utilizes software we couldn’t otherwise test with (due to the lack of a license); namely CATIA, SolidWorks, and Siemens NX.

NVIDIA Quadro M6000 - SPECviewperf 12

It’s with a test like this where a major perk of proper workstation cards can be seen. It goes without saying that anyone using Siemens NX is going to be equipped with a workstation graphics card, and the results seen here prove why that will remain the case. Overall, thanks to the driver optimizations of Quadro, even the K5000 manages to outperform the TITAN X in Creo, NX, and SolidWorks.

A bit of an oddball result here is with Maya, as TITAN X somehow manages to outperform the M6000 (a repeated run yielded the same result). This is interesting because we saw the opposite kind of result with our SPECapc Maya 2012 test. There, the M6000 obliterated TITAN X. That particular test aside, the M6000 reigns supreme in CATIA, Creo, Energy, Medical, NX, and SolidWorks.

SPECwpc

The “w” in SPECwpc stands for “workstation”, and it acts as a bit of an “overall” testing suite. In some ways, it takes the goals of SPEC’s other tests and combines them into a single benchmark. Thus, the results are split into six categories, and the result of one might matter more to some people than others.

NVIDIA Quadro M6000 - SPECwpc

The gains seen here between the M6000 and K5200 are not what I’d consider stark, and I’d imagine if I had a K6000 to test with, the gains would seem even more lackluster. Exceptions would be with life sciences and energy. There are definite gains all-around, but nothing likely to warrant upgrading from the previous top-end king, the K6000.

Sandra: Processing, Cryptography, Scientific, Financial & Bandwidth

On the previous page, I mentioned that SPEC is an organization that crafts some of the best benchmarks going, and in a similar vein, I can compliment SiSoftware. This is a company that thrives on offering support for certain technologies before those technologies are even available to the consumer. In that regard, its Sandra benchmark might seem a little bleeding-edge, but at the same time, its tests are established, refined, and really accurate across multiple runs.

For the purposes of a workstation graphics card review, we focus on four main tests: general GPU processing, cryptography, financial analysis, and scientific analysis. Some of these tests produce complex results, so those will be displayed in a table rather than a graph.

SiSoftware Sandra

GPU Processing

Sandra 2015 – GPU Processing

| Test | M6000 | K5200 | K5000 | TITAN X |
|---|---|---|---|---|
| CUDA: Single-Float | 9.13 GPix/s | 4.16 GPix/s | 2.57 GPix/s | 9.40 GPix/s |
| OpenCL: Single-Float | 8.10 GPix/s | 3.37 GPix/s | 2 GPix/s | 7.75 GPix/s |
| CUDA: Half-Float | 9.05 GPix/s | 4.13 GPix/s | 2.57 GPix/s | 8.53 GPix/s |
| OpenCL: Half-Float | 8.2 GPix/s | 3.39 GPix/s | 2 GPix/s | 7.53 GPix/s |
| CUDA: Double-Float | 344.16 MPix/s | 272.68 MPix/s | 144 MPix/s | 348.07 MPix/s |
| OpenCL: Double-Float | 347.83 MPix/s | 268.22 MPix/s | 140 MPix/s | 351.54 MPix/s |
| CUDA: Quad-Float | 12.69 MPix/s | 11.54 MPix/s | 6 MPix/s | 12.83 MPix/s |
| OpenCL: Quad-Float | 13.59 MPix/s | 19.62 MPix/s | 5 MPix/s | 13.76 MPix/s |

Results in pixels-per-second. 1 GPix = 1,000 MPix; 1 MPix = 1,000 kPix.

For the most part, the performance differences between CUDA and OpenCL processing are minimal, though it is notable that the latter is faster in the quad-float test. This is another test where the M6000 and TITAN X are close to being equals, but that’s not much of a surprise given NVIDIA wouldn’t optimize its drivers for synthetic tests like these. Compared to the previous-generation K5200, though, the performance differences are stark in both the single- and double-float tests.

Cryptography

NVIDIA Quadro M6000 - Sandra 2015 - Cryptography (High)
NVIDIA Quadro M6000 - Sandra 2015 - Cryptography (Higher)

This is another great test for showing the dramatic improvements Maxwell can offer over Kepler. I regret not being able to offer up K6000 results here, but comparing to the K5200, it doesn’t take much to understand that we’ve at least doubled performance from one generation to the next. The results are most impressive in the “Higher” test, which represents AES256 and SHA512 testing (versus AES256 + SHA2-256); the M6000 is 7x faster than the K5200 when using CUDA, and 3x faster when using OpenCL.

Financial Analysis

Sandra 2015 – Financial Analysis (Single Precision)

| Test | M6000 | K5200 | K5000 | TITAN X |
|---|---|---|---|---|
| CUDA: Black-Scholes | 8.14 G/s | 3.44 G/s | 1.47 G/s | 8.14 G/s |
| OpenCL: Black-Scholes | 8.10 G/s | 4.49 G/s | 1.48 G/s | 8.11 G/s |
| CUDA: Binomial | 1.58 M/s | 676.64 k/s | 381.43 k/s | 1.54 M/s |
| OpenCL: Binomial | 1.60 M/s | 645.42 k/s | 379.64 k/s | 1.53 M/s |
| CUDA: Monte Carlo | 3 M/s | 1.20 M/s | 771.30 k/s | 3 M/s |
| OpenCL: Monte Carlo | 2.81 M/s | 1.18 M/s | 689.37 k/s | 2.67 M/s |

Results in options-per-second. 1 GOPS = 1,000 MOPS; 1 MOPS = 1,000 kOPS.

Sandra 2015 – Financial Analysis (Double Precision)

| Test | M6000 | K5200 | K5000 | TITAN X |
|---|---|---|---|---|
| CUDA: Black-Scholes | 700 M/s | 541.32 M/s | 286.48 M/s | 705.90 M/s |
| OpenCL: Black-Scholes | 691.82 M/s | 533.91 M/s | 266.76 M/s | 699.64 M/s |
| CUDA: Binomial | 70.32 k/s | 52.55 k/s | 28.75 k/s | 71.14 k/s |
| OpenCL: Binomial | 71.45 k/s | 52.93 k/s | 28.79 k/s | 72.48 k/s |
| CUDA: Monte Carlo | 147.71 k/s | 112.53 k/s | 58.53 k/s | 149.18 k/s |
| OpenCL: Monte Carlo | 147.79 k/s | 112.43 k/s | 58.57 k/s | 149.16 k/s |

Results in options-per-second. 1 GOPS = 1,000 MOPS; 1 MOPS = 1,000 kOPS.

It’s here where the results become complex really fast. While some of the performance is measured in the thousands of options-per-second, some is measured in the millions – that’s an obvious problem when trying to sort it all in a graph.

Nonetheless, as with the cryptography test we can see some great performance improvements over the K5200, and also repeating itself is the fact that CUDA and OpenCL performance is quite close.
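
Because the tables mix k/s, M/s, and G/s, comparing cells by eye is error-prone. Here is a small sketch of one way to normalize the values before comparing them, using the single-precision CUDA Binomial row as input. The parser and its names are illustrative and handle only the three prefixes that appear in these tables.

```python
# Decimal prefixes used in the Sandra tables above.
PREFIXES = {"k": 1e3, "M": 1e6, "G": 1e9}


def parse_rate(text: str) -> float:
    """Convert a rate such as '676.64 k/s' into plain units per second."""
    value, unit = text.split()
    return float(value) * PREFIXES[unit[0]]


# Single-precision CUDA Binomial row, copied from the table above.
cuda_binomial = {
    "M6000": "1.58 M/s",
    "K5200": "676.64 k/s",
    "K5000": "381.43 k/s",
    "TITAN X": "1.54 M/s",
}

rates = {card: parse_rate(cell) for card, cell in cuda_binomial.items()}
baseline = rates["K5200"]
for card, rate in rates.items():
    print(f"{card:8s} {rate:>12,.0f} options/s   {rate / baseline:.2f}x vs. K5200")
```

With everything in the same unit, the M6000 comes out at a bit more than 2.3x the K5200 in this particular row.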

Scientific Analysis

Sandra 2015 – Scientific Analysis (Single Precision)

| Test | M6000 | K5200 | K5000 | TITAN X |
|---|---|---|---|---|
| CUDA: GEMM | 3.2 TFLOPS | 1.1 TFLOPS | 83.2 GFLOPS | 3.2 TFLOPS |
| OpenCL: GEMM | 3.6 TFLOPS | 1 TFLOPS | 374.1 GFLOPS | 3.4 TFLOPS |
| CUDA: FFT | 204.3 GFLOPS | 80.8 GFLOPS | 71.4 GFLOPS | 205 GFLOPS |
| OpenCL: FFT | 220.7 GFLOPS | 97.0 GFLOPS | 81 GFLOPS | 221.5 GFLOPS |
| CUDA: NBDY | 2.9 TFLOPS | 1 TFLOPS | 718.3 GFLOPS | 2.9 TFLOPS |
| OpenCL: NBDY | 3 TFLOPS | 1 TFLOPS | 622 GFLOPS | 2.9 TFLOPS |

Results in floating-point operations-per-second. GEMM = General Matrix Multiply; FFT = Fast Fourier Transform; NBDY = N-Body Simulation.

Sandra 2015 – Scientific Analysis (Double Precision)

| Test | M6000 | K5200 | K5000 | TITAN X |
|---|---|---|---|---|
| CUDA: GEMM | 175.1 GFLOPS | 147.8 GFLOPS | 10.6 GFLOPS | 177.2 GFLOPS |
| OpenCL: GEMM | 174.6 GFLOPS | 148.0 GFLOPS | 28.2 GFLOPS | 176.4 GFLOPS |
| CUDA: FFT | 89.1 GFLOPS | 48.7 GFLOPS | 18.5 GFLOPS | 89.3 GFLOPS |
| OpenCL: FFT | 120.3 GFLOPS | 58.6 GFLOPS | 22.5 GFLOPS | 120.9 GFLOPS |
| CUDA: NBDY | 103.0 GFLOPS | 112.1 GFLOPS | 63.3 GFLOPS | 101.4 GFLOPS |
| OpenCL: NBDY | 103.6 GFLOPS | 111.9 GFLOPS | 63.4 GFLOPS | 105.0 GFLOPS |

Results in floating-point operations-per-second. GEMM = General Matrix Multiply; FFT = Fast Fourier Transform; NBDY = N-Body Simulation.

Wrapping up our Sandra testing is a set of results that backs up what we’ve seen with the others so far on this page: the M6000 is on par with the TITAN X overall in non-optimized applications, and in most cases, it’s dramatically faster than the K5200.
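
The "in most cases" qualifier is easiest to see by comparing the generational gain at each precision. This sketch uses only the CUDA GEMM results from the two tables above, converted to GFLOPS; the names are illustrative, and a single benchmark row is not a full picture of either architecture.

```python
# CUDA GEMM results from the Sandra tables above, in GFLOPS.
GEMM_GFLOPS = {
    "M6000": {"single": 3200.0, "double": 175.1},
    "K5200": {"single": 1100.0, "double": 147.8},
}


def generational_gain(precision: str) -> float:
    """How many times faster the M6000 is than the K5200 at `precision`."""
    return GEMM_GFLOPS["M6000"][precision] / GEMM_GFLOPS["K5200"][precision]


for precision in ("single", "double"):
    print(f"{precision}-precision GEMM: M6000 is {generational_gain(precision):.2f}x the K5200")
```

In this row, the single-precision result nearly triples from one card to the next, while the double-precision result improves by less than 20%.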

Gaming: Futuremark 3DMark & Unigine Heaven

While workstation graphics cards have a minor focus on gaming, high-end models are capable of delivering great-looking and great-performing gameplay. NVIDIA’s Quadro M6000 is powerful enough that its users could rely on it for gaming – it’s a GeForce TITAN X at its core, after all.

So what’s the caveat? A lack of optimizations. While on the GeForce side, NVIDIA constantly rolls out updates that improve general performance in gaming or performance specific to one title, Quadro drivers don’t have such granularity where gaming’s concerned – or, if they do, I haven’t been able to get explicit confirmation of it.

I’d be surprised if multiple GPUs would even scale in SLI for gaming, but as I’ve never had dual Quadros to test out the theory, I can’t say for certain. A GeForce driver can be installed with Quadro hardware, but it’s smart enough to detect that and then install itself as a Quadro driver. It’s a bit strange that this is even possible, but nonetheless, it means there will be no need to use a second OS for the sake of gaming.

As we can see with both 3DMark and Unigine Heaven, the M6000 is a very capable gaming GPU.

Futuremark 3DMark
NVIDIA Quadro M6000 - Futuremark 3DMark
Unigine Heaven
NVIDIA Quadro M6000 - Unigine Heaven

The fact that the M6000 performs as an equal to the TITAN X in these tests leads me to believe that the Quadro drivers do include basic optimizations that improve gaming performance as new drivers are released. Quite simply, it means that 4K gaming will be possible, and with good detail settings, too.

If time permits, I’ll conduct special tests to see how this Quadro performs across a handful of games versus a TITAN X gaming card. As I see it now, though, the M6000 will definitely allow you to take a break from work to indulge in some high-performance gaming.

Power, Temperatures & Final Thoughts

To test workstation graphics cards for both their power consumption and temperature at load, we utilize a couple of different tools. On the hardware side, we use a trusty Kill-a-Watt power monitor which our GPU test machine plugs into directly. For software, we use LuxMark to stress the card, and GPU-Z to record the temperatures.

To test, the area around the chassis is checked with a temperature gun, with the average temp recorded. Once that’s established, the PC is turned on and left to sit idle for five minutes. At this point, we open GPU-Z along with LuxMark. After its initial (automatic) render is complete, we kick off a 15 minute stress-test. Following this, we monitor the Kill-a-Watt for a minute to establish peak load wattage.

NVIDIA Quadro M6000 - Temperatures
NVIDIA Quadro M6000 - Power Usage

Despite the fact that the Quadro M6000 and GeForce TITAN X are the same GPU at the core, the former manages to run a bit cooler at both idle and load. Perhaps not as interesting is the fact that it draws a few watts less – it does have a slightly slower clock speed, after all. However, it’s worth bearing in mind that the M6000 requires just one power connector (8-pin) versus the TITAN X’s two (8-pin + 6-pin). I am sure system builders are pleased about that design.

Final Thoughts

It’s not that hard to sum up NVIDIA’s Quadro M6000. As mentioned at the outset, it’s a beast of a card. Whereas GeForce TITAN X is the current king of the gaming side, the Quadro M6000 stands tall and proud at the top of the workstation GPU pile. That’s not to say it’s without fault, but it has far more going for it than against it.

The most notable downside with the M6000, and the Maxwell architecture in general, is that it isn’t suited to those who require useful double-precision performance. Kepler remains the go-to architecture for that, which is the reason PNY doesn’t yet consider the K6000 to be an EOL product. On the GeForce side, the original TITAN, TITAN Black, and dual-GPU TITAN Z all deliver fantastic DP performance; on the Tesla side, the K20~K80 do.

NVIDIA Quadro M6000 - Glamour Shot

An obvious caveat of NVIDIA’s top-end Quadro (and Tesla, for that matter) is that its price tag will keep it out of the hands of most small operations. For the $5,000 asking price, you’ll need to do some cost analysis to gauge the ultimate worth of such a high-end model. On account of the sub-K6000 cards being released just this past fall, it seems unlikely we’ll see other Maxwell-based Quadros anytime soon. And unless AMD pulls a fierce competitor out of its hat with its next-gen GCN-based cards, the M6000 could remain on top for quite some time.

Pros

Cons

NVIDIA Quadro M6000 Workstation Graphics Card - Techgage Editor's Choice
NVIDIA Quadro M6000 Workstation Graphics Card
