Date: February 14, 2017
Author(s): Rob Williams
NVIDIA’s latest and greatest-ever workstation graphics card has arrived: Quadro P6000. This top-tier card is built around NVIDIA’s Pascal architecture, which is produced on a 16nm FinFET process. The card boasts an impressive 3,840 CUDA cores, not to mention 24GB of super-fast GDDR5X. Let’s check it out.
When NVIDIA released its Pascal GeForce series last spring and delivered downright impressive performance, we knew that the company’s Pascal Quadros were going to be something special. And well, the P6000 in particular does prove to be a very special card indeed, for a multitude of reasons.
Considering the fact that NVIDIA’s Maxwell-based Quadro M6000 shared similar specs with the first-gen GeForce TITAN X, it’s easy to jump to conclusions and assume that the P6000 is spec-comparable to the second-gen TITAN X. Well, the two cards are in fact similar, but NVIDIA managed to cram an additional 256 CUDA cores into the P6000, giving it a slight performance boost and securing its right to bear the title: “Fastest NVIDIA GPU Ever!”
As covered last week, NVIDIA has just fleshed out its entire Pascal-based Quadro lineup, now offering options to fit all budgets. The Quadro P6000 sits proud at the top, and like previous generation top-tier Quadros, the P6000 is priced at around $5,000 USD, with Newegg currently offering it for $5,400.
Despite being available for a couple of months now, the P6000 remains difficult to find at etail. Newegg seems to be an exception here; Amazon doesn’t offer a single Pascal Quadro at the moment. System builders like BOXX do, but with the warning of “extended lead time”. So while the P6000 is undeniably the fastest GPU NVIDIA has ever crafted, it might take a little bit of time to acquire.
Nonetheless, let’s take a harder look at what we’re dealing with:
|NVIDIA Pascal Quadro Roundup|
|Cores||Core MHz||Memory||Mem MHz||Mem Bus||TDP|
|Quadro GP100||3584 (FP32)|
|Quadro P6000||3840||1417||24GB 2||9008||384-bit||250W|
|Quadro P5000||2560||1607||16GB 2||9008||256-bit||180W|
|Quadro P4000||1792||TBD||8GB 3||TBD||TBD||TBD|
|Quadro P2000||1024||TBD||5GB 3||TBD||TBD||TBD|
|Quadro P1000||640||TBD||4GB 3||TBD||TBD||TBD|
|Quadro P600||384||TBD||2GB 3||TBD||TBD||TBD|
|Quadro P400||256||TBD||2GB 3||TBD||TBD||TBD|
|1 HBM2; 2 GDDR5X; 3 GDDR5|
To address the elephant in the room, the Quadro GP100 is different from the P6000 in its focus (and price; I’d expect the GP100 to cost at least 25% more). The GP100 is unique in that it bundles in dedicated CUDA cores for ultra-fast double-precision floating-point performance. Whereas the P6000 musters ~375 GFLOPS of DP performance, the GP100 stomps that with its 5 TFLOPS.
It’s also worth noting that the GP100 is ideal for those seeking out fast half-precision performance, as it boasts the incredible promise of 20 TFLOPS FP16 (2xFP32). Since the P6000 is a GP102 chip, it doesn’t have the same FP16 scaling, and in fact, the half-precision performance is 1/64 of its FP32 rate, or roughly 187 GFLOPS – yes, half the performance of its FP64 rating.
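As a quick sanity check on those ratios, here is a minimal sketch that derives the P6000’s double- and half-precision figures from its 12 TFLOPS single-precision rating. The 1/64 half-precision ratio comes from the text above; the 1/32 double-precision ratio is an assumption inferred from the quoted ~375 GFLOPS, and the function name is purely illustrative.

```python
# Rough peak-throughput arithmetic for the Quadro P6000 (GP102).
# The FP32 rating and the 1/64 FP16 ratio are taken from the article;
# the 1/32 FP64 ratio is an assumption inferred from the ~375 GFLOPS figure.

FP32_GFLOPS = 12_000  # 12 TFLOPS single-precision


def scaled_rate(fp32_gflops: float, divisor: int) -> float:
    """Return the throughput of a precision that runs at 1/divisor of FP32."""
    return fp32_gflops / divisor


fp64 = scaled_rate(FP32_GFLOPS, 32)
fp16 = scaled_rate(FP32_GFLOPS, 64)

print(f"FP64: {fp64:.1f} GFLOPS")
print(f"FP16: {fp16:.1f} GFLOPS")
print(f"FP16 relative to FP64: {fp16 / fp64:.2f}x")
```

The last line makes the oddity explicit: on this chip, half-precision runs at half the double-precision rate.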
That all said, the GP100 is designed almost as a solution for those who require both a high-end Quadro and a high-end Tesla, where market-leading compute isn’t just needed, but also huge graphics performance.
The P6000 still does have one trick up its sleeve, though, and that’s 256 CUDA cores over the GP100. That means that for typical Quadro workloads, the P6000 is going to be faster overall. It’s when compute becomes an important requirement that the GP100 should be opted for instead.
The table below helps illustrate the improvements NVIDIA’s made to its top-end Quadro over the past couple of generations. Both the K6000 and M6000 included 12GB of VRAM at launch, although the second-gen M6000 bumped that to 24GB, preemptively matching the P6000. Both single- and double-precision performance have seen significant increases with each new generation, and the same applies to the chip’s complexity.
|NVIDIA Quadro Generational Improvements|
|Process||TDP||Single-Precision||Double-Precision||Memory||Transistors|
|Quadro P6000||16nm||250W||12 TFLOPS||375 GFLOPS||24GB||12 Billion|
|Quadro M6000||28nm||250W||7 TFLOPS||190 GFLOPS||12GB||8 Billion|
|Quadro K6000||28nm||225W||5.2 TFLOPS||173 GFLOPS||12GB||7.1 Billion|
Like the Quadro M6000, the P6000 includes 4x DisplayPort connectors in addition to a single DVI-D connector. A single card can support: 8K @ 30Hz, 5K @ 60Hz, and 4K @ 60Hz. I am not sure if multiple 8K monitors can be used off of a single card, but NVIDIA does give explicit support for 5K and 4K x 4.
PNY’s Quadro P6000 includes 3x DP-to-DVI adapters, a stereo extension card, and in case your power supply doesn’t include an 8-pin connector, a dual 6-pin to 8-pin adapter.
Alongside the Quadro P6000 is an update to another piece of NVIDIA gear: Quadro Sync. With Quadro Sync II, users can combine the efforts of up to four GPUs to make certain that the frames outputted to their displays are in perfect sync. In the vast majority of usage cases where multiple displays (or even windows) are used, an absolute perfect sync might not matter, but there are other use cases – like broadcast – where it’s imperative.
The name “G-SYNC” originally belonged to Quadro Sync, before NVIDIA repurposed it for the gaming technology in its GeForce line. Whereas on the gaming side, monitors with G-SYNC technology baked-in are required (along with an NVIDIA graphics card), Quadro Sync II can synchronize frames regardless of the monitor model. The card calls the shots; not the monitors. Tying further into the broadcast example, the Sync II card can also be used to generate a house sync, saving you money if you don’t already own a sync generator (but need one).
Before moving into performance, there are a couple of other quick things to mention. The memory solution on the P6000, and also the P5000, is super-fast GDDR5X, much like it is with NVIDIA’s top-end Pascal gaming cards. On these Quadros, though, users are able to enable ECC mode if it’s needed (or simply desired).
While it hasn’t been covered up to this point, the VR push on the latest Quadros is in overdrive, with NVIDIA trying to prove that VR will be huge in the enterprise space – something I agree with. Over the past year, I’ve experienced a handful of VR demos, some revolving around Iray, and after spending just a few moments with each, it’s not hard to understand what kind of impact VR can have for product or video creation, or even architecture, for that matter. With NVIDIA’s annual GPU Technology Conference set to take place this May, I’m sure we’ll be finding many cool examples of this there.
On the following pages, we’ll be putting NVIDIA’s latest top-end Quadro through a gauntlet of real-world and synthetic tests, utilizing apps from Autodesk, Adobe, SPEC, SiSoftware, and a handful of others (including light gaming tests for good measure).
All tests are run at least twice to produce an accurate result, and if for some reason an odd result crops up, we do a third run. In the case of this particular review, no tests had to go that route, as most of the benchmarks are very good at delivering similar results with each repeated run.
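The repeat-run policy described above can be sketched in a few lines. This is only an illustration, not the harness actually used for this review: the 3% tolerance for what counts as an “odd” result, the choice of mean versus median, and the `measure` and `replay` helpers are all assumptions.

```python
from statistics import mean, median
from typing import Callable, Iterable, List


def measure(run_once: Callable[[], float], tolerance: float = 0.03) -> float:
    """Run a test twice; do a third run only if the first two disagree.

    `tolerance` is a hypothetical threshold (3%) for an "odd" result.
    """
    results: List[float] = [run_once(), run_once()]
    baseline = max(abs(r) for r in results)
    spread = abs(results[0] - results[1]) / baseline if baseline else 0.0
    if spread <= tolerance:
        return mean(results)
    # The two runs disagree: take a third and report the middle value.
    results.append(run_once())
    return median(results)


def replay(values: Iterable[float]) -> Callable[[], float]:
    """Stand-in for a real benchmark: returns canned scores one at a time."""
    it = iter(values)
    return lambda: next(it)


stable = measure(replay([100.0, 101.0]))
noisy = measure(replay([100.0, 80.0, 99.0]))
print(stable, noisy)
```

Using the median after a third run means a single outlier cannot drag the reported score up or down.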
Our Windows 7 Ultimate x64 test OS has a couple of key Windows services disabled (Search, Defender, Firewall, and Update), as well as Aero. During all testing, the display is kept at 4K resolution, with two exceptions: SPECapc Maya 2012 and SPECviewperf are run at 1080p. Further, Vsync, G-SYNC, and FreeSync are disabled.
Our test system is as follows:
|Techgage Workstation Test System|
|Processor||Intel Core i7-5960X (8-core; 3GHz)|
|Memory||Corsair Vengeance 32GB (8x4GB; DDR3-2133 11-12-11)|
|Graphics||NVIDIA GeForce GTX TITAN X 12GB (GeForce 353.30)|
NVIDIA Quadro P6000 24GB (Quadro 376.62)
NVIDIA Quadro M6000 12GB (Quadro 352.86)
NVIDIA Quadro M2000 4GB (Quadro 362.13)
NVIDIA Quadro K5200 8GB (Quadro 353.30)
NVIDIA Quadro K5000 4GB (Quadro 353.30)
AMD Radeon Pro WX 5100 8GB (16.12.1)
AMD Radeon Pro WX 4100 4GB (16.12.1)
AMD FirePro W4300 4GB (FirePro 15.201)
|Storage||Kingston HyperX 3K 480GB SSD|
|Power Supply||Cooler Master Silent Pro Hybrid 1300W|
|Chassis||Cooler Master Storm Trooper|
|Cooling||Thermaltake WATER3.0 Extreme Liquid|
|Displays||Acer XB280HK 28″ 4K G-SYNC Monitor|
|Et cetera||Windows 7 Professional 64-bit|
With that all covered, it’s time to jump right into the test results.
Our 3ds Max testing takes advantage of the suite’s latest version, 2017, and with it, we render three complex scenes: the interior of a room and an Audi automobile, both using Iray, and a second room interior, using Iray+.
Because 3ds Max 2017 doesn’t support NVIDIA’s Pascal architecture out-of-the-box, an official Autodesk plugin had to be installed, which conveniently came out just at the start of this month. If you’re using the latest version of 3ds Max and want Pascal support, hit up the Product Updates section in your Autodesk account dashboard. For Iray+, testing was done using the latest version of Lightwork’s plugin (1.30).
Despite having a huge performance advantage over the Quadro M6000, the P6000 doesn’t decimate that card’s performance quite like I expected. From a pure throughput perspective, the P6000 is about 71% faster than the M6000 (something later benchmarks will agree with), but it proves only about 35% quicker here.
Given what I’ve seen from Iray scaling in the past, I’m simply led to believe that the plugins are not taking as much advantage of Pascal as they could. I’ll be revisiting Iray performance in a month or two, as I’ll be overhauling our test suite with updated tests (and test OS).
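For reference, the 71% figure follows directly from the two cards’ single-precision ratings (12 TFLOPS versus 7 TFLOPS). The short sketch below works that out and compares it with the roughly 35% gain seen in Iray; the helper name is illustrative, and 35% is the approximate figure quoted above rather than a precise measurement.

```python
# Peak single-precision ratings quoted in the article, in TFLOPS.
P6000_TFLOPS = 12.0
M6000_TFLOPS = 7.0


def percent_faster(new: float, old: float) -> float:
    """Return how much faster `new` is than `old`, as a percentage."""
    return (new / old - 1.0) * 100.0


spec_gain = percent_faster(P6000_TFLOPS, M6000_TFLOPS)
observed_gain = 35.0  # approximate Iray gain reported in this review
realized = observed_gain / spec_gain

print(f"On-paper gain: {spec_gain:.0f}%")
print(f"Share of on-paper gain realized in Iray: {realized:.0%}")
```

In other words, Iray delivered about half of the improvement the specs would suggest.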
To compare our collection of workstation GPUs across other renderers, we rely on Cinebench and LuxMark. The latter is of particular interest as it renders using OpenCL. It also happens to be so good at what it does that we opt to use it for the sake of generating peak temperature and power information.
In the 3ds Max test at the top of the page, I mentioned that Pascal might not have perfect support for Iray right now, which is a bit of a theme with a brand-new architecture like this. As part of the Pascal launch, a new version of CUDA was also released, and the supporting software has yet to take full advantage of it. With that in mind, you can now probably take one guess why I didn’t include OctaneBench here!
In talking to OTOY, I was told that a new OctaneBench is en route, so once it drops, I’ll integrate it back into our testing.
Cinebench is one benchmark that’s growing long in the tooth, as both the M6000 and P6000 scored the same (likely a CPU bottleneck at this point), despite our gaming tests (coming up) showing great scaling between the two. In LuxMark, a test that exercises GPUs to their fullest, the P6000 dominates, becoming the first graphics card in our arsenal to ever breach 20,000 on the main LuxBall render.
The remaining LuxMark results highlight the fact that the P6000 can deliver even better results when the going gets tough. In the Hotel Lobby render, the most gruelling of the bunch, the P6000 actually manages to deliver a score more than twice that of the M6000. This is a great example of how much faster Pascal-based Quadros can be when they’re properly utilized.
To test the accelerated encoding perks of different GPUs, we make use of the de facto video editing tool Adobe Premiere Pro. In the past, we would have included After Effects results, thanks to its ability to tap into CUDA for accelerated rendering of ray traced elements, but recent versions of that app have failed to update support for Maxwell. Instead, Adobe is preferring to target the renderer bundled with PP, Cinema 4D “lite”.
It’s with this testing that I found the P6000 a little “too” powerful, as it simply didn’t exhibit real gains over the M6000, which is silly for a product almost twice as fast. NVIDIA was kind enough to ship over some updated workloads that help a bit with that, including dual RED encodes, and also a megamix of sorts.
With straightforward video encodes, such as with the RED projects here, the gains are small (8~13%). But when the project grows larger and effects are tossed in, the deltas can increase quite a bit, with the P6000 proving about 40% faster than the M6000.
For CAD testing, we’re taking advantage of the excellent Cadalyst benchmark.
If there’s one target application NVIDIA wouldn’t point its Quadro P6000 towards, it’s AutoCAD, and the results above can help explain why. Here, the M6000 and P6000 are considered equals, even though the 3D performance of the P6000 in most other tests would say otherwise.
Most CAD use is not going to exhibit huge gains on the 3D front. Even the lowbie Quadro M2000 performs admirably here. Higher-end CAD solutions should show much greater performance enhancements. And speaking of, we have SPECviewperf on the following page to help us see proof of that.
When it comes to benchmarking hardware for serious use cases, there are no better people to turn to than those at SPEC. I like to call them the “masters of benchmarking”, as each one of their tools is meticulously crafted by professionals to deliver results as relevant and accurate as possible – a goal shared by us at Techgage.
For testing the performance of workstation cards, we take advantage of two SPECapc benchmarks – 3ds Max 2015 and Maya 2012 – as well as two that don’t require a standalone application: SPECviewperf and SPECwpc. While the Maya benchmark might be growing a little long in the tooth at this point, it still scales well with current GPUs.
|1080p 0xAA (CPU)||5.88||5.89||5.88||5.90|
|1080p 4xAA (CPU)||5.88||5.88||5.87||5.88|
|1080p 8xAA (CPU)||5.88||5.84||5.87||5.88|
|1080p 0xAA (Large Model)||4.61||4.52||4.59||4.30|
|1080p 4xAA (Large Model)||4.58||4.53||3.64||2.82|
|1080p 8xAA (Large Model)||4.59||4.48||3.29||2.75|
|4K 0xAA (CPU)||5.85||5.87||5.88||5.83|
|4K 4xAA (CPU)||5.85||5.87||5.88||5.81|
|4K 8xAA (CPU)||5.85||5.85||–||5.69|
|4K 0xAA (Large Model)||4.61||4.50||4.59||3.88|
|4K 4xAA (Large Model)||4.56||4.39||2.61||2.24|
|4K 8xAA (Large Model)||4.54||3.95||–||2.12|
SPECapc 3ds Max 2015 doesn’t take advantage of NVIDIA’s Iray, so the test gives us a great second look at general performance in the application, from viewport performance to rendering performance. Overall, the P6000 proves dominant, but as we’ve seen a few times already, the P6000 is almost too powerful for certain workloads.
I admit that these results surprised me a bit. I’ve mentioned a couple of times already that some workloads are simply not strong enough to take full advantage of the P6000, but here we have a five-year-old test that manages to show further improvement on NVIDIA’s latest and greatest. It’s not a major gain, but neither was the gain between the Kepler-based K5200 and Maxwell-based M6000.
Whereas both SPECapc benchmarks used above stress a variety of different components of their respective tools, SPECviewperf’s target is singular: viewport performance. One reason I like this test is because it utilizes software we couldn’t otherwise test with (due to the lack of a license); namely CATIA, SolidWorks, and Siemens NX.
Here is where we begin to see NVIDIA’s Quadro P6000 show the rest of our lineup just who’s boss. In most of the tests, there are considerable gains seen with the P6000. Whereas an earlier test showed a 30% gain at best, that’s the starting point of the gains here. In particular, the Medical, Energy, and Showcase tests show huge jumps of about 70~75%. The high-end CAD suites CATIA, SolidWorks, and SNX show gains of about 35~40%.
The “w” in SPECwpc stands for “workstation”, and it acts as a bit of an “overall” testing suite. In some ways, it takes the goals of SPEC’s other tests and combines them into a single benchmark. Thus, the results are split into six categories, and the result of one might matter more to some people than others.
From the bottom to the top, SPECwpc doesn’t show huge deltas between one card and the next, so the P6000 has a hard time strutting its stuff here. Nonetheless, the card still does give us notable gains in most tests.
On the previous page, I mentioned that SPEC is an organization that crafts some of the best benchmarks going, and in a similar vein, I can compliment SiSoftware. This is a company that thrives on offering support for certain technologies before those technologies are even available to the consumer. In that regard, its Sandra benchmark might seem a little bleeding-edge, but at the same time, its tests are established, refined, and really accurate across multiple runs.
For the purposes of a workstation graphics card review, we focus on four main tests: general GPU processing, cryptography, financial analysis, and scientific analysis. Some of these tests produce complex results, so those will be displayed in a table rather than a graph.
|Sandra 2015 – GPU Processing|
|CUDA: Single-Float||17.38 GPix/s||9.13 GPix/s||4.16 GPix/s||2.48 GPix/s|
|OpenCL: Single-Float||15.4 GPix/s||8.10 GPix/s||3.37 GPix/s||2.19 GPix/s|
|CUDA: Half-Float||17.26 GPix/s||9.05 GPix/s||4.13 GPix/s||2.47 GPix/s|
|OpenCL: Half-Float||15.45 GPix/s||8.2 GPix/s||3.39 GPix/s||2.19 GPix/s|
|CUDA: Double-Float||646.59 MPix/s||344.16 MPix/s||272.68 MPix/s||92.89 MPix/s|
|OpenCL: Double-Float||646.76 MPix/s||347.83 MPix/s||268.22 MPix/s||185.1 MPix/s|
|CUDA: Quad-Float||27.24 MPix/s||12.69 MPix/s||11.54 MPix/s||4 MPix/s|
|OpenCL: Quad-Float||25.19 MPix/s||13.59 MPix/s||19.62 MPix/s||8.37 MPix/s|
|Results in pixels-per-second. 1 GPix = 1,000 MPix; 1 MPix = 1,000 kPix.|
In some of the tests on the previous pages, the P6000 has struggled to shine, but Sandra is having none of that. In raw throughput, the P6000 is roughly double the performance of the M6000. In some cases, it’s 88% faster, and with the quad-float CUDA test, the P6000 actually manages to be more than twice as fast (114%).
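To show how those percentages fall out of the table, here is a small sketch that computes the gain for a few of the CUDA rows. It assumes the first two result columns are the P6000 and M6000, in that order, which is how the text reads but is not labeled in the table itself; values are converted to MPix/s so every row shares one unit.

```python
# (P6000, M6000) results from the Sandra GPU Processing table, in MPix/s.
# Assumption: the first two result columns are the P6000 and the M6000.
results = {
    "CUDA single-float": (17_380.0, 9_130.0),
    "CUDA half-float": (17_260.0, 9_050.0),
    "CUDA double-float": (646.59, 344.16),
    "CUDA quad-float": (27.24, 12.69),
}

# Percentage gain of the first card over the second, per test.
gains = {
    name: (p6000 / m6000 - 1.0) * 100.0
    for name, (p6000, m6000) in results.items()
}

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}% faster")
```

Anything above 100% means the P6000 more than doubled the M6000’s result, which only the quad-float row manages here.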
The awesome results keep coming for the Quadro P6000. Overall, it’s safe to say that the P6000 is twice as fast where encryption is concerned. I’m not sure of the reason for the specific gain, but CUDA hashing sees dramatic improvement on Pascal. Further testing showed that NVIDIA’s own driver improvements had some hand in these increases, but the architectural boost played the largest role.
|Sandra 2015 – Financial Analysis (Single Precision)|
|CUDA: Black-Scholes||11.62 G/s||8.14 G/s||3.44 G/s||2.12 G/s|
|OpenCL: Black-Scholes||11.54 G/s||8.10 G/s||4.49 G/s||1.58 G/s|
|CUDA: Binomial||3 M/s||1.58 M/s||676.64 k/s||445.48 k/s|
|OpenCL: Binomial||3.15 M/s||1.60 M/s||645.42 k/s||375.33 k/s|
|CUDA: Monte Carlo||6.49 M/s||3 M/s||1.20 M/s||883.6 k/s|
|OpenCL: Monte Carlo||6.42 M/s||2.81 M/s||1.18 M/s||756.45 k/s|
|Results in options-per-second. 1 GOPS = 1,000 MOPS; 1 MOPS = 1,000 kOPS.|
|Sandra 2015 – Financial Analysis (Double Precision)|
|CUDA: Black-Scholes||1.33 G/s||700 M/s||541.32 M/s||193.91 M/s|
|OpenCL: Black-Scholes||1.3 G/s||691.82 M/s||533.91 M/s||235.91 M/s|
|CUDA: Binomial||131.83 k/s||70.32 k/s||52.55 k/s||19 k/s|
|OpenCL: Binomial||132 k/s||71.45 k/s||52.93 k/s||15.79 k/s|
|CUDA: Monte Carlo||272.54 k/s||147.71 k/s||112.53 k/s||40 k/s|
|OpenCL: Monte Carlo||272.62 k/s||147.79 k/s||112.43 k/s||35.86 k/s|
|Results in options-per-second. 1 GOPS = 1,000 MOPS; 1 MOPS = 1,000 kOPS.|
The P6000 continues to impress here, with varying degrees of improvement being seen from test to test, but with all of the improvements being substantial. The OpenCL Monte Carlo test, for example, exhibited a 128% performance boost on the P6000, versus the M6000 (which is still a seriously powerful GPU!).
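Because the financial tables mix G/s, M/s, and k/s within a single row, comparing cards is easier once every cell is normalized to plain options-per-second. The sketch below does that with a small parser and reproduces the single-precision OpenCL Monte Carlo gain; the `to_per_second` helper is hypothetical, and the P6000/M6000 column order is again an assumption based on the text.

```python
import re

# Decimal prefixes as defined in the table footnotes (1 G = 1,000 M = 1,000,000 k).
PREFIX = {"k": 1e3, "M": 1e6, "G": 1e9}


def to_per_second(cell: str) -> float:
    """Convert a table cell such as '883.6 k/s' into options-per-second."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([kMG])/s\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    value, prefix = match.groups()
    return float(value) * PREFIX[prefix]


# Single-precision OpenCL Monte Carlo row; the first two columns are
# assumed to be the P6000 and the M6000.
p6000 = to_per_second("6.42 M/s")
m6000 = to_per_second("2.81 M/s")
gain = (p6000 / m6000 - 1.0) * 100.0

print(f"OpenCL Monte Carlo: {gain:.0f}% faster")
```

The same helper handles rows that change units partway across, such as the CUDA Monte Carlo row that goes from M/s to k/s.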
|Sandra 2015 – Scientific Analysis (Single Precision)|
|CUDA: GEMM||5.53 TFLOPS||3.2 TFLOPS||1.1 TFLOPS||951.73 GFLOPS|
|OpenCL: GEMM||6.81 TFLOPS||3.6 TFLOPS||1 TFLOPS||983.37 GFLOPS|
|CUDA: FFT||261.88 GFLOPS||204.3 GFLOPS||80.8 GFLOPS||54.77 GFLOPS|
|OpenCL: FFT||268.44 GFLOPS||220.7 GFLOPS||97.0 GFLOPS||65.24 GFLOPS|
|CUDA: NBDY||5.78 TFLOPS||2.9 TFLOPS||1 TFLOPS||915.53 GFLOPS|
|OpenCL: NBDY||5 TFLOPS||3 TFLOPS||1 TFLOPS||601.82 GFLOPS|
|Results in floating-point operations-per-second. GEMM = General Matrix Multiply; FFT = Fast Fourier Transform; NBDY = N-Body Simulation.|
|Sandra 2015 – Scientific Analysis (Double Precision)|
|CUDA: GEMM||325 GFLOPS||175.1 GFLOPS||147.8 GFLOPS||48.11 GFLOPS|
|OpenCL: GEMM||325.11 GFLOPS||174.6 GFLOPS||148.0 GFLOPS||49.64 GFLOPS|
|CUDA: FFT||111.38 GFLOPS||89.1 GFLOPS||48.7 GFLOPS||28 GFLOPS|
|OpenCL: FFT||131.79 GFLOPS||120.3 GFLOPS||58.6 GFLOPS||36.16 GFLOPS|
|CUDA: NBDY||189.8 GFLOPS||103.0 GFLOPS||112.1 GFLOPS||38.17 GFLOPS|
|OpenCL: NBDY||190.25 GFLOPS||103.6 GFLOPS||111.9 GFLOPS||51.18 GFLOPS|
|Results in floating-point operations-per-second. GEMM = General Matrix Multiply; FFT = Fast Fourier Transform; NBDY = N-Body Simulation.|
To help wrap up our Sandra results, we have more proof that the Quadro P6000 is a really, really fast card. In the worst case, gains of 25% can be seen; in the best case, 99% (CUDA N-Body).
Gaming is generally not a big focus for professional GPU lines, but the fact of the matter is, they can game.
That especially applies to the top-tier cards on the market, as they all perform similarly to the top-tier gaming cards from the same vendor of the same generation.
So what’s the caveat with gaming on workstation cards? A lack of game-specific optimizations.
On the GeForce and Radeon (non-Pro) side, both companies constantly roll out driver updates that improve gaming performance in general or for one specific title. Quadro and Radeon Pro drivers don’t get that kind of game-specific attention.
To get a quick gauge on the performance of our workstation GPU collection in gaming, we use Futuremark’s 3DMark and Unigine’s Heaven.
According to these gaming benchmarks, the Quadro P6000 is about 60~64% faster than the M6000. I would not be surprised if select scenarios would exhibit even greater gains, and this is something I plan to evaluate more in a couple of months as we look to overhaul our test suite.
To test workstation graphics cards for both their power consumption and temperature at load, we utilize a couple of different tools. On the hardware side, we use a trusty Kill-a-Watt power monitor which our GPU test machine plugs into directly. For software, we use LuxMark to stress the card, and GPU-Z to record the temperatures.
To test, the area around the chassis is checked with a temperature gun, with the average temp recorded. Once that’s established, the PC is turned on and left to sit idle for five minutes. At this point, we open GPU-Z along with LuxMark. After its initial (automatic) render is complete, we kick off a 15-minute stress-test. Following this, we monitor the Kill-a-Watt for a minute to establish peak load wattage.
These results highlight some improvements I love to see from generation to generation. The Quadro P6000 is just about twice as fast as the M6000, yet it ran 4°C cooler in this stress test. And, if that wasn’t enough, it also drew 32W less at full load. You’ve gotta love progress.
When we’re handed the fastest product ever released in a certain product category, drumming up a conclusion isn’t too difficult. Nothing changes here. As it stands today, the Quadro P6000 is the fastest GPU NVIDIA’s ever produced; a 12 TFLOP monster in a single-GPU form-factor. And despite its huge performance, the P6000 draws less power than last-gen’s M6000, and it even runs a bit cooler, to boot.
Leading up to this review, I put a considerable number of hours into benchmarking the P6000, even more than I spent on the M6000. When I tackled the M6000, it came at a time when all of the software I tested with supported NVIDIA’s Maxwell architecture. I didn’t have the same luxury here with Pascal, as OctaneBench’s support for the architecture is coming soon, and the Iray performance I saw didn’t scale as well as I expected it to (even though considerable gains were seen).
As with all workstation graphics cards, understanding your needs and wants is of utmost importance. In some cases, the P6000 isn’t much (or any) faster than the last-gen M6000, despite its performance being spec’d 71% better. In AutoCAD, we saw that there is a definite point of diminishing returns. In 3D tests, the P6000 just about decimated the M6000.
Ultimately, the Quadro P6000 is cutting-edge hardware, and it requires software to catch up, either by way of the plugins or the entire suite. 3ds Max didn’t officially support Pascal until 10 or so days ago, and as mentioned above, OctaneBench (which is built around CUDA) has a new version coming.
Conveniently, I’ll be rebenchmarking our fleet of workstation graphics cards in a month or two, as a dedicated PC is going to be built around their testing (up to this point I’ve used our gaming GPU testbed as a dual-purpose machine), which will also bring us up to speed on the OS front (Windows 7 grew long in the tooth ages ago). At that time, performance gains on Pascal are likely to be notable.
While the gains seen in real-world tests varied, SiSoftware’s Sandra helped bring some sanity to our results madness. With it, we saw dramatic gains in performance in single-, double-, half-, and quad-float, and that carried on over to the more specific financial, scientific, and crypto tests. In some cases, the P6000 performed better than 2x over the M6000, although a year-and-a-half worth of driver improvements helped out a bit with that as well.
At the end of the day, the Quadro P6000 is the fastest Quadro ever created, and in fact the fastest GPU the company’s ever created. NVIDIA has provided the hardware; you just need to provide the software and workloads to take full advantage of this beast.
Copyright © 2005-2017 Techgage Networks Inc. - All Rights Reserved.