NVIDIA Announces Tesla K40 GPU Accelerator, 20~40% Faster Than K20X

Posted on November 18, 2013 10:00 AM by Rob Williams
We’re in the midst of supercomputing season, and that means one thing: lots of announcements. As usual, NVIDIA is responsible for many of them, some of which we’ll cover a bit later. To start off, I want to talk about what I’m sure most of you will want to hear about first: the new GPU accelerator, called Tesla K40.

You might recall that NVIDIA’s last big Tesla launch happened last November, when three new GK1x0 models were released. This time around, only a single model is being unveiled, so it’s clear that the K10 and K20, at least, are not being replaced, but the K20X could be (unless pricing changes to accommodate all four cards).

NVIDIA Tesla K40

As the table below shows, the K40 is based on GK110 (the same GPU as the current high-end GTX 700 series), but it’s tuned for compute workloads, many of which I discussed in the above-linked article. Like AMD’s FirePro S10000 update announced last week (which flew under our radar until NVIDIA’s announcement), the K40 is equipped with 12GB of fast GDDR5, and since AMD hasn’t given us specifics, we’d imagine the 288 GB/s of memory bandwidth will put NVIDIA’s latest in the position of market leader.

|                          | AMD FirePro S10000 | NVIDIA Tesla K20         | NVIDIA Tesla K20X | NVIDIA Tesla K40 |
| ------------------------ | ------------------ | ------------------------ | ----------------- | ---------------- |
| Peak Double Precision FP | 1.48 TFLOPS        | 1.17 TFLOPS              | 1.31 TFLOPS       | 1.43 TFLOPS      |
| Peak Single Precision FP | 5.91 TFLOPS        | 3.52 TFLOPS              | 3.95 TFLOPS       | 4.29 TFLOPS      |
| Number of GPUs           | 2 x Tahiti LE      | 1 x GK110                | 1 x GK110         | 1 x GK110        |
| Number of Cores          | 2 x 1792           | 2496                     | 2688              | 2880             |
| Clock Speed              | 825 MHz            | 705 MHz                  | 732 MHz           | 745 MHz *        |
| Memory Size Per Board    | 6 ~ 12 GB          | 5 GB                     | 6 GB              | 12 GB            |
| Memory Bandwidth         | 2 x 240 GB/s       | 208 GB/s                 | 250 GB/s          | 288 GB/s         |
| Power Consumption        | 250W               | 225W                     | 375W              | 235W             |
| System                   | Servers            | Servers and Workstations | Servers           | Servers          |
| Pricing                  | ~$3,000 (6GB)      | ~$3,000                  | ~$3,800           | ~$???            |
* Turbo boostable to 810MHz or 875MHz.

Worth noting is that, like NVIDIA’s recent desktop GPUs, the Tesla K40 has a turbo feature that can boost its clock from 745MHz up to 810MHz or 875MHz, depending on user preference. Taking advantage of this boost will provide a 10~25% improvement depending on the scenario, and as a whole, the K40 will be about 20~40% faster than the ultra-high-end K20X announced last year.
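For those who like to see where peak figures come from, here is a rough Python sketch that derives the K40’s theoretical throughput at its base and boost clocks. It assumes the usual back-of-the-envelope formula (cores x 2 floating-point operations per cycle x clock) and a double-precision rate of one third of single precision on GK110; neither assumption is spelled out in NVIDIA’s announcement, and the function names are purely illustrative. These are clock-for-clock numbers only, so they won’t line up exactly with the application-level gains quoted above.

```python
# Rough peak-throughput estimate for the Tesla K40 at each clock setting.
# Assumptions (not from NVIDIA's announcement): one fused multiply-add per
# core per cycle (2 FLOPs), and FP64 running at 1/3 the FP32 rate on GK110.

CORES = 2880              # from the spec table
BASE_MHZ = 745            # base clock
BOOST_MHZ = (810, 875)    # the two selectable boost clocks


def peak_tflops(cores, clock_mhz, flops_per_cycle=2):
    """Theoretical peak in TFLOPS: cores x FLOPs/cycle x clock."""
    return cores * flops_per_cycle * clock_mhz / 1e6


def report():
    """Return (clock MHz, SP TFLOPS, DP TFLOPS, clock uplift %) per setting."""
    rows = []
    for clock in (BASE_MHZ, *BOOST_MHZ):
        sp = peak_tflops(CORES, clock)
        dp = sp / 3  # assumed 1:3 FP64-to-FP32 ratio
        uplift = (clock / BASE_MHZ - 1) * 100
        rows.append((clock, round(sp, 2), round(dp, 2), round(uplift, 1)))
    return rows


for clock, sp, dp, uplift in report():
    print(f"{clock} MHz: {sp} TFLOPS SP, {dp} TFLOPS DP, +{uplift}% clock")
```

At the 745MHz base clock this works out to the 4.29 and 1.43 TFLOPS listed in the table, which suggests the quoted peak figures are for the base clock rather than either boost state.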

With NVIDIA’s recent announcement of CUDA 6’s unified memory, and other features coming forth such as the GCC compiler adding support for OpenACC, things are getting quite exciting on the GPU acceleration front. My only problem? Not having a use for it personally. It’s one of those things I want to dive into, but don’t have an explicit need for! Someone hand me some big data to crunch, please.

