NVIDIA Announces Tesla K40 GPU Accelerator, 20~40% Faster Than K20X

Posted on November 18, 2013 10:00 AM by Rob Williams

We’re in the midst of supercomputing season at the moment, and that means one thing: lots of announcements. As usual, NVIDIA is behind many of those, some of which we’re going to cover a bit later. To start off, I want to talk about what I’m sure most of you will want to hear about first: the new GPU accelerator, called Tesla K40.

You might recall that NVIDIA’s last big Tesla launch happened last November, when three new GK1x0 models were released. This time around, only a single model is being unveiled, so it’s clear that the K10 and K20, at least, are not being replaced, but the K20X could be (unless pricing changes to accommodate all four cards).

NVIDIA Tesla K40

As the table below shows, the K40 is based on GK110 (the same as the current high-end GTX 700 series), but it’s tuned for computational needs, many of which I discussed in the article linked above. Like AMD’s FirePro S10000 update announced last week (which flew under our radar until NVIDIA’s announcement), the K40 is equipped with 12GB of fast GDDR5, and since AMD hasn’t given us specifics, we’d imagine the 288 GB/s of memory bandwidth will put NVIDIA’s latest in the position of market leader.

| | AMD FirePro S10000 | NVIDIA Tesla K20 | NVIDIA Tesla K20X | NVIDIA Tesla K40 |
|---|---|---|---|---|
| Peak Double Precision FP | 1.48 TFLOPS | 1.17 TFLOPS | 1.31 TFLOPS | 1.43 TFLOPS |
| Peak Single Precision FP | 5.91 TFLOPS | 3.52 TFLOPS | 3.95 TFLOPS | 4.29 TFLOPS |
| Number of GPUs | 2 x Tahiti LE | 1 x GK110 | 1 x GK110 | 1 x GK110 |
| Number of Cores | 2 x 1792 | 2496 | 2688 | 2880 |
| Clock Speed | 825 MHz | 705 MHz | 732 MHz | 745 MHz * |
| Memory Size Per Board | 6 ~ 12 GB | 5 GB | 6 GB | 12 GB |
| Memory Bandwidth | 2 x 240 GB/s | 208 GB/s | 250 GB/s | 288 GB/s |
| Power Consumption | 250W | 225W | 375W | 235W |
| System | Servers | Servers and Workstations | Servers | Servers |
| Pricing | ~$3,000 (6GB) | ~$3,000 | ~$3,800 | ~$??? |

* Turbo boostable to 810MHz or 875MHz.
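
For anyone curious where the peak FLOPS figures in the table come from, here is a rough sketch in Python that derives them from core count and clock speed. It rests on two assumptions that are not stated in this article: each CUDA core retires one fused multiply-add (counted as 2 FLOPs) per cycle, and double precision on these GK110 boards runs at one third of the single-precision rate. The function name `peak_tflops` is purely illustrative.

```python
# Theoretical peak throughput from core count and base clock.
# ASSUMPTIONS (not from the article): 2 FLOPs per core per cycle (one FMA),
# and a 1:3 double-to-single precision ratio on GK110-based Tesla boards.

FLOPS_PER_CORE_PER_CYCLE = 2
DP_TO_SP_RATIO = 1 / 3


def peak_tflops(cores, clock_mhz):
    """Return (single, double) precision peak throughput in TFLOPS."""
    single = cores * FLOPS_PER_CORE_PER_CYCLE * clock_mhz * 1e6 / 1e12
    return single, single * DP_TO_SP_RATIO


# Core counts and base clocks as listed in the table.
cards = {
    "Tesla K20": (2496, 705),
    "Tesla K20X": (2688, 732),
    "Tesla K40": (2880, 745),
}

for name, (cores, mhz) in cards.items():
    single, double = peak_tflops(cores, mhz)
    print(f"{name}: {single:.2f} TFLOPS single, {double:.2f} TFLOPS double")
```

Under those assumptions, the K40 works out to roughly 4.29 TFLOPS single and 1.43 TFLOPS double, in line with the table. The K20X lands at about 3.94 TFLOPS single, a hair under the listed 3.95, so treat the formula as an approximation rather than NVIDIA's exact method.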

Worth noting is that, like NVIDIA’s recent desktop GPUs, the Tesla K40 has a turbo feature that can boost its clock from 745MHz to either 810MHz or 875MHz, depending on user preference. Taking advantage of this boost will provide a 10~25% improvement depending on the scenario, and as a whole, the K40 will be about 20~40% faster than the ultra-high-end K20X announced last year.
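
To put the two boost clocks in perspective, the following Python sketch works out the raw clock uplift over the 745MHz base and the theoretical single-precision peak at each setting. It assumes 2 FLOPs per core per cycle (one fused multiply-add), which this article does not state, and the helper name `clock_uplift_percent` is made up for illustration.

```python
# Raw clock uplift and theoretical peak at the K40's boost settings.
# ASSUMPTION (not from the article): 2 FLOPs per core per cycle (one FMA).

K40_CORES = 2880
BASE_MHZ = 745
BOOST_MHZ = (810, 875)


def clock_uplift_percent(base_mhz, boost_mhz):
    """Percentage increase of the boost clock over the base clock."""
    return (boost_mhz / base_mhz - 1) * 100


for boost in BOOST_MHZ:
    uplift = clock_uplift_percent(BASE_MHZ, boost)
    peak_sp = K40_CORES * 2 * boost * 1e6 / 1e12  # TFLOPS, single precision
    print(f"{boost} MHz: +{uplift:.1f}% clock, ~{peak_sp:.2f} TFLOPS single")
```

On clock speed alone, the two settings come to about 8.7% and 17.4% over base. The 10~25% range quoted above refers to gains in different scenarios, so it should not be expected to match this simple ratio.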

With NVIDIA’s recent announcement of CUDA 6’s unified memory, and other features coming forth such as the GCC compiler adding support for OpenACC, things are getting quite exciting on the GPU acceleration front. My only problem? Not having use for it personally. It’s one of those things I want to dive into, but don’t have an explicit need! Someone hand me some big data to crunch, please.
