NVIDIA Announces Tesla K40 GPU Accelerator, 20~40% Faster Than K20X
Posted on November 18, 2013 10:00 AM by Rob Williams
We’re in the midst of supercomputing season at the moment, and that means one thing: lots of announcements. As usual, NVIDIA is responsible for many of those, some of which we’re going to cover a bit later. To start off, I want to talk about what I’m sure most of you will want to hear about first: the new GPU accelerator, called Tesla K40.
You might recall that NVIDIA’s last big Tesla launch happened last November, when three new GK1x0 models were released. This time around, only a single model is being unveiled, so it’s clear that the K10 and K20, at least, are not being replaced, but the K20X could be (unless pricing changes to accommodate all four cards).
As the table below shows, K40 is based on GK110 (same as the current high-end GTX 700 series), but it’s tuned for computational needs, many of which I discussed in the above-linked article. Like AMD’s FirePro S10000 update announced last week (which flew under our radar until NVIDIA’s announcement), the K40 is equipped with 12GB of fast GDDR5. Since AMD hasn’t given us specifics, we’d imagine the K40’s 288 GB/s of memory bandwidth will put NVIDIA’s latest in the position of market leader.
AMD FirePro S10000
NVIDIA Tesla K20
NVIDIA Tesla K20X
NVIDIA Tesla K40
Peak Double Precision FP
Peak Single Precision FP
Number of GPUs
2 x Tahiti LE
1 x GK110
Number of Cores
2 x 1792
745 MHz *
Memory Size Per Board
6 ~ 12 GB
2 x 240 GB/s
Servers and Workstations
* Turbo boostable to 810MHz or 875MHz.
Worth noting is that, like NVIDIA’s recent desktop GPUs, Tesla K40 has a turbo feature that can boost its clock from 745MHz up to 810MHz or 875MHz, depending on user preference. Taking advantage of this boost will provide a 10%~25% improvement depending on the scenario, and as a whole, the K40 will be about 20~40% faster than the ultra-high-end K20X announced last year.
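For a rough sense of what those boost steps mean on paper, here is a minimal Python sketch that works out the raw clock uplift from the 745MHz base to each boost level. The function name and constants are my own, purely for illustration. The result is only the clock ratio, so it is a back-of-the-envelope figure and not a performance prediction; real gains depend on the workload.

```python
# Base and boost clocks as quoted in the article (MHz).
BASE_MHZ = 745
BOOST_MHZ = (810, 875)


def uplift_percent(base_mhz, boost_mhz):
    """Return the relative clock increase of boost over base, in percent."""
    return (boost_mhz - base_mhz) / base_mhz * 100.0


if __name__ == "__main__":
    for boost in BOOST_MHZ:
        gain = uplift_percent(BASE_MHZ, boost)
        print(f"{BASE_MHZ} MHz -> {boost} MHz: +{gain:.1f}% clock")
```

Run as a script, this prints one line per boost level: roughly +8.7% for 810MHz and +17.4% for 875MHz.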
With NVIDIA’s recent announcement of unified memory in CUDA 6, and other developments such as the GCC compiler adding support for OpenACC, things are getting quite exciting on the GPU acceleration front. My only problem? Not having a use for it personally. It’s one of those things I want to dive into, but I don’t have an explicit need! Someone hand me some big data to crunch, please.