We’re in the midst of supercomputing season at the moment, and that means one thing: lots of announcements. As usual, NVIDIA is responsible for many of those, some of which we’re going to cover a bit later. To start off, I want to talk about what I’m sure most of you will want to hear about first: the new GPU accelerator, called Tesla K40.
You might recall that NVIDIA’s last big Tesla launch happened last November, when three new GK1x0 models were released. This time around, only a single model is being unveiled, so it’s clear that the K10 and K20, at least, are not being replaced, but the K20X could be (unless pricing changes to accommodate all four cards).
As the table below shows, K40 is based on GK110 (the same GPU as the current high-end GTX 700 series), but it’s tuned for computational needs, many of which I discussed in the above-linked article. Like AMD’s FirePro S10000 update announced last week (which flew under our radar until NVIDIA’s announcement), the K40 is equipped with 12GB of fast GDDR5, and since AMD hasn’t given us specifics, we’d imagine the 288 GB/s of memory bandwidth will put NVIDIA’s latest in the position of market leader.
| |AMD FirePro S10000|NVIDIA Tesla K20|NVIDIA Tesla K20X|NVIDIA Tesla K40|
|---|---|---|---|---|
|Peak Double Precision FP|1.48 TFLOPS|1.17 TFLOPS|1.31 TFLOPS|1.43 TFLOPS|
|Peak Single Precision FP|5.91 TFLOPS|3.52 TFLOPS|3.95 TFLOPS|4.29 TFLOPS|
|Number of GPUs|2 x Tahiti LE|1 x GK110|1 x GK110|1 x GK110|
|Number of Cores|2 x 1792|2496|2688|2880|
|Clock Speed|825 MHz|705 MHz|732 MHz|745 MHz *|
|Memory Size Per Board|6 ~ 12 GB|5 GB|6 GB|12 GB|
|Memory Bandwidth|2 x 240 GB/s|208 GB/s|250 GB/s|288 GB/s|
|System|Servers|Servers and Workstations|Servers|Servers|
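If you're curious where the peak figures in the table come from, they can be reproduced from the core counts and clock speeds alone. Here's a minimal Python sketch of that arithmetic. It assumes each core retires one fused multiply-add (2 FLOPs) per cycle, and that double precision runs at 1/3 the single-precision rate on GK110 and 1/4 on Tahiti. Those ratios and the `peak_tflops` helper are my own assumptions for illustration, not something stated in the announcement.

```python
# Theoretical peak = cores x clock x 2 FLOPs per cycle (one fused multiply-add).
# The FMA factor and the DP:SP ratios below are assumptions, not article data.
BOARDS = {
    # name: (total cores, clock in MHz, assumed DP:SP ratio)
    "FirePro S10000": (2 * 1792, 825, 1 / 4),
    "Tesla K20": (2496, 705, 1 / 3),
    "Tesla K20X": (2688, 732, 1 / 3),
    "Tesla K40": (2880, 745, 1 / 3),
}


def peak_tflops(cores, clock_mhz, dp_ratio):
    """Return (single, double) precision peak throughput in TFLOPS."""
    sp = cores * clock_mhz * 1e6 * 2 / 1e12
    return sp, sp * dp_ratio


for name, spec in BOARDS.items():
    sp, dp = peak_tflops(*spec)
    print(f"{name:15s} SP {sp:.2f} TFLOPS  DP {dp:.2f} TFLOPS")
```

For the K40 this works out to 4.29 TFLOPS single precision and 1.43 TFLOPS double precision, matching the table. The K20X lands a hair under its quoted 3.95 TFLOPS, which is most likely just a rounding difference in the published figure.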
Worth noting is that, like NVIDIA’s recent desktop GPUs, Tesla K40 has a turbo feature that can boost its clock from 745MHz to either 810MHz or 875MHz, depending on user preference. Taking advantage of this boost will provide an improvement of 10% to 25%, depending on the scenario, and as a whole, the K40 will be about 20% to 40% faster than the ultra-high-end K20X announced last year.
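To put the boost clocks in perspective, here's a rough Python sketch that works out what each clock step means for theoretical single-precision throughput, and how that compares with the K20X. The helper names are hypothetical, and the 2 FLOPs per core per cycle figure is my assumption. This is only arithmetic on the quoted clocks and core counts; real workloads also depend on memory bandwidth and won't necessarily scale with clock speed.

```python
# Quoted specs: K40 has 2880 cores at 745 MHz base, 810/875 MHz boost;
# K20X has 2688 cores at 732 MHz.
K40_CORES = 2880
K40_BASE_MHZ = 745
K40_BOOST_MHZ = (810, 875)
K20X_CORES, K20X_MHZ = 2688, 732


def sp_tflops(cores, clock_mhz):
    # Assumes 2 FLOPs per core per cycle (one fused multiply-add).
    return cores * clock_mhz * 2 / 1e6


def pct_gain(new, old):
    """Percentage increase of new over old."""
    return (new / old - 1) * 100


k20x = sp_tflops(K20X_CORES, K20X_MHZ)
for clock in (K40_BASE_MHZ, *K40_BOOST_MHZ):
    k40 = sp_tflops(K40_CORES, clock)
    print(f"{clock} MHz: {k40:.2f} TFLOPS, "
          f"{pct_gain(clock, K40_BASE_MHZ):.1f}% over base clock, "
          f"{pct_gain(k40, k20x):.1f}% over K20X peak")
```

On paper, the two boost states raise the clock by roughly 9% and 17% over base, so the quoted real-world gains presumably reflect more than raw clock speed alone.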
With NVIDIA’s recent announcement of unified memory in CUDA 6, and other developments such as GCC adding support for OpenACC, things are getting quite exciting on the GPU acceleration front. My only problem? Not having a use for it personally. It’s one of those things I want to dive into, but don’t have an explicit need for! Someone hand me some big data to crunch, please.