During an Intel press briefing Monday morning, Sr. Vice President and General Manager of the Digital Enterprise Group Pat Gelsinger previewed many products that will be covered in more depth at the upcoming Intel Developer Forum in Beijing. Though most of the important information was revealed during this briefing, more may be unearthed at IDF.
As the title of this article suggests, not too much was left untouched. Pat discussed products that affect almost every market except mobile, including Dunnington, the six-core server-bound processor; Larrabee, the discrete graphics processor; and Nehalem, the chip that many readers of this article are looking forward to. In this brief article, we’ll touch on all of these and also delve into a few of the technologies that are supported by these upcoming processors.
Nehalem – Seriously Scalable
We have been hearing about Nehalem since last spring, so it was great to finally learn more about it. Given that the new architecture is due to hit late this year, more information couldn’t have come at a better time. As we’ve known for a while, Nehalem is a brand-new, completely modular architecture, meaning that many different configurations can be built from the same blocks.
Nehalem will be pushing Quad-Core harder than ever, and the initial offerings look to offer that configuration exclusively. Each Nehalem processor will include an L3 cache shared among all of the cores, along with an integrated memory controller (IMC) and QuickPath Interconnect (QPI). Once the Octal-Core chip is offered, it will include two QPI links to speed up communication between processors, which should be faster than ever.
As it appears, the only limit to Nehalem is what can fit on the die. Integrated graphics can also be added, as long as there is enough die space to support it. The above figure is not an accurate representation of the die and module sizes, so even though the Quad-Core doesn’t look like it would support integrated graphics, it will. This is one of the biggest features of Nehalem, after all. However, “iGraphics” may not be available directly at launch.
Aside from what’s been mentioned so far, Nehalem will also introduce new uArch enhancements, such as increased parallelism, faster “unaligned” cache accesses, a second-level TLB hierarchy and a second-level branch predictor.
The increased parallelism comes from a larger out-of-order window, with the cores’ buffer sizes increased as well to ensure that they do not become a bottleneck.
Multi-Threading is also making a comeback, and this time it should prove more effective than it was in the previous generation. Each core will be able to execute two threads at once, enabling a total of eight on a Quad-Core and sixteen on an Octal-Core. With that many threads in flight, bottlenecks can occur easily and cause application lag, but thanks to other architectural improvements, such as much higher memory bandwidth and lower latencies, only improvements should be seen over previous generations.
Depending on the workload, multi-threading could deliver performance increases of 20 to 30%, although specific scenarios were not supplied. Multi-threading does raise the power envelope, but the performance gains should hopefully outweigh the higher power draw.
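As a quick sanity check on the thread counts quoted above, here is a minimal Python sketch. The function name and the default of two threads per core are illustrative assumptions taken from the figures in this article; none of it is an Intel API.

```python
def logical_threads(cores: int, threads_per_core: int = 2) -> int:
    """Return the number of hardware threads a chip exposes.

    threads_per_core defaults to 2, the figure quoted for Nehalem's
    multi-threading in this article (an assumption, not a spec lookup).
    """
    if cores < 1 or threads_per_core < 1:
        raise ValueError("cores and threads_per_core must be positive")
    return cores * threads_per_core


# Hypothetical labels for the two configurations discussed above.
for name, cores in (("Quad-Core", 4), ("Octal-Core", 8)):
    print(name, "->", logical_threads(cores), "threads")
```

The operating system would see each of those threads as a separate logical processor, which is why a Quad-Core part shows up with eight.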
One of the largest benefits of Nehalem will be the integrated memory controller (IMC), which will support DDR3 exclusively. It’s unknown at this point whether motherboard manufacturers will be able to opt to include DDR2 support on their boards, but it may not be entirely necessary. By the time Nehalem hits the market, DDR3 prices should have gone down substantially, and they should only continue to fall as DDR3 adoption picks up, thanks in part to this launch.
On the desktop side of things, Nehalem will be offered in both single-socket and dual-socket configurations, with three memory channels per processor. That would allow up to six DIMMs on a single processor and twelve in a dual-processor configuration. Without a doubt, no one will need to go hungry for more memory.
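The DIMM counts above work out to two DIMMs per memory channel. That per-channel figure is inferred from the numbers quoted here (six DIMMs across three channels) rather than stated in the briefing, so treat it as an assumption in this small Python sketch; the constant and function names are hypothetical.

```python
CHANNELS_PER_PROCESSOR = 3  # quoted in the briefing
DIMMS_PER_CHANNEL = 2       # assumption: inferred from 6 DIMMs / 3 channels


def max_dimms(sockets: int) -> int:
    """Upper bound on DIMM slots for a board with the given socket count."""
    if sockets < 1:
        raise ValueError("sockets must be positive")
    return sockets * CHANNELS_PER_PROCESSOR * DIMMS_PER_CHANNEL


for sockets in (1, 2):
    print(sockets, "socket(s):", max_dimms(sockets), "DIMMs")
```

Actual boards may well ship with fewer slots than this upper bound, since that is a motherboard design decision.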
The next “Tock” after Nehalem, the 32nm Sandy Bridge, is expected in late 2009 or sometime during 2010.
QuickPath Interconnect – The Tech Formerly Known As CSI
Beginning with Nehalem and Tukwila, the common Front-Side Bus will be replaced with QuickPath Interconnect, a high-speed point-to-point interconnect built into the processor that, alongside the integrated memory controller, connects the CPU(s) with each other and with other components.
One of the main benefits of QPI is that it’s integrated right onto the processor itself, and because the IMC is as well, it allows for much faster transactions. This matters because the improved performance of these chips could otherwise expose bottlenecks, which are less likely to appear with this configuration.
In dual-processor systems, each processor will have its own dedicated memory and caches, and because each CPU will include an IMC, memory bandwidth should increase dramatically; Intel claims up to 4x what we currently see. If one processor needs to access memory attached to the other, it can do so at very fast speeds through the QPI.
The main point to take away from QPI is that it’s much more efficient than the typical FSB, and given the technical aspects, the gains should be huge. With the IMC and QPI in the processor, the interconnect links will be incredibly fast, improving bandwidth all around while reducing latency.
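To make the local-versus-remote distinction concrete, here is a toy Python model of a two-socket system. It only counts whether a memory request has to cross the QPI link; it does not model real latencies or bandwidth, and every name in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryRegion:
    name: str
    home_cpu: int  # socket whose IMC these DIMMs are attached to


def qpi_hops(requesting_cpu: int, region: MemoryRegion) -> int:
    """QPI link crossings in a two-socket system: 0 if local, 1 if remote."""
    return 0 if requesting_cpu == region.home_cpu else 1


regions = [MemoryRegion("cpu0_dimms", 0), MemoryRegion("cpu1_dimms", 1)]

for cpu in (0, 1):
    for region in regions:
        hops = qpi_hops(cpu, region)
        path = "local IMC" if hops == 0 else "remote, via QPI"
        print("CPU" + str(cpu), "->", region.name + ":", path)
```

In this simplified picture, a request for local memory goes straight through the processor’s own IMC, while a request for the other socket’s memory takes one trip across QPI first.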
Without a doubt, QPI isn’t something that should be taken lightly. It should dramatically increase performance all around, and I cannot wait to test out the performance benefits first-hand. The QPI alone might be one of the biggest things about Nehalem to get excited over.