by Rob Williams on March 28, 2007 in Processors
Intel today unveiled new information regarding their upcoming Penryn and Nehalem lineup. We are taking a look at what was discussed, including ballpark performance increase figures, benefits of the SSE4 instruction set and additional facts about their integrated memory controller and graphics processor.
Since the beginning of the year, much information has been unveiled about Penryn and Nehalem, including the fact that they will both utilize the new Hi-K MG transistors for better efficiency. We also know that production of Penryn will begin in the second half of this year in two of their fabs, with four fabs churning out the goods by the second half of next year.
Today, Intel went into more depth about what we should be excited about. Smaller processes are nice, but there is a lot more going on behind the scenes than a simple scale-down to 45nm. Penryn and Nehalem build upon what makes the Core architecture so great, but we will be seeing far more improvement than what’s obvious. One point that was stressed constantly was that while Penryn is a derivative of the Core architecture, Nehalem is built from the ground up. Intel notes that Nehalem is the biggest architecture change since the Pentium Pro, which was first introduced in late 1995.
The High-K Metal Gate transistors are an upgrade worth mentioning, but I won’t go into much detail today since all the information you need to know has been available since January. Essentially Hi-K MG transistors allow for far more efficiency, faster switching speeds and increased instructions per clock. Lower current leakage is another large benefit that should in turn lower power consumption and perhaps TDP.
Many people have questioned Intel’s Core architecture, claiming that it could be a one-hit wonder. Intel believes that once you see Penryn and Nehalem in action, you will become a true believer.
Penryn
Penryn is the architecture we will see by the end of the year, with production in the late summer. It will span the entire market, including servers, desktops and mobile computers. We should be seeing the official launch of the server CPUs before the end of the year, but will not be able to get our hands on desktop chips until sometime in early 2008. It also appears that 45nm Extreme Editions will be Quad-Core only, as the roadmap didn’t note any Dual-Core EE models.
As I mentioned, Penryn is a derivative of the Core architecture, but with many improvements. The move from 65nm to 45nm gives greater flexibility while retaining the same thermal envelope. That’s to say that even though Penryn CPUs might still be 65W, the various improvements give them more performance for the same TDP. While the current top-end Dual-Core CPU sits at 2.93GHz, for example, Penryn might allow a higher clock and FSB speed while retaining the same TDP. Intel would not touch on possible frequency ratings.
These new microarchitectures also bring SSE4 to the table, a set that will bring over 50 new instructions. It was first believed that Nehalem would be the only one to benefit from SSE4, but the information displayed today confirms that we will be seeing it in Penryn as well. SSE4 is not only beneficial for media buffs and code monkeys, but it also has potential gaming benefits.
Super Shuffle Engine is another new term being tossed around. It refers to a faster hardware unit for shuffling, packing and aligning data within SSE registers, and it promises to cut the time those operations take in half, or better. One benefit is that no software changes are required to take advantage of this technology. True benefits will be discovered once we have a CPU in our hands, but Intel is very confident that the potential improvements are worth getting excited over.
Along with Penryn comes a new power state, tentatively called “Deep Down Power State”. There are five different C-states, including this new one, with C0 being the normal running state. C1, C3 and C4 each shut down various portions of the CPU in order to use less power while in that state. Deep Down comes closest to C4, which turns off the core clock and PLL and flushes the caches. Deep Down goes a step further and turns off the caches entirely, which allows the core voltage to drop even lower. When compared to C4, Deep Down (C6) uses about 300% less power.
The downside is that it will take longer to come out of C6. C1 and C3 are rather mild, so coming out of them won’t take more than a few microseconds, while C4 takes a few microseconds longer. C6 will take twice the time, so that’s the price you pay for a lower energy bill.
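To make the trade-off concrete, here is a minimal Python sketch of how a system might choose between these states: pick the lowest-power state whose wake-up time still fits within a latency budget. Every number in the table is made up for illustration (Intel gave no such figures), and `IdleState`, `STATES` and `pick_state` are hypothetical names. Only the ordering, and C6 taking roughly twice as long to exit as C4, follows the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleState:
    name: str
    relative_power: float  # arbitrary units, lower is better (illustrative only)
    exit_latency: float    # arbitrary units, lower is better (illustrative only)


# Hypothetical values: deeper states save more power but wake up more slowly.
STATES = [
    IdleState("C0", 100.0, 0.0),  # normal running state
    IdleState("C1", 60.0, 1.0),
    IdleState("C3", 30.0, 2.0),
    IdleState("C4", 12.0, 4.0),
    IdleState("C6", 3.0, 8.0),    # assumed to take twice as long to exit as C4
]


def pick_state(latency_budget: float) -> IdleState:
    """Return the lowest-power state whose exit latency fits the budget."""
    candidates = [s for s in STATES if s.exit_latency <= latency_budget]
    if not candidates:
        raise ValueError("latency budget must be non-negative")
    return min(candidates, key=lambda s: s.relative_power)


if __name__ == "__main__":
    for budget in (0, 3, 5, 8):
        print(budget, pick_state(budget).name)
```

With a tight budget the sketch stays in a shallow state, and it only drops into C6 when the system can afford the longer wake-up.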
Another new power-related feature is Enhanced Dynamic Acceleration Technology. Essentially, this can boost the frequency of one core for a single-threaded application if the other core is unused. During the process, the TDP remains the same, but the performance for that single-threaded application should increase. Think of this as a Turbo mode without the need to push a button on your beige tower.
Penryn will also include a faster Radix-16 divider. Previously, a Radix-2 or Radix-4 divider was used, but Radix-16 delivers far faster floating-point and integer division. One example given was a square root calculation, which completed in 33% of the time it would take with a Radix-2 or Radix-4 divider. Square root is not the only area that benefits, but it’s the one with the greatest improvement.
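Why does the radix matter? A divider that works digit by digit produces log2(radix) quotient bits per step, so a higher radix needs fewer steps for the same operand width. The Python sketch below is plain long division in a chosen radix, written only to show the step counts. It is not Intel’s circuit, and the `divide` function and its parameters are my own illustration.

```python
def divide(dividend: int, divisor: int, bits: int = 32, radix: int = 2):
    """Long division that produces one radix-`radix` quotient digit per step.

    Returns (quotient, remainder, steps).
    """
    if divisor <= 0 or dividend < 0 or dividend >= 1 << bits:
        raise ValueError("expects 0 <= dividend < 2**bits and divisor > 0")
    if radix < 2 or radix & (radix - 1):
        raise ValueError("radix must be a power of two")

    bits_per_step = radix.bit_length() - 1   # 1 for radix 2, 2 for 4, 4 for 16
    steps = -(-bits // bits_per_step)        # ceiling division

    quotient = 0
    remainder = 0
    for step in range(steps - 1, -1, -1):
        # Bring down the next digit of the dividend, most significant first.
        digit_in = (dividend >> (step * bits_per_step)) & (radix - 1)
        remainder = remainder * radix + digit_in
        # Real hardware selects this digit with comparators or a lookup table.
        q_digit = remainder // divisor
        remainder -= q_digit * divisor
        quotient = quotient * radix + q_digit
    return quotient, remainder, steps


if __name__ == "__main__":
    for r in (2, 4, 16):
        print(r, divide(1000, 7, bits=32, radix=r))
```

For a 32-bit operand that is 32 steps at Radix-2, 16 at Radix-4 and 8 at Radix-16. The step count alone does not reproduce Intel’s 33% figure, which is their own measurement, but it shows where the speed-up comes from.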
New features aside, both Penryn and Nehalem offer better numbers. Dual-Cores will boast 6MB of L2 cache, compared to 4MB today, and Quad-Cores are bumped to 12MB from the current 8MB. The front-side bus has also been pumped up to 1600MHz. Although Intel didn’t want to state that new CPUs might go beyond 3.0GHz, they are likely to with such an FSB, unless the multipliers are reduced substantially. Die sizes are of course decreased as well. While Core CPUs had a die size of 143mm^2, Penryn will be 107mm^2.
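To put rough numbers on that, here is a quick Python sketch of the arithmetic. It assumes the FSB is quad-pumped, as on current Core 2 chips, so a 1600MHz rating means a 400MHz base clock. The multipliers shown are purely illustrative, since Intel announced none, and the helper names are my own.

```python
def core_clock_mhz(fsb_rated_mhz: float, multiplier: float, pumps: int = 4) -> float:
    """Core clock = (rated FSB / transfers per clock) * multiplier."""
    base_clock = fsb_rated_mhz / pumps  # assumption: quad-pumped bus
    return base_clock * multiplier


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Illustrative multipliers only; none were announced.
    for mult in (7.5, 8.0, 9.0):
        print(f"1600MHz FSB x {mult}: {core_clock_mhz(1600, mult):.0f}MHz")

    print(f"Dual-Core L2: {percent_change(4, 6):+.0f}%")
    print(f"Quad-Core L2: {percent_change(8, 12):+.0f}%")
    print(f"Die size:     {percent_change(143, 107):+.1f}%")
```

On a 400MHz base clock, a multiplier of 7.5 already lands exactly on 3.0GHz, which is why anything higher pushes past it. The cache bump works out to 50% for both Dual- and Quad-Core parts, and the die shrinks by roughly a quarter.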
Let’s take a quick look at what Nehalem will bring to the table.