by Rob Williams on March 15, 2021 in Processors
AMD has just launched its third generation EPYC server processor series, also known as Milan. This EPYC update brings AMD’s Zen 3 architecture to the data center, with its improved efficiency, faster performance, and bolstered security. With some of the new chips in-hand, we’re going to explore how AMD’s latest chips handle our most demanding workloads.
The previous pages have been focused on specific scenarios, like science, rendering, and compiling, so to help wrap up this initial EPYC third-gen performance look, this page is going to revolve around some more general scenarios, like compression and memory bandwidth.
We also have some more unusual tests on this page, including the largely synthetic NAS Parallel tests, as well as CoreMark, and even John the Ripper. Again, not all workloads are built alike, so it’s good to be thorough and see just how dominant one chip can be.
Yet again, we see an example above of how two similar workloads can behave quite differently. With 7-Zip, the core count matters more than anything, so it’s no surprise that AMD’s chips are glued to the top part of the chart. Interestingly, this gen’s 32-core clock-focused 75F3 performed the same as last-gen’s 64-core 7742, but the newer 64-core options, with their improved design over Zen 2, help propel them far beyond the rest.
With the Zstd compression test, clocks apparently matter quite a bit, as the 32-core 75F3 once again places on top, with both of the current-gen 64-core parts performing about the same (the 7713 wins against the 7763 here, but when the performance level is this similar, the two will flip-flop across subsequent runs).
Chess Engine Performance
Chess engine performance isn’t exactly an important workload for most people who seek out server processors, but chess engines still act as a perfect example of a branching workload, provided you take full advantage of all of the cores a CPU gives you. In the past, we’ve only tested with Stockfish, but we’re glad to have added Crafty this go-around, because you can see that once again, the 32-core 75F3 could make more sense in some cases than an even bigger option. It pays to know your workload.
It’s also worth noting that Crafty is kinder to Intel’s 28-core chips than most of the other tests. It really does prove that no one chip can be great at absolutely everything.
Here’s a chart that AMD’s competition can’t dig too much. AMD’s eight-channel memory controller is powerful, delivering an immense amount of bandwidth at the top-end compared to the Xeon competition. There’s such a stark divide here that you’d imagine Intel’s controller is only quad-channel – but it’s actually six-channel. You can also see the performance uplift from AMD’s own previous generation CPU.
We regret not being able to include Intel’s latest-gen Xeon in here, as the memory support has been bumped to 3200MHz, but that still wouldn’t be enough to bring it that much closer to EPYC. For those with seriously heavy memory bandwidth needs, it’s hard to ignore data like this.
Note that EPYC 7003 CPUs with less than 128MB of cache can use four-channel memory, while all of the SKUs can take advantage of either six- or eight-channel memory.
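To put those channel counts in perspective, here is a rough sketch of the theoretical peak bandwidth per socket at each configuration. It assumes DDR4-3200 memory and a 64-bit (8-byte) data bus per channel; the function name is ours, and the figures are paper ceilings, not the measured results from the chart above.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s for one socket.

    Assumes each channel moves `bus_bytes` bytes per transfer
    (8 bytes for a standard 64-bit DDR4 channel).
    """
    # MT/s * bytes per transfer = MB/s per channel; divide by 1000 for GB/s
    return channels * mega_transfers_per_s * bus_bytes / 1000


# The four-, six-, and eight-channel configurations mentioned above,
# all assumed to run at DDR4-3200
for channels in (4, 6, 8):
    gbs = peak_bandwidth_gbs(channels, 3200)
    print(f"{channels}-channel DDR4-3200: {gbs:.1f} GB/s per socket")
```

Real-world results will always land below these ceilings, but the ratio between configurations helps explain why a six-channel controller struggles to keep pace with an eight-channel one, even at the same memory speed.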
Not even all synthetic benchmarks are built alike. The NAS Parallel (from NASA) suite of tests is in fact synthetic in nature, but is meant to show where one CPU will excel over another. With the Embarrassingly Parallel tests, we see pretty much expected scaling. The new-gen 7713 outperforms the last-gen 7742 ever-so-slightly. In the LU.C test, the higher clocks of the 75F3 help give it a boost over the 7713 and 7742.
Both CoreMark and John the Ripper perform according to our expectations, aside from the fact that the 7713 delivered notably more performance than the last-gen 7742 in JTR. It’s clear over and over throughout these results that AMD’s Zen 3 architecture has benefited a number of workloads to an obvious degree, which is great to see.
As some of our system sensors were not working correctly when we tested the latest EPYCs on AMD’s Daytona platform, we’re only able to include the three new EPYC chips that we’ve tested here via different power testing methods. Our stress test involves a Blender project with an obscene number of iterations, so that any CPU can be taxed for a long time.
The 32-core and 64-core parts use about the same amount of power, which is to be expected, as AMD itself gave both the same top-end TDP. What’s interesting to us, however, is that the 64-core 7713 used significantly less power than the 7763, more so than what the printed TDP differences suggest. Really – 637W for 256 threads is quite impressive. Don’t fret, though: the server will still be just as loud as you’d expect it to be!
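For a quick sense of scale, the 637W figure can be broken down per thread and per core. This is only back-of-the-envelope arithmetic: the helper name is ours, the core count assumes two threads per core (128 cores behind those 256 threads), and the wattage is treated as a single total rather than a CPU-only reading.

```python
def watts_per_unit(total_watts: float, units: int) -> float:
    """Average power draw per thread (or core) for a given total."""
    return total_watts / units


total_watts = 637   # figure quoted above
threads = 256       # figure quoted above
cores = threads // 2  # assumption: two threads per core (SMT)

print(f"{watts_per_unit(total_watts, threads):.2f} W per thread")
print(f"{watts_per_unit(total_watts, cores):.2f} W per core")
```

That works out to roughly 2.5W per thread, or about 5W per core.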