by Rob Williams on May 20, 2022 in Processors
First seen in its server-bound Milan-X EPYC, AMD’s brought its 3D V-Cache technology to consumers with the new Ryzen 7 5800X3D. With triple the L3 cache vs. the original 5800X, the right workloads could exhibit a notable performance boost. For our first look at the 5800X3D, we’re tackling our usual assortment of workstation performance scenarios.
If the past couple of pages prove one thing, it’s that we probably should have kicked off our 5800X3D coverage with either a gaming or server workload angle, because we’re just not seeing as much benefit from all of the extra cache as we were hoping to.
To wrap up our performance testing in this article, we’re turning to synthetic benchmarks, as run through SiSoftware’s Sandra. Because synthetic tests tend to exercise specific parts of the hardware more directly than standard workloads do, we’re bound to see something interesting. Let’s start out with multimedia and arithmetic:
Multimedia & Arithmetic
We’re not off to the best of starts here. Neither the Arithmetic nor the Multimedia test takes great advantage of the copious amounts of cache provided, so ultimately, a chip with a higher clock speed wins.
Image Processing
Considering the fact that we saw the 5800X3D leap ahead of the 5800X in our Adobe Lightroom test, we had a suspicion that the Sandra Image Processing test might yield similar findings, and sure enough, the 5800X3D does place ahead of the 5800X in the aggregate result.
The above results are a little too simple, however, because that aggregate result involves many individual tests which could reveal different levels of scaling. Here’s a breakdown:
| | 5800X | 5800X3D | 5950X | 12900K |
|---|---|---|---|---|
| Blur | 1,900 | 2,420 | 2,000 | 5,120 |
| Sharpen | 1,210 | 1,140 | 1,770 | 2,140 |
| Motion Blur | 568 | 549 | 986 | 1,000 |
| Edge Detection | 982 | 939 | 1,560 | 1,780 |
| Noise Reduction | 112 | 109 | 195 | 155 |
| Oil Painting | 36 | 35 | 62 | 77 |
| Diffusion | 2,000 | 2,000 | 1,830 | 4,210 |
With these comparisons, we can begin to better understand why the 5800X3D just barely outperforms the 5800X. It’s only with the Blur workload that a real difference is seen between them. Interestingly, that extra cache did help propel the 5800X3D to deliver a much better result in the Blur workload over the 5950X, which has twice the number of cores and threads.
This is the first time we’ve run the CPU-bound Image Processing test, so we were rather impressed to see the results out of the top-end Intel 12th-gen chip. While the 5800X3D delivered a solid Blur result, the 12900K delivers one that’s more than twice as high. The strengths don’t stop there, either, with Sharpen and Diffusion showing major leaps in performance vs. the competition.
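To put numbers on the comparisons above, here is a minimal Python sketch that computes the percentage change of one chip relative to another for each Image Processing sub-test. The scores are copied from the breakdown table; the names `scores`, `percent_change`, and `compare` are ours and purely illustrative.

```python
# Sandra Image Processing sub-test scores, copied from the breakdown table
# (higher is better).
scores = {
    "Blur":            {"5800X": 1900, "5800X3D": 2420, "5950X": 2000, "12900K": 5120},
    "Sharpen":         {"5800X": 1210, "5800X3D": 1140, "5950X": 1770, "12900K": 2140},
    "Motion Blur":     {"5800X": 568,  "5800X3D": 549,  "5950X": 986,  "12900K": 1000},
    "Edge Detection":  {"5800X": 982,  "5800X3D": 939,  "5950X": 1560, "12900K": 1780},
    "Noise Reduction": {"5800X": 112,  "5800X3D": 109,  "5950X": 195,  "12900K": 155},
    "Oil Painting":    {"5800X": 36,   "5800X3D": 35,   "5950X": 62,   "12900K": 77},
    "Diffusion":       {"5800X": 2000, "5800X3D": 2000, "5950X": 1830, "12900K": 4210},
}


def percent_change(new, base):
    """Return how much `new` differs from `base`, in percent."""
    return (new - base) / base * 100.0


def compare(chip, baseline):
    """Map each sub-test to the percent change of `chip` over `baseline`."""
    return {
        test: round(percent_change(result[chip], result[baseline]), 1)
        for test, result in scores.items()
    }


if __name__ == "__main__":
    for test, delta in compare("5800X3D", "5800X").items():
        print(f"{test:<16} {delta:+.1f}%")
```

Run as-is, this shows Blur as the only sub-test where the 5800X3D gains on the 5800X (roughly 27%), with Diffusion tied and the rest a few percent in the 5800X’s favor. Calling `compare("12900K", "5800X3D")` gives the same view against the Intel chip.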
Cryptography
Pure speed and core count matter more than cache for these crypto workloads, so once again, the 5800X places just ahead of the slightly lower-clocked 5800X3D.
Memory Bandwidth
Memory bandwidth is hugely impacted by clock speed, which is the reason the 5900X places ahead of the 5950X, and the 5800X3D falls just behind the 5600X. A drop of 1GB/s in memory bandwidth isn’t what we’d consider an issue, but it helps highlight the fact that sometimes, a lower-end part that happens to be clocked higher can deliver better results.
Cache Bandwidth
| | 5800X | 5800X3D | 5950X | 12900K |
|---|---|---|---|---|
| L1D | 1,930 | 1,830 | 3,150 | 2,760 |
| L2 | 994 | 942 | 1,580 | 237 |
| L3 | 416 | 516 | 645 | 131 |
In the overall result, this is one of the rare times in this article where the 5800X3D finds itself ahead of the 5800X. In the breakdown table above, we can see why: despite its L1 and L2 results being lower than the 5800X’s, the L3 result propels it ahead. Ultimately, these numbers are difficult to translate into real-world performance, especially when you look at Intel’s seemingly paltry L2 and L3 results and consider that the 12900K outperforms the 5950X in many tests.
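One rough way to read the breakdown is to set each cache level’s 5800X3D-to-5800X ratio next to the ratio of the two chips’ clocks. The Python sketch below does that. Note the assumption: the boost clocks used (4.7GHz for the 5800X, 4.5GHz for the 5800X3D) are AMD’s rated maximums, not clocks measured during this test, so this is a sanity check rather than a precise model. All names are illustrative.

```python
# Sandra cache bandwidth results from the table above (higher is better).
cache_bw = {
    "L1D": {"5800X": 1930, "5800X3D": 1830},
    "L2":  {"5800X": 994,  "5800X3D": 942},
    "L3":  {"5800X": 416,  "5800X3D": 516},
}

# Assumption: rated max boost clocks in GHz from AMD's spec sheets,
# not clocks observed while the benchmark was running.
boost_ghz = {"5800X": 4.7, "5800X3D": 4.5}


def x3d_ratio(level):
    """Return the 5800X3D result divided by the 5800X result for a cache level."""
    return cache_bw[level]["5800X3D"] / cache_bw[level]["5800X"]


clock_ratio = boost_ghz["5800X3D"] / boost_ghz["5800X"]

if __name__ == "__main__":
    print(f"Clock ratio: {clock_ratio:.3f}")
    for level in cache_bw:
        print(f"{level:<4} ratio: {x3d_ratio(level):.3f}")
```

Under that assumption, the L1D and L2 ratios (both about 0.95) land close to the clock ratio (about 0.96), which is consistent with those levels tracking core frequency, while the L3 ratio comes out near 1.24 in the 5800X3D’s favor.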
Inter-Core Efficiency
| | 5800X | 5800X3D | 5950X | 12900K |
|---|---|---|---|---|
| Inter-Thread Latency | 20.5 | 22.4 | 41.9 | 40.2 |
| Inter-Thread BW | 91.7 | 101.3 | 159.4 | 104.7 |
| Per Thread BW | 5.7 | 6.3 | 5 | 4.4 |
| 1x 64bytes Blocks BW | 11 | 10.9 | 22.2 | 9.8 |
| 4x 64bytes Blocks BW | 14.8 | 15.5 | 28 | 20.4 |
| 4x 256bytes Blocks BW | 53.4 | 53.1 | 90.1 | 79.5 |
| 4x 1kB Blocks BW | 174.9 | 171.5 | 288.7 | 306.4 |
| 4x 4kB Blocks BW | 258.3 | 249.6 | 444.6 | 738 |
| 16x 4kB Blocks BW | 408 | 383.2 | 605 | 412.7 |
| 4x 64kB Blocks BW | 499.8 | 476.1 | 800.1 | 435.1 |
| 16x 64kB Blocks BW | 332.9 | 300.7 | 432.2 | 430.3 |
| 8x 256kB Blocks BW | 331.8 | 279 | 417.6 | 247.7 |
| 4x 1MB Blocks BW | 190.9 | 285.7 | 411.8 | 62.8 |
| 16x 1MB Blocks BW | 15.7 | 36.9 | 68.3 | 20.5 |
| 8x 4MB Blocks BW | 13.4 | 18.9 | 15.2 | 19.5 |
As with the cache bandwidth results above, the numbers generated from the inter-core tests can’t easily be translated to real-world performance, but it is still interesting to see the differences from one chip to another within the same architecture.
When comparing the 5800X to the 5800X3D, we see the former’s higher clock speed win out most often with smaller block sizes, and when the going gets tough, that’s when the 5800X3D steps up to the plate. This could be a good hint that if software were better designed to expect chips with lots of cache, we may see the 5800X3D place ahead of the 5800X more often.
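To see where that crossover sits, the Python sketch below takes the block bandwidth rows from the inter-core table and lists the ones the 5800X3D wins, along with the total amount of data per row (block count times block size). It assumes "kB" and "MB" mean 1,024 and 1,024² bytes, which the table doesn’t spell out; the names are illustrative.

```python
# Inter-core block bandwidth rows from the table above:
# (block count, block size in bytes, 5800X result, 5800X3D result).
# Assumption: "kB" = 1024 bytes and "MB" = 1024**2 bytes.
KB, MB = 1024, 1024 ** 2
rows = [
    (1, 64, 11.0, 10.9),
    (4, 64, 14.8, 15.5),
    (4, 256, 53.4, 53.1),
    (4, 1 * KB, 174.9, 171.5),
    (4, 4 * KB, 258.3, 249.6),
    (16, 4 * KB, 408.0, 383.2),
    (4, 64 * KB, 499.8, 476.1),
    (16, 64 * KB, 332.9, 300.7),
    (8, 256 * KB, 331.8, 279.0),
    (4, 1 * MB, 190.9, 285.7),
    (16, 1 * MB, 15.7, 36.9),
    (8, 4 * MB, 13.4, 18.9),
]


def x3d_leads(rows):
    """Return (total bytes, 5800X3D/5800X ratio) for each row the 5800X3D wins."""
    return [
        (count * size, round(x3d / x, 2))
        for count, size, x, x3d in rows
        if x3d > x
    ]


if __name__ == "__main__":
    for total, ratio in x3d_leads(rows):
        print(f"{total:>10} bytes total: 5800X3D at {ratio:.2f}x the 5800X")
```

With these figures, the 5800X3D edges ahead once at 256 bytes total, then leads every row from 4MB total upward, peaking at about 2.35x in the 16x 1MB test. The 5800X holds every other row up to 2MB total.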