CPU-boundedness of the video system Part II – Effect of the CPU cache memory and memory speed
The effect of CPU cache memory size
As the heading suggests, the next question is how much the CPU cache size influences platform performance in 3D applications. Numerous tests show that performance in 3D games does not depend much on the CPU cache size. Today we are ready to move from intuition to figures and present a quantitative estimate of that effect.
The difficulty is that we cannot arbitrarily change the size of the CPU cache, so the only way to solve the problem is to compare two processors that differ only in cache size, with all other parameters unchanged.
Our "master" processor, which we have used so far in preparing this review, is the Athlon 64 4000+. It offers 128 KB of L1 cache and 1024 KB of L2 cache. For comparison we could have taken the Athlon 64 3800+, which has the same L1 cache and half the L2 cache (512 KB), but we decided to go even further.
For comparison, we will examine the Sempron Socket 754 family of processors, which is quite popular in the mainstream sector. The Sempron we use is rated 3400+ and offers a clock speed of 2000 MHz, 128 KB of L1 cache, and 256 KB of L2 cache. That is, its L2 cache is one quarter the size of that in the Athlon 64 4000+.
As for the correctness of the experiment, Sempron processors also support lowering the multiplier, so we will use the same methodology for building the CPU-boundedness graph as we did with the Athlon 64 4000+. The Sempron s754 differs from the Athlon 64 s939 in that it supports only single-channel memory mode and has a smaller cache. We will put the resulting "line of maximum possible results" on the same graph where we compared the performance of Athlon 64 platforms with various RAM types. For the Sempron s754 platform, we used DDR400 Single Channel.
So, what do we see now? The results of the Sempron s754 with DDR400 Single Channel almost reproduce those of the Athlon 64 with DDR200 Dual Channel. Surprisingly, the Sempron s754 shows decent performance in 3D games and does not lag far behind its elder brethren.
You might ask: that's fine, but where does the cache size come in, and how do we estimate its effect on platform performance? It's very simple: let's remove everything unnecessary from the graph above and take a closer look. On the graph below, we have left only the two lines that correspond to the Sempron s754 and Athlon 64 DDR400 Single Channel platforms. Note that the memory speed and operating mode are the same for both platforms, so the only difference is the size of the processor cache.

As you can see, despite the 4-fold difference in L2 cache size, the Sempron shows results merely 10-12% worse than the Athlon 64 running at the same clock speed. (Note: the results for the Sempron start at 2000 MHz, since that is the maximum nominal speed for this processor; overclocking would have changed the system bus speed and memory speed and therefore distorted the results.) The graph also implies that for an Athlon 64 with 512 KB of L2 cache the "line of maximum possible results" will take an intermediate position, that is, its gap to the fully cached Athlon 64 will be smaller still.
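The percentage gap quoted above is simply the relative difference between two results taken at the same clock speed. A minimal sketch of that calculation follows; the function names and the FPS figures in it are hypothetical placeholders chosen for illustration, not measurements from our graphs.

```python
def relative_lag_percent(reference_fps, candidate_fps):
    """Return how far candidate_fps falls below reference_fps, in percent."""
    if reference_fps <= 0:
        raise ValueError("reference_fps must be positive")
    return (reference_fps - candidate_fps) / reference_fps * 100.0


def lag_by_clock(reference, candidate):
    """Compare two {clock_mhz: fps} mappings at the clock speeds both share."""
    shared = sorted(set(reference) & set(candidate))
    return {
        mhz: relative_lag_percent(reference[mhz], candidate[mhz])
        for mhz in shared
    }


# Hypothetical FPS values, NOT taken from the article's graphs.
athlon64 = {1600: 100.0, 1800: 110.0, 2000: 120.0, 2200: 130.0}
sempron = {1600: 89.0, 1800: 98.0, 2000: 106.0}

for mhz, lag in lag_by_clock(athlon64, sempron).items():
    print(f"{mhz} MHz: Sempron is {lag:.1f}% behind")
```

Only clock speeds present for both processors are compared, which mirrors the reason the Sempron line stops at its nominal 2000 MHz.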
Therefore, for processors of the AMD K8 architecture, increasing the cache size has an insignificant effect on overall platform performance in 3D games.
But what will happen if we install a sufficiently powerful video card, such as a 7900GT, on the Sempron platform and enable the 1280×1024 4AA/16AF mode? We chose the 1280×1024 resolution because it is the "native" resolution for most 17" and 19" LCD monitors, whereas for most CRT monitors of the same sizes the recommended resolutions are 1024×768 and 1280×1024, respectively. We made the graphics mode more demanding by enabling anisotropic filtering and full-screen antialiasing in order to demonstrate that a value processor is no reason to give up high-quality graphics.

As the graphs show, both the "line of maximum results" and the CPU-boundedness curve for the 1280×1024 4AA/16AF mode lie lower for the Sempron than the respective lines for the Athlon 64. Such behavior is quite normal, since the game is fairly old and the video card used in the tests is powerful. Therefore, for both CPUs at the specified clock speeds in the 1280×1024 4AA/16AF mode we get a transitional region rather than a "shelf". Even so, the Sempron s754 at 1600 MHz (the clock speed of the lower-end models in this family) is quite capable of showing a result as high as 70 FPS. Of course, alongside Sempron models with 256 KB of L2 cache there are products with 128 KB of L2 cache. But as shown above, the CPU cache size has an insignificant effect on overall performance. Even if we deduct an extra 10% from the results for the Sempron s754 platform shown on the graph, the performance of even the lowest Sempron models, at clock speeds as low as 1600 MHz, would be sufficient to provide over 60 FPS!
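The arithmetic behind that last estimate is easy to check. The sketch below takes the 70 FPS figure and the 10% deduction from the text; the function name is our own illustrative choice, and the 10% penalty for a 128 KB L2 cache is an assumption carried over from the comparison above, not a measured value.

```python
def fps_after_penalty(measured_fps, penalty_percent):
    """Scale a measured frame rate down by a percentage penalty."""
    return measured_fps * (1.0 - penalty_percent / 100.0)


measured = 70.0  # FPS quoted in the text for the Sempron s754 at 1600 MHz
penalty = 10.0   # assumed extra loss for a smaller L2 cache, in percent
target = 60.0    # frame rate we want to stay above

estimate = fps_after_penalty(measured, penalty)
print(f"{estimate:.0f} FPS estimated, above {target:.0f} FPS: {estimate > target}")
```

With these inputs the estimate comes to 63 FPS, which leaves a small margin over the 60 FPS mark.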
Of course, both Half-Life 2 and DOOM 3 are rather old games. You may object that in modern games the Sempron is weak and performance will be limited by the power of the CPU. Let's verify that using F.E.A.R., a game that is very demanding on system resources.

As you can see, when we build the "line of maximum results", the Sempron platform still lags a bit behind the Athlon 64 at the same CPU clock speed (as it should), but once we enable the quality mode, performance is immediately limited by the video card!
As for the Sempron Socket AM2 processors, we did not test them while preparing this article. But proceeding from the above, we can assume that the performance difference in 3D games between Athlon 64 AM2 and Sempron AM2 processors will be even smaller, since Sempron AM2 processors offer a dual-channel memory controller just like Athlon 64 AM2 processors. We have to admit that on the Socket AM2 platform the research into the effect of CPU cache size could have been conducted with less effort. However, as you can see, we were able to do it by comparing the Socket 939 and Socket 754 platforms.
This does not mean that we have decided to confine ourselves to the Socket 939 and Socket 754 platforms. Next in line is the Socket AM2 platform. The results we obtained, although predicted in theory, are nonetheless impressive.