
CPU-boundedness of the video system Part I - Analysis

Author: Dmitry Sofronov
Date: 26.07.2007

"Measurements" of CPU-boundedness

Test setup
Bus: PCI-E
CPU: AMD Athlon64 4000+
MB: ASUS A8N-SLI Deluxe
Memory: Kingston HyperX PC3200 2x512 MB
OS: WinXP + SP2 + DirectX 9.0c
PSU: Hiper 525W

By varying the CPU multiplier, we obtained the following set of CPU clock speeds (in MHz): 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400.
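The relation between the multiplier and the clock speed can be sketched in a few lines. The article does not state the reference clock or the multipliers used, so both are assumptions here: a 200 MHz reference clock, under which multipliers 5 through 12 would reproduce the listed speeds.

```python
# Assumption (not stated in the article): a 200 MHz reference clock.
REFERENCE_CLOCK_MHZ = 200


def clock_speeds_mhz(multipliers, reference_mhz=REFERENCE_CLOCK_MHZ):
    """Return the CPU clock speed in MHz for each multiplier."""
    return [multiplier * reference_mhz for multiplier in multipliers]


# Hypothetical multipliers 5..12, chosen to match the speeds listed above.
speeds = clock_speeds_mhz(range(5, 13))
print(speeds)
```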

To start with, we take results in Half-Life 2 at 1024x768 in the "maximum details" mode, but with antialiasing and anisotropic filtering disabled. There is no contradiction here: the "maximum details" settings determine the image quality, while disabling AA/AF yields FPS values that do "rest against" the CPU performance. We plot the results on a graph, with the X axis denoting the CPU clock speed and the Y axis showing the video card performance in FPS.


Graph 1

As a result, we get a curve that closely resembles a straight line. That is just how it should be: if the video subsystem performance is not the limiting factor, the results are proportional to the CPU clock speed. Let me explain why. Let's see how a computer draws an image in the general case. For clarity, here is a drawing.


Figure 1

As you know, every 3D object is defined by a model made up of polygons, which are elementary geometric objects. In shaping every frame, the CPU calculates the number of objects, their location in space, the light sources, etc. That is, the frame is first built in a "skeleton" representation (in the drawing, it is a kettle made of "wires"). Then this "skeleton", along with the information on how it should be "painted", is passed to the video adapter. Finally, once all the required textures, light sources, and shadows have been applied to the skeleton, we get the resulting image that we see on the monitor.

That is, the image is drawn in two main stages. The first stage, drawing the skeleton of the frame, is done by the CPU. The second stage, "painting the skeleton", is done by the video adapter.

Therefore, when the video subsystem performance (the speed of "painting") is more than enough, the number of frames per second is limited by the number of "skeletons" the CPU is able to process, that is, it is proportional to the CPU performance. Certainly, this example is rather schematic, and the real pattern of load distribution between the CPU and the video card is more complex (that is why, in the general case, "the line of maximum results" does not have to be a straight line).
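The proportionality can be illustrated with a minimal sketch. It assumes, as a simplification that the article itself calls schematic, that every frame "skeleton" costs the CPU the same number of cycles. The function name and the cycle count are hypothetical and chosen only for illustration.

```python
def cpu_limited_fps(clock_mhz, cycles_per_frame):
    """FPS when only the CPU limits the result.

    Assumes every frame "skeleton" costs the same number of CPU cycles.
    """
    cycles_per_second = clock_mhz * 1_000_000
    return cycles_per_second / cycles_per_frame


# Hypothetical cost of one frame "skeleton", for illustration only.
CYCLES_PER_FRAME = 16_000_000

for clock_mhz in (1000, 2000):
    fps = cpu_limited_fps(clock_mhz, CYCLES_PER_FRAME)
    print(clock_mhz, "MHz ->", fps, "FPS")
```

Under this assumption, doubling the clock speed doubles the FPS, which is what a straight line through the origin expresses.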

Now we can state the physical meaning of the line on Graph 1: it shows the maximum number of frames per second the CPU can process at a given clock speed. In other words, it is the upper boundary of the results achievable for the given application on the given CPU under the given test conditions. That is, for each CPU clock speed the line shows a ceiling that we cannot exceed, however much we build up the capacity of the video subsystem.

That is just what the diagram at the beginning of the article demonstrates. Of course, that diagram presents results for the 4AA/16AF mode, but that makes no difference. The upper boundary of ~146 FPS at a CPU clock speed of 2400 MHz also holds for the more powerful system based on Radeon X1900 CrossFire, as can be seen on this diagram.
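As a rough illustration of such a ceiling, the single figure quoted above (~146 FPS at 2400 MHz) can be scaled to the other clock speeds. This is only a sketch: it assumes the line is strictly proportional to the clock speed and passes through the origin, which the measured line only approximates, so the printed values are estimates and not measurements.

```python
# From the article: about 146 FPS at a CPU clock speed of 2400 MHz.
MEASURED_FPS = 146
MEASURED_CLOCK_MHZ = 2400


def cpu_ceiling_fps(clock_mhz):
    """Estimate the CPU-imposed FPS ceiling, assuming strict proportionality."""
    return MEASURED_FPS * clock_mhz / MEASURED_CLOCK_MHZ


# The clock speeds used in the test, 1000 to 2400 MHz in 200 MHz steps.
for clock_mhz in range(1000, 2401, 200):
    print(f"{clock_mhz} MHz: at most ~{cpu_ceiling_fps(clock_mhz):.0f} FPS")
```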

Let's look at Graph 1 once again. You must have noticed that the graph is not built quite "correctly": the CPU clock speed values start not at 0 but at 1000 MHz. We did this intentionally, to make it easier to judge how straight the resulting line is. Now we redraw the graph so that the CPU clock speed values start at 0 MHz, and add results for the resolutions 1280x1024 and 1600x1200, plus three more lines for the same resolutions in the 4AA/16AF mode.


Graph 2

Let's analyze the results. Evidently, increasing the load on the video subsystem (by raising the resolution and enabling the AA/AF modes) should result in a drop in FPS.

That is just what we can see on the graph. Note how the character of the lines varies. The "easiest" of the modes presented here, 1024x768 with no AA/AF, is depicted by an almost straight line. As more load is applied to the video subsystem, the lines "bend down" smoothly toward the X axis in the right-hand part of the graph, at high CPU clock speeds, but in the left-hand part they preserve the characteristic slope and almost merge into the slanted straight line (line 2). For the most "demanding" mode, the line of results becomes parallel to the X axis at high clock speeds (line 1).

What does it all mean? When the CPU performance is insufficient, the results practically do not depend on how "hard" the graphics mode is and are bounded only by the CPU performance (the slanted line). When the performance of the video subsystem is insufficient, the results at some point stop depending on the CPU clock speed (the horizontal line on the graph). The explanation is very simple: the video adapter processes only as many frames as it is able to "paint", although the CPU is able to draw many more "skeletons".
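The two regimes described above can be captured by an idealized model in which the result is the lower of two ceilings: a slanted CPU line and a horizontal video subsystem line. The sketch below is an assumption-laden simplification: the real curves bend smoothly instead of breaking at a sharp corner, and the function names, slope, and FPS limit are hypothetical values picked for illustration.

```python
def expected_fps(clock_mhz, cpu_fps_per_mhz, gpu_fps_limit):
    """Idealized result: the lower of the CPU and video subsystem ceilings."""
    cpu_ceiling = cpu_fps_per_mhz * clock_mhz
    return min(cpu_ceiling, gpu_fps_limit)


def limiting_component(clock_mhz, cpu_fps_per_mhz, gpu_fps_limit):
    """Name the component that bounds the result at this clock speed."""
    if cpu_fps_per_mhz * clock_mhz < gpu_fps_limit:
        return "CPU"
    return "video subsystem"


def knee_clock_mhz(cpu_fps_per_mhz, gpu_fps_limit):
    """Clock speed at which the slanted line meets the horizontal line."""
    return gpu_fps_limit / cpu_fps_per_mhz


# Hypothetical values, for illustration only (not taken from the graphs).
CPU_FPS_PER_MHZ = 0.05  # slope of the slanted line
GPU_FPS_LIMIT = 80.0    # height of the horizontal line for one graphics mode

for clock_mhz in (1000, 2400):
    fps = expected_fps(clock_mhz, CPU_FPS_PER_MHZ, GPU_FPS_LIMIT)
    bound = limiting_component(clock_mhz, CPU_FPS_PER_MHZ, GPU_FPS_LIMIT)
    print(f"{clock_mhz} MHz: {fps:.1f} FPS, limited by the {bound}")
```

In this model, a heavier graphics mode lowers `GPU_FPS_LIMIT`, which moves the meeting point of the two lines to a lower clock speed; below that point the result follows the slanted line regardless of the mode.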

However, a few more very interesting and important conclusions can be drawn from this graph. That is what we turn to next.

Content:

  • Page 1 - Background
  • Page 2 - Problem statement
  • Page 3 - Research into the CPU-boundedness
  • Page 4 - A criterion for correct comparison of video cards performance
  • Page 5 - Endorsement of the theory. First practical results

    Copyright © 2002-2010 3DNews.Ru All Rights Reserved.
    contact - info@digital-daily.com
    Digital-Daily - English-language version of the popular Russian web-project 3DNews