Methodology for testing video cards: 2007. Use of FRAPS
When testing the performance of video cards, the test methodology has to be updated and extended from time to time: new graphics processors and new games are released, and a methodology that met all the requirements yesterday may not give adequate results tomorrow. Today we present our view of a methodology for testing games with the FRAPS utility. FRAPS matters because not all popular games have integrated tools for measuring performance, and such information is certainly worth having. Our colleagues have already tried to use FRAPS to measure performance in games, but their conclusions were discouraging: the scatter of readings from test to test proved too great.
However, despite all the problems with FRAPS, we have already started using it to test video cards in games such as Need for Speed Most Wanted and Carbon, The Elder Scrolls IV: Oblivion, Titan Quest, and others, because we were able to develop a methodology that produces robust, high-quality results with this utility. How? Read on to find out.
FRAPS. A precision tool or a "telescopic hammer"?
Let's first recall what FRAPS is. It is a utility that can measure the speed at which scenes are rendered and calculate the minimum, maximum, and average FPS. In this investigation, we regard these features as the main purpose of the program; we are not going to dwell on its ability to capture screenshots for now. So, FRAPS can measure FPS. Is that a lot or a little? I dare assert that it is quite enough to regard FRAPS as a precise and universal tool for measuring the performance of video cards in games.
Why does the need for utilities like FRAPS arise? Primarily because many games have no integrated tools for measuring performance, and this information is of real interest, especially for popular games. But as we will see below, even the benchmarks integrated into games often show results that have nothing to do with what the user sees while playing. Is that deception of those who buy games and video cards, or a bug on the developers' part?
Now for the shortcomings of FRAPS. You have probably already read articles devoted to using this utility to measure FPS in games. They conclude that, for all its universality, FRAPS is poorly suited as a tool for testers, and they normally give the following reasons.
Poor reproducibility of results
As a rule, this means that the results differ significantly from one test run to another. At first glance, that seems understandable: it is quite difficult to run along the same trajectory several times, and the game environment may vary randomly from run to run and thus introduce errors into the measurements.
Error introduced by the starts/ends of measurements
This cause of "instability" in the measured results is no surprise either. Since measurements with FRAPS have to be started and stopped manually, a precise start and end of each test, and hence stable final results, are out of the question.
Now let's think it over: are all these problems really specific to the FRAPS utility? Hardly. Note that none of the listed arguments concerns measurement errors introduced by the utility itself; they concern errors caused by the method of measurement. Because this method is the only one possible with FRAPS, the errors of the method are often attributed to the utility, which is wrong. You might object: what can be done, then, if there is no other method of measurement? You are right. But even under these conditions it is possible to produce reliable results, with high precision and in much detail. It is true that FRAPS is not very convenient for tests, especially automated ones. But it yields much more interesting results, and with less effort than may seem at first glance. Below we will tell you how to do that. But for now we need...
To verify the FRAPS results versus those produced with benchmarks integrated into games
We first check the results shown by FRAPS against applications that have integrated benchmarks. This is needed to make sure that the results of the integrated benchmarks and those produced by the utility coincide; only then can we state that the results in games without integrated benchmarks will reflect real performance equally well.
For the experiments, we used our regular test configuration
We installed an ASUS 7950GT video card into this test setup. For tests, we used ForceWare 93.71 WHQL drivers.
We took the well-known game F.E.A.R. and ran its integrated performance test with FRAPS running simultaneously. The testing mode: 1280×960, Video Max, 4AA/16AF, Soft Shadows OFF. This is what we got in the end.
FRAPS produced the following results (the test was run 3 times):
[Table: results for each test No. and the average for three tests]
The data was taken from the «… minmaxavg.csv» file generated by the utility. As you can see, despite the "human element", the test results are very close to one another and fit well within the error limits. Surprisingly, FRAPS shows even slightly higher results than the benchmark integrated into F.E.A.R., although the reverse would be more logical, because the utility uses system resources and might slow the system down. Of course, using the same demo scene made it easier to produce robust results. Of the random errors, only the error in the FRAPS start/stop times remained. But if the overall test time is long enough, this error has almost no effect on the final results. We also tested DOOM3, Half-Life 2, and a few other games in the same way and obtained similar results.
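One way to put a number on "very close to one another" is the relative spread of the per-run averages. The sketch below is only an illustration: the function name `relative_spread` is hypothetical, and the three sample values are made up, not the figures we measured.

```python
from statistics import mean


def relative_spread(run_averages):
    """Return (max - min) / mean of the per-run average FPS values, as a fraction."""
    avg = mean(run_averages)
    if avg == 0:
        raise ValueError("mean FPS is zero")
    return (max(run_averages) - min(run_averages)) / avg


# Illustrative numbers only, not measured values.
runs = [50.0, 52.0, 51.0]
print(f"run-to-run spread: {relative_spread(runs):.1%}")
```

A spread of a few percent between manual runs would be in line with the observation that the start/stop error matters little when the test is long enough.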
Conclusion: FRAPS produces quite adequate results, which coincide (within a minor error) with the results of the benchmarks integrated into the games.
Strictly speaking, though, this coincidence should be interpreted as a similarity between the FPS calculation algorithms used by the integrated benchmark and by FRAPS. Looking ahead, we note that FRAPS makes it possible to produce more detailed results, and that these differ considerably from what the integrated benchmark shows.
Let's first take a look at the contents of the FRAPS results file whose name ends with «… fps.csv». Below is a fragment of the file, which contains a single data column: the FPS for each second, starting from the moment the utility was launched.
Averaging these values gives the customary average FPS, and it is just as easy to determine the minimum and maximum FPS for the demo scene being tested. In fact, this data is the basis for the values given in the results file «… minmaxavg.csv». But the detailed record of the FPS at each instant is also useful in that we can easily build an "FPS versus time" graph. This is what it looks like for F.E.A.R. under our test conditions.
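The calculation just described can be sketched in Python. This is a minimal illustration, assuming the per-second log is plain text with one FPS value per line and possibly a header line. The function names `parse_fps_log` and `summarize` are hypothetical, and the sample numbers are made up rather than taken from our measurements.

```python
from statistics import mean


def parse_fps_log(text):
    """Parse a one-column per-second FPS log into a list of floats.

    Assumption: one value per line; non-numeric lines (such as a header)
    are skipped.
    """
    values = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            values.append(float(line))
        except ValueError:
            # Not a number, for example a column header: ignore it.
            continue
    return values


def summarize(values):
    """Return (minimum, maximum, average) FPS for the logged samples."""
    if not values:
        raise ValueError("no FPS samples found")
    return min(values), max(values), mean(values)


# Illustrative data only, not real measurements.
sample_log = "FPS\n40\n60\n80\n"
fps_min, fps_max, fps_avg = summarize(parse_fps_log(sample_log))
print(fps_min, fps_max, fps_avg)
```

The same list of per-second values is all that is needed to plot FPS against time.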
We complemented the resulting graph with a description of the specific scenes in the integrated F.E.A.R. demo. The bold blue line marks the average FPS value that the integrated test produces. As you can see, almost half the time is taken by the shooting scene, and the FPS values in this scene are much lower than the overall average, somewhere below 40 FPS. Running along the corridors, by contrast, does not load the video card as much: over many sections the FPS exceeds 100, and in the end the average FPS comes out at 59. This leads to a very simple conclusion: the integrated F.E.A.R. demo is biased towards scenes meant to show off the "niceties" of the graphics engine (water + fire) and thus produces a somewhat overstated FPS relative to the real gameplay (shooting). Of course, when playing the game for real you cannot avoid walking along corridors. But admittedly it is more important to get a high FPS at moments of enemy attack, whereas there is more than enough graphics power for the corridors. That is, from the gamer's viewpoint, to get an FPS that reflects typical gameplay in this particular game, only the shooting scene should be taken from the integrated benchmark. In our case the average FPS would then equal 42.8. Compare that with the average of 59 produced by the integrated benchmark and feel the difference.
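The difference between the whole-run average and the average over a single scene is easy to reproduce from a per-second log. The sketch below assumes that the scene boundaries are known in seconds. The function name `segment_average` and the sample run are hypothetical and serve only to show the effect.

```python
from statistics import mean


def segment_average(fps_per_second, start_s, end_s):
    """Average FPS over the half-open interval [start_s, end_s) in seconds.

    fps_per_second[i] is the FPS logged for second i of the run.
    """
    if not 0 <= start_s < end_s <= len(fps_per_second):
        raise ValueError("segment is outside the recorded run")
    return mean(fps_per_second[start_s:end_s])


# Illustrative data only: a light "corridor" part, then a heavy "shooting" part.
run = [100, 110, 120, 40, 35, 45]
overall = mean(run)                    # average over the whole run
shooting = segment_average(run, 3, 6)  # average over the last three seconds
print(overall, shooting)
```

A few seconds of very high FPS are enough to pull the overall average well above what the heavy scene actually delivers.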
OK, let's go on. A graph is not the only way to represent the data produced with FRAPS. The next step in our investigation may seem somewhat unnatural and redundant, but don't be hasty: it is a very important step towards intelligent use of FRAPS.
Let's build a diagram of the distribution of FPS values. That is very easy to do. Since the maximum FPS value in our test was 135, let's divide the whole range of FPS values into bins 5 FPS wide and count the number of occurrences in each bin. In the end, we get the following diagram.
The physical meaning of this representation is simple: the height of a bar indicates how long (in seconds) the FPS value stayed within the given range. As you can see, the resulting FPS distribution is highly non-uniform. Most of the FPS values lie in the range from 30 to 70 FPS, while the rest form the "tails". In plain language, for most of the test time the FPS fluctuates between 30 and 70, and it jumps higher only relatively rarely. The diagram also shows two vertical lines: the blue line marks the average FPS produced by the integrated benchmark, and the red line marks the average FPS for the "shooting" scene.
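The binning behind such a diagram can be sketched as follows. This is a minimal illustration that assumes one FPS sample per second, so that a count of samples equals a number of seconds. The function name `fps_histogram` and the sample values are hypothetical.

```python
from collections import Counter


def fps_histogram(fps_per_second, bin_width=5):
    """Count how many seconds the FPS spent in each bin.

    Returns a dict mapping the lower edge of each bin to a number of
    seconds; a value v falls into the bin [edge, edge + bin_width).
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter(int(v // bin_width) * bin_width for v in fps_per_second)
    return dict(sorted(counts.items()))


# Illustrative data only, not real measurements.
run = [32, 34, 36, 41, 44, 58, 61, 135]
for edge, seconds in fps_histogram(run).items():
    # Each '#' stands for one second spent in the bin [edge, edge + 5).
    print(f"{edge:>3}-{edge + 5:<3} FPS: {'#' * seconds} ({seconds} s)")
```

Only non-empty bins are listed here; a plotting tool would normally draw the empty ones as zero-height bars.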
This representation of test results is convenient in that, unlike the FPS graph, it gives quite a compact picture of how the results are distributed. If a test lasts long enough, the graph either becomes extremely stretched or, if kept within the same dimensions, merges into a single solid strip. The distribution diagram is free from this shortcoming, since the maximum FPS value is always bounded from above and we can easily choose the ranges.
Below we will explain what sense such a representation of data makes, but first let's digress and recall some points of applied statistics. Let's take another look at the FPS distribution diagram and ponder which shape of the diagram would be "ideal". Our source data is a set of FPS values. How should they be distributed in the ideal case? Clearly, in reality we are unlikely to get a thin vertical line on the diagram that precisely matches one specific FPS value. The situation in a game changes continuously: there are shots and explosions, and enemies run around the level and scheme. Most likely, the FPS values will lie within some range. As gamers, we want this range (that is, the scatter of FPS) not to be very wide; otherwise we will see lags at very low FPS values. Nor do we need FPS values that are too high: even professional gamers are not able to perceive more than 100 FPS, so 100, 200, or 500 FPS makes no difference. Since the situation in a game scene may change randomly, it is reasonable to assume that the ideal FPS distribution diagram should be symmetrical.
All the above conditions, vague as they are, are met by the Gaussian distribution, or normal distribution. It is no accident that the distribution is called "normal": it occurs widely and is applicable in many fields of science and technology. For more details, see Wikipedia.
The distribution itself looks like this:
The figure shows four graphs for various parameter values. Here μ is responsible for the shift of the distribution graph along the X axis relative to the origin, whereas σ is responsible for the peak width (the range of FPS values, as applied to our investigation).
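For reference, the density of the normal distribution is usually written as follows. This is the standard textbook formula, not something specific to FPS data; μ denotes the shift of the peak along the X axis and σ (the standard deviation) sets the peak width.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)
```

Applied to our case, x would be an FPS value, μ the FPS around which the values cluster, and σ a measure of how widely they scatter.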
You are disappointed, aren't you? The FPS distribution diagram that we produced looks clumsier and does not at all resemble the smooth Gaussian curve. In fact, there is nothing terrible about that. We simply have too few FPS values, so the statistical methods work poorly on such a small set. Where can we get more values? By increasing the test time? There is a better method. At last, we have come to the key point of our review.