| 3DNews Vendor Reference English Resource - All you need to know about your products! |
ForceWare 52.16: NVIDIA's retaliation
Date: 14/12/2003
By: Aleksey Burdyko
Introduction
It is no secret that pixel and vertex shader performance in NVIDIA's products, starting with NV30, has been unacceptably low compared with ATI's products of the same class. In synthetic benchmarks like 3DMark 2003 the problem was "solved" quite well by new drivers with embedded "optimizations", which often degraded image quality or used frankly fraudulent techniques to push the scores up. But with the release of new benchmarks and, most importantly, DirectX 9.0 games that use version 2.0 pixel and vertex shaders, the situation for NVIDIA chips grew increasingly lamentable. We will also look at 3DMark 2003 in today's testing session; its results are especially interesting in view of the newly released patch (v340). Let's recall the new games released so far and the tests conducted by numerous publications, which revealed a hard-to-imagine drop in scenes that make intensive use of version 2.0 pixel and vertex shaders. "Tomb Raider: The Angel of Darkness", "Halo: Combat Evolved", "AquaMark 3", and a beta version of "Half-Life 2" that "escaped" online just in time: in all these games and benchmarks built on real gaming engines, NVIDIA boards were, to put it mildly, not on top, and even flagship models produced results comparable to ATI's mid-range boards (this refers to the tests of the Half-Life 2 beta). That is why NVIDIA boards were dubbed "slow cards" in DirectX 9.0, and rightfully so. We'll try to find the cause of these results in the theoretical section of the review.
Of course, NVIDIA is not sitting idle, and the programmers of this Californian company have released the new ForceWare drivers, which should put an end to the reproaches about low shader performance in the NVIDIA NV3x family (by the way, nine months ago, at the last CeBIT, Alan Tike claimed the Detonator name would live forever). In the new driver, along with the traditional bug fixes and new features, the vertex and pixel shader compilers have been significantly revised, which should increase rendering speed without affecting image quality. It is precisely the new algorithm for handling pixel and vertex programs that prompted the change of the driver's name. So you can safely forget about the Detonator, so often criticized recently =). The new drivers are dubbed ForceWare, and currently the only WHQL-certified version available is 52.16. In the theoretical part of our review, we'll also consider what NVIDIA programmers actually revised in the driver to warrant the change of name. There is certainly a marketing element in the renaming, but as practice shows, NVIDIA has every reason for it.
At first glance, the changes in the new driver's interface are hardly noticeable. The design of the menus has not changed much since the 40th series. There is no conceptually new approach to building the menus, and none is really needed, because the author of this review is perfectly content with the menu layout and GUI of the 40th series.
The antialiasing and anisotropic filtering settings are gathered in the same section and offer three quality modes:
The standard sections haven't been amended.
Now you can set any desired resolution rather than just pick from ready-made presets. This option is genuinely useful.
But the nView section does offer quite substantial changes. The major innovation of nView 3.0 is the "gridlines" feature, which allows partitioning the screen more effectively and conveniently into several independent zones. If you are the lucky owner of a Quadro professional video card, there can be up to 9 such zones, while GeForce family cards offer only 4, which in our view is more than enough.
Also, official support for the most recent chips, GeForce FX 5700, GeForce FX 5700 Ultra and GeForce FX 5950 Ultra, has been introduced.
Of course, we were interested in the driver's other innovations as well, namely the revised "unified" compiler for DirectX 9.0 code. The idea is that the compiler receives instructions in the form of plain DirectX 9.0 code and, in real time, reorders and restructures the commands so that the GeForce FX chip gets reworked code that executes faster than if the commands were fed to the graphics chip "as is". This is illustrated in the following diagram:

Potentially, the compiler can reduce the number of passes required by the code that comes directly from the API, which may in the end improve the accelerator's performance in pixel and vertex shader programs. Note also that image quality does not suffer, since the optimizations do not touch the floating-point precision settings: they merely change the order and structure of the commands, which logically cannot degrade image quality because the requested shader is still executed in full. What changes is that the accelerator always receives code better suited to the architecture of the FX chips.
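To make the idea of reordering without changing the result more concrete, here is a minimal sketch of a toy instruction scheduler. It is not NVIDIA's compiler, whose internals are not described here. The `Instr` type, the register names, and the heuristic of alternating texture fetches with arithmetic instructions are all assumptions chosen purely for illustration; the only property the sketch guarantees is that no instruction is moved ahead of the instruction that produces its input.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    op: str                # mnemonic, e.g. "tex" or "mul" (hypothetical)
    dest: str              # register written; each register is written once
    srcs: Tuple[str, ...]  # registers read
    kind: str              # "tex" (texture fetch) or "alu" (arithmetic)


def reorder(program: List[Instr]) -> List[Instr]:
    """Greedy list scheduling: at every step emit an instruction whose inputs
    are already available, preferring a kind different from the previous one.
    Assumes the input program is already in a valid order."""
    produced_here = {i.dest for i in program}  # registers the shader itself writes
    available = set()                          # registers written so far
    remaining = list(program)
    out: List[Instr] = []
    last_kind = None
    while remaining:
        ready = [i for i in remaining
                 if all(s in available or s not in produced_here for s in i.srcs)]
        # Prefer alternating kinds; otherwise fall back to the original order.
        pick = next((i for i in ready if i.kind != last_kind), ready[0])
        remaining.remove(pick)
        available.add(pick.dest)
        out.append(pick)
        last_kind = pick.kind
    return out


def respects_dependencies(program: List[Instr]) -> bool:
    """True if every register produced by the program is written before it is read."""
    produced_here = {i.dest for i in program}
    seen = set()
    for i in program:
        if any(s in produced_here and s not in seen for s in i.srcs):
            return False
        seen.add(i.dest)
    return True


# A made-up five-instruction shader: two fetches, two multiplies, one add.
shader = [
    Instr("tex", "r0", ("t0",), "tex"),
    Instr("tex", "r1", ("t1",), "tex"),
    Instr("mul", "r2", ("r0", "c0"), "alu"),
    Instr("mul", "r3", ("r1", "c1"), "alu"),
    Instr("add", "r4", ("r2", "r3"), "alu"),
]
optimized = reorder(shader)
print([f"{i.op} {i.dest}" for i in optimized])
```

In this example each multiply ends up directly after the texture fetch it depends on, while the final add still comes last. The set of instructions is unchanged, which is why reordering of this kind cannot alter the computed image.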
Don't think that this idea of optimization has only just occurred to the programmers at NVIDIA. The foundations of the "unified compiler" were laid back in Detonator 44.12, but the idea was not brought to perfection then, so polishing it into a really working technology was left to later drivers of the ForceWare series.
Optimizing the code that enters the GPU is no doubt a good idea, and NVIDIA programmers deserve real credit for it, but the matter of floating-point precision is still open. Since the most recent version of the Microsoft High Level Shader Language allows programmers to choose a floating-point precision when writing code, NVIDIA positions the ability of the GeForce FX architecture to use one of three precision modes (32-bit floating point, 16-bit floating point, and a 12-bit integer mode) as an advantage of its chips. This is hard to deny: why always use 32-bit precision (or 24-bit, in the case of ATI boards) if 16-bit floating-point precision is enough for specific tasks that do not require more? The catch is that it is not always possible to choose the right precision for a particular task, so programmers have to spend much more effort writing and optimizing the code.
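Why the choice of precision depends on the task can be shown with a small sketch. It uses Python's standard `struct` module to round a number to IEEE 754 16-bit and 32-bit floating point. This is only an approximation of what a GPU does: ATI's 24-bit format and NVIDIA's 12-bit mode are not modeled, and the two sample values (a color component and a texture coordinate) are our own illustrative assumptions.

```python
import struct


def quantize(value: float, fmt: str) -> float:
    """Round a value to an IEEE 754 format via the struct module:
    "<e" is 16-bit half precision, "<f" is 32-bit single precision."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def rounding_error(value: float, fmt: str) -> float:
    """Absolute error introduced by storing the value in the given format."""
    return abs(quantize(value, fmt) - value)


color = 0.3        # a color component in [0, 1] (illustrative value)
texcoord = 1000.3  # a position inside a large texture, in texels (illustrative value)

for name, v in (("color", color), ("texcoord", texcoord)):
    print(name,
          "fp16 error:", rounding_error(v, "<e"),
          "fp32 error:", rounding_error(v, "<f"))
```

For the color value, the 16-bit error is far smaller than one step of an 8-bit display channel, so the cheaper format is harmless. For the texture coordinate, 16-bit storage can only represent steps of half a texel near 1000, so the error becomes visible and the higher precision is needed.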
The colorful package, a de facto standard for ASUS FX product line, included the following:


Once again we see a marvelous bundle from ASUS, for which the company has always been notable.
The board's PCB design is completely identical to NVIDIA's reference board. There are no differences in the component layout or even in the positioning of capacitors.


The board itself proved to be quite massive, most likely because of the heavy cooling system with copper heatsinks (see below for details).
The video card has a classic dark green PCB and 128 MB of DDR memory on a 256-bit bus (8 chips, 32 bits each, placed on the front side of the PCB). As we can see, the board carries half the memory of its elder sister, the GeForce FX 5900 Ultra: it is fitted with only 8 memory chips, so the contact pads on the reverse side of the board remain vacant. The video card offers an AGP 2x/4x/8x interface and a standard set of outputs: one DVI-I, one analog, and one TV-out. The signal for digital monitors is produced by the Sil164CT64 TMDS transmitter made by Silicon Image.
There is also a contact pad for a VIVO chip, which is not installed (it is available on the Ultra version of the board). On the front side of the PCB there is also a connector for the additional power required by the power-hungry GeForce FX 5900. You do not have to connect additional power, but without it the card will run at reduced frequencies (250 MHz core, 500 MHz memory). The shortcoming of the additional power connector on the ASUS V9950 is its vertical positioning. First, it is quite difficult to plug in the power cable with the video card already installed in the AGP slot. Second, the fastening of the connector leaves much to be desired.

The memory chips, placed only on the front side, come in BGA packaging. Their access time is 2.2 ns, which corresponds to 454 MHz (908 MHz effective), but the memory runs at 425 MHz (850 MHz effective) as per NVIDIA's specifications. The GPU runs at 400 MHz, which also matches the frequency recommended by NVIDIA.
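The arithmetic behind these figures is easy to check. The sketch below converts an access time into the clock the chips are rated for, estimates the headroom over the actual memory clock, and derives the peak memory bandwidth from the bus width. The function names are ours, and the bandwidth figure is a theoretical peak computed from the numbers above, not a measured value.

```python
def rated_clock_mhz(access_time_ns: float) -> float:
    """Clock a memory chip is rated for: one cycle per access time."""
    return 1000.0 / access_time_ns


def bandwidth_gb_s(bus_width_bits: int, effective_mhz: float) -> float:
    """Theoretical peak bandwidth in GB/s (10**9 bytes per second)."""
    return bus_width_bits / 8 * effective_mhz * 1e6 / 1e9


rated = rated_clock_mhz(2.2)                # about 454 MHz for 2.2 ns chips
actual = 425.0                              # memory clock of the card, MHz
headroom = (rated - actual) / actual * 100  # margin left by the rating, percent

print(f"rated clock: {rated:.1f} MHz")
print(f"headroom over {actual:.0f} MHz: {headroom:.1f}%")
print(f"peak bandwidth: {bandwidth_gb_s(256, 2 * actual):.1f} GB/s")
```

With a 256-bit bus and an 850 MHz effective rate this gives a peak of about 27.2 GB/s, and the 2.2 ns rating leaves roughly 7% of nominal headroom above 425 MHz.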
The cooling of the ASUS V9950 is done quite well, and during the tests there were no complaints about overheating. Even overclocked, the card ran properly despite the aggressive settings. The cooling system is a single structure made up of a fairly massive copper heatsink that covers both the GPU and the memory chips (with recesses for the memory chips, which improves their contact with the heatsink) and two fans that blow over both the GPU and the memory chips. The fan blades glow under ultraviolet light, so the ASUS V9950 is a real treat for a modding enthusiast =).
On top of it all, despite the bulk and seeming awkwardness of the cooling system, the adjacent PCI slot is not blocked. Still, it is better not to install any device into the first PCI slot: that would be an ordeal both for the device and for the video card, since the airflow would be restricted. Among the advantages of this cooling system is its very low noise level (nothing like the Flow FX =) ), which is hard to make out behind the noise of the processor cooler and the hard disk.
The box with an impressive and smart drawing included the following:
What we can note is that ATI's traditional (good old) partners are turning over a new leaf and starting to ship their products with really rich bundles.
The board's design is a clone of ATI's reference board. The PCB is designed as per ATI's requirements and no differences are seen.


The board has ATI's traditional bright red PCB, 128 MB of DDR memory, an AGP 2x/4x/8x interface, and a standard set of outputs: one analog, one digital, and one S-Video. The good old two-phase Semtech SC1175CSW is used as the voltage regulator.
The 128 MB of DDR memory comes in 8 BGA-packaged chips (4 on each side of the board, front and rear) on a 256-bit memory bus. The memory is produced by Hynix (HYB25D128323C-3.0) and has a 3.0 ns access time, which corresponds to approximately 333 MHz (666 MHz effective), but it runs at its specified frequency of 290 MHz (580 MHz effective). That is, there is a small overclocking margin for the memory. The graphics chip also runs at its specified 325 MHz.

There is no cooling for the memory chips at all. The graphics processor is cooled by a low-profile cooler that can hardly be regarded as effective enough: a small standard reference fan is fitted on the heatsink. Nevertheless, in the nominal mode no stability problems were found during a long 3D testing session, although the heatsink got very hot.
Test configuration:
| Motherboard: | JetWay S446 (SiS 645) |
| Processor: | P4 Northwood 1.6A@2.13A GHz (133x16) |
| Memory: | 256 MB Hynix PC2100 DDR SDRAM (CL=2) |
| HDD: | Maxtor DiamondMax Plus 8 40 GB |
| Video cards: | ASUS V9950 128 MB (NVIDIA GeForce FX 5900); Sapphire Atlantis Radeon 9800 128 MB (ATI Radeon 9800) |
| OS: | Microsoft Windows XP SP1 ENG, DirectX 9.0b |
| Driver: | NVIDIA: Detonator 45.23 WHQL and ForceWare 52.16; ATI: Catalyst 3.9 |
We removed all the decorative "niceties" from the Windows GUI and set the operating system to maximum performance.
Vsync was forcibly disabled via the drivers in both OpenGL and Direct3D applications. S3TC texture compression was also disabled.
Test software:
Since our previous test, we have radically revised our set of synthetic tests. We no longer rely on the already outdated DirectX 8.1 package MadOnion 3DMark2001SE for detailed results, and to assess the speed of DirectX 8.1 shader programs (shader versions 1.1 and 1.4) we kept the already customary Codecreatures benchmark. In choosing DirectX 9.0 benchmarks we focused on synthetic programs, so we have two newcomers:
• ShaderMark v2.0 (DirectX 9 HLSL, a benchmark for pixel shaders);
• D3D RightMark 1.0.2.7. (Public Beta 1) (a comprehensive DirectX 9.0 synthetic benchmark).
A detailed analysis of the data produced by each benchmark is given below as we proceed with the tests of the video cards.
All the video cards were run in the benchmark in "Anti-Detect Mode". Note that NVIDIA's GeForce FX 5900 was unable to pass all the tests in this mode, which the benchmark honestly reported. The ATI Radeon 9800 board, on the other hand, had no problems: every shader offered by ShaderMark v2.0 ran on it without issues.

So what can be said about the results? The NVIDIA chip was literally crushed by the ATI Radeon 9800. The graphs, which show a 2-3-fold lead for ATI, are self-explanatory: not one of the shaders (!) offered by the program ran faster on the NVIDIA GeForce FX 5900 than on the ATI Radeon 9800. That is what pure HLSL means for NVIDIA chips. To NVIDIA's credit, the new ForceWare 52.16 driver shows higher performance than its predecessor, Detonator 45.23, but the gain is scanty and does not change the alignment of forces against the ATI Radeon 9800. Something else matters to us here: it is telling that ForceWare 52.16 does give a performance boost specifically on HLSL code. This suggests that the optimizations in NVIDIA's new driver really work on the code and produce results, even if not as significant as we would like.
This is also a new benchmark in our set of synthetic applications, and in our view it allows assessing the performance of the accelerator's subsystems in a very effective and, importantly, impartial way. All the tests were run at 1024x768. We did not run every test under every possible setting: such a huge number of results is unlikely to give our readers more information and is more likely to bury them in a heap of diagrams =).
Geometry Processing Speed

This test assesses the speed at which the accelerator processes geometry. We used the most advanced mode, with three diffuse-specular light sources, in three different operating modes: traditional TCL (fixed-function pipeline), vertex shaders 1.1 with pixel shaders 1.1, and vertex shaders 2.0 with pixel shaders 2.0.
As we can see, with traditional TCL the NVIDIA card leaves the ATI Radeon 9800 well behind. Remarkably, the new ForceWare 52.16 driver brings a substantial gain here; evidently, the shader responsible for TCL emulation was optimized. But things turn really sad when version 1.1 and 2.0 shaders are used. The performance of the NVIDIA chip drops sharply, and with version 2.0 shaders the new driver brings no gain at all. ATI's card, on the other hand, keeps a stiff upper lip and shows identical performance with version 1.1 and version 2.0 pixel and vertex programs.
Pixel Filling
This test performs a number of different tasks, but we were mostly interested in measuring frame buffer fill performance.
As we can see, ATI's chip is faster. NVIDIA's new ForceWare driver improves the situation quite substantially, but it still fails to reach the level of the ATI Radeon 9800.
Pixel Shading
This test in the D3D RightMark package estimates the performance of various version 2.0 pixel shaders. The geometry in this test is substantially simplified to minimize the dependence of the results on the chip's geometry performance and to test the pixel pipelines only.
As we can see, the ATI Radeon 9800 beats the NVIDIA GeForce FX 5900 by a factor of three. The reworked compiler in NVIDIA's new driver gives a gain, but it still does not bring the NVIDIA chip close to the Radeon 9800. Shaders written in HLSL are a really hard nut to crack for the NVIDIA chip; this looks like an axiom that is not going to be shaken.
Point Sprites
This test measures how fast the accelerator displays point sprites. In the test settings, we used 2 diffuse light sources.
The ATI chip takes the lead again, although the alignment of forces is closer to that in the fill-rate and geometry tests than in the pixel shader 2.0 tests, which is logical since this test depends directly on those two parameters.
Hidden Surface Removal
This test estimates how efficiently the accelerator removes hidden points and primitives.

Hidden surface removal works substantially faster on the ATI Radeon 9800 than on the NVIDIA GeForce FX 5900, which should show in real-world applications.
The 3DMark2001SE benchmark is already rather old, but it appears in all our roundups as an honorable veteran =). Moreover, DirectX 8.1 games are very popular these days, so the results of this benchmark can partly be used to estimate the potential performance of the boards in modern games.

As for the test results, the overall winner at all resolutions is the Sapphire board built on the ATI Radeon 9800 chip. We stopped publishing detailed results of this test in favor of the other synthetic benchmarks above, which, in our opinion, give a more impartial view of the boards' real performance.


With version 330 of 3DMark 2003, the NVIDIA GeForce FX 5900 beats the ATI Radeon 9800 at all resolutions. That is a bit strange, considering that NVIDIA lost all the shader tests in the other synthetic benchmarks. But they do know how to get the "right" results at NVIDIA, don't they? =)
Telling here are the results produced with FutureMark's latest patch, version 340, which, along with the company's new concept of producing and interpreting test results, should revive the cracked (to put it mildly =) ) credibility of 3DMark 2003. Why "should"? We'll explain below.
We install the new patch and run the tests: the ATI Radeon 9800 leads with practically the same score as before, while the NVIDIA GeForce FX 5900 with ForceWare 52.16 is now the loser, having lost many points compared with the version 330 patch (the data for Detonator 45.23 were unfortunately not obtained for technical reasons, and there was not much sense in getting them anyway). An amusing but predictable situation. Soon after that, an unofficial release of the ForceWare drivers appeared online that "fixes" precisely 3DMark 2003's persistence in showing the correct result with the new FutureMark patch. This driver has not yet acquired "officially approved" status from FutureMark, but we are more than confident that WHQL certification is at hand, followed by the "approval".


In the quite outdated Codecreatures benchmark, NVIDIA beats the ATI Radeon 9800 at all resolutions. The new ForceWare 52.16 driver brought almost no performance boost.
From synthetic applications we now move on to the performance of the graphics boards in real games. As with the synthetic tests, the set of benchmarking applications has undergone some changes. First of all, this concerns the beta version of Half-Life 2 that so conveniently "leaked online", whose test results, in our opinion, should be very interesting for our readers. We have also added a nice benchmark based on the engine of the soon-to-be-released Final Fantasy XI.

To start with, it is curious to look at the results from applications that do not make active use of shaders, Unreal Tournament 2003 among them. At low resolution with Detonator 45.23, the ASUS board built on the NVIDIA GeForce FX 5900 lags behind the Sapphire board built on the ATI Radeon 9800. Installing the ForceWare 52.16 driver pays off, and the boards draw level. At higher resolutions, NVIDIA's board leads with both driver versions. The performance boost from the new driver is much smaller at high resolution than at low resolution.

In this benchmark, even without the new driver, the NVIDIA GeForce FX 5900 traditionally beats its ATI counterpart. The point is that the benchmark uses a huge number of stencil shadows, which on NVIDIA boards require fewer passes, and does NOT use pixel and vertex programs, which creates an ideal environment for NVIDIA chips. The performance boost from the driver with the revised compiler is quite noticeable and widens the gap over the ATI counterpart.

This newer, revised version of the Unreal Tournament 2003 engine, used in the game Unreal II: The Awakening, demonstrates the leadership of ATI's board, most likely because of the more complex geometry in the game. The new driver does not reverse the situation but noticeably narrows the gap, at least at the most "playable" resolution to date. At higher resolutions, ATI's board built on the Radeon 9800 is out of reach for the NVIDIA GPU.

This is a new test in our set of benchmarks. As far as we know, the game engine uses neither pixel nor vertex programs of any version. But since there is no official information on that, our doubts and guesses remain unresolved, and we are left wondering why the ATI Radeon 9800 board beat the NVIDIA GeForce FX 5900 =), even though ForceWare 52.16 added quite a few points to the ASUS V9950 board's score in this test.

This is a very telling shader benchmark, which has fed a lot of talk about optimizations from both NVIDIA and (!) ATI.
The results themselves create quite a dramatic situation for the ATI board. With Detonator 45.23, the ASUS V9950 based on the NVIDIA GeForce FX 5900 loses to its Canadian counterpart, but installing the latest ForceWare 52.16 driver gives NVIDIA's offspring a substantial performance boost, more than enough to leave ATI well behind.
We also present screenshots with the detailed results of this test so that you can analyze the results for each of the video cards tested.
| FX 5900 - Detonator 45.23 | Radeon 9800 - Catalyst 3.9 | FX 5900 - ForceWare 52.16 |
It is interesting to note that one of the most demanding tests in the package, "Large Scale Vegetation Rendering", practically did not respond to the change of driver, nor did the "Massive Overdraw" test. The tests most sensitive to the driver with the revised compiler were "Dynamic Occlusion Culling", "Masked Environment Mapping" and "Large Scale Terrain Rendering", which showed a performance boost of 22% on average.


In this pseudo-DirectX 9.0 benchmark, which uses version 2.0 vertex programs along with version 1.1 pixel programs, the alignment of forces is not that straightforward. At low resolutions, the ATI board outperforms the NVIDIA GeForce FX 5900 with Detonator 45.23 by a small margin in both game tests, but ForceWare 52.16 improves the situation. At high resolution, on the other hand, the NVIDIA GeForce FX 5900 outperforms the ATI Radeon 9800 with both the old and the new driver. Note that the performance boost from the driver change is negligible, which is a bit strange: the synthetic tests showed practically identical gains for version 1.1 and version 2.0 pixel programs, so the small gain from moving from Detonator 45.23 to ForceWare 52.16 cannot be explained by the benchmark using version 1.1 pixel programs rather than 2.0.


We were interested in the results of this regular DirectX 9.0 game in our list of benchmarks solely because it allows forcing any version of pixel and vertex programs in the game.
On the whole, the situation for the ATI board is as dramatic as in AquaMark 3. With Detonator 45.23, the GeForce FX 5900 loses to the ATI Radeon 9800, but ForceWare 52.16 comes to the rescue, and the situation changes radically with both version 1.1 and version 2.0 programs.
Another fact is also remarkable. If we look at the absolute FPS values for the ATI Radeon 9800 with version 2.0 and version 1.1 pixel and vertex programs, we see that the performance drop in going from version 1.1 to version 2.0 is quite insignificant. With the NVIDIA GeForce FX 5900 on Detonator 45.23, the drop is there and it is quite substantial. But with ForceWare 52.16, the GeForce FX 5900 shows exactly the same drop (in percentage terms) between version 1.1 and version 2.0 as the ATI Radeon 9800, which again points to the excellent job the programmers at NVIDIA have done. In fairness, though, we did not notice the slightest difference in image quality between the two versions of pixel and vertex programs.
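The comparison "in percentage terms" used above can be written down as a short sketch. The frame rates in it are hypothetical placeholders chosen only to show the calculation, not our measurements; the actual figures are in the diagrams.

```python
def relative_drop(fps_v11: float, fps_v20: float) -> float:
    """Percentage of frame rate lost when switching from version 1.1
    to version 2.0 pixel and vertex programs."""
    return (fps_v11 - fps_v20) / fps_v11 * 100.0


# Hypothetical (fps with version 1.1, fps with version 2.0) pairs,
# NOT measurements from this review.
results = {
    "Radeon 9800 / Catalyst 3.9": (40.0, 38.0),
    "FX 5900 / Detonator 45.23": (36.0, 27.0),
    "FX 5900 / ForceWare 52.16": (44.0, 41.8),
}

for setup, (v11, v20) in results.items():
    print(f"{setup}: {relative_drop(v11, v20):.1f}% drop")
```

Comparing relative drops rather than absolute frame rates separates the cost of the shader version itself from the overall speed of each card, which is why two cards with different absolute results can still show the same drop.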


I think it is no overstatement to say that we EXPECTED a beta, alpha, or whatever build of Half-Life 2 to leak to the Internet =)). It is not our business to comment on how the beta leaked. We are more interested in having a real DirectX 9.0 application that uses the potential of the API to the full and is in fact a preview of future DirectX 9.0 games. Commenting on a very raw beta is a thankless job, since everything may (and most likely will) change radically in the final release, but let's get down to it anyway =).
The Half-Life 2 engine is pure HLSL, which does not bode well for NVIDIA video cards. As our tests with two demo benchmarks showed (special personal thanks to Andrey Vorobyov, who kindly provided the demos for the tests), the NVIDIA card was simply crushed by its ATI counterpart. Although ForceWare 52.16 gives a large performance boost, it does not change the picture at all.


As is easy to see, antialiasing is an easy job for NVIDIA chips in both Direct3D and OpenGL applications. The new ForceWare 52.16 merely underlines that.
Image quality: Anisotropic Filtering 8x


The AF speed in Direct3D (Unreal Tournament 2003) is about the same for the ATI Radeon 9800 and the NVIDIA GeForce FX 5900 with Detonator 45.23. But we got a rather strange result with ForceWare 52.16: the performance dropped. Unfortunately, we had no chance to repeat the test, so this incident remains unexplained. In OpenGL (Return to Castle Wolfenstein), NVIDIA boards are the traditional leaders.
Image quality: AntiAliasing 4x + Anisotropic Filtering 8x


All in all, we record a sure victory for NVIDIA.
Image quality: AntiAliasing 6x/8x + Anisotropic Filtering 8x/16x


In addition to the traditional image quality tests, we decided to add a so-called "maximum quality" mode in which the maximum possible AF and AA levels were used. For ATI chips, the maximum AF level was 16x and the maximum AA level 6x. For NVIDIA chips, it was 8x for both.
Let's look at the results. In Unreal Tournament 2003, NVIDIA is the sure leader, while in Return to Castle Wolfenstein ATI takes an equally clear victory. We also cannot help noticing the absolute FPS values in the games: they remain at a playable level.
The release of the new ForceWare 52.16 driver has substantially changed the alignment of forces among high-end boards as well as in the mid-range and low-end market niches. The ASUS V9950 (NVIDIA GeForce FX 5900) graphics board reviewed today is the most telling example.
Synthetic benchmarks that make active use of pixel and vertex shaders unanimously report a substantial rise in shader processing speed for the NVIDIA GeForce FX 5900 (as well as for all boards of the GeForce FX family). The rise is especially noticeable with version 2.0 shaders, which have always been a bottleneck for boards built on NVIDIA GeForce FX chips. Nevertheless, in synthetic benchmarks GeForce FX 5900 boards cannot compete on equal terms with the ATI Radeon 9800 even with ForceWare 52.16 installed. This alignment of forces arises primarily because the synthetic benchmarks we used are built on Microsoft HLSL (High Level Shader Language), which ATI boards, as we mentioned in the theoretical part of our review, handle much more efficiently than NVIDIA boards, for which the ideal option is shader code written specifically for the GeForce FX architecture. Boards built on GeForce FX chips handle standard DirectX 9.0 code much worse than ATI Radeon boards do, and the new NVIDIA driver does not change the situation radically but merely narrows the gap slightly.
In real-life applications, as our tests showed, the situation for NVIDIA is much more favorable with the ForceWare 52.16 driver. Sometimes merely installing the new driver allowed the NVIDIA GeForce FX 5900 board to take the crown, leaving the ATI Radeon 9800 somewhere between the GeForce FX 5900's results with Detonator 45.23 and with ForceWare 52.16. For obvious reasons, shader applications are the most telling here. But we would like to note that the wide popularity of the benchmarks and games used as benchmarks in our research played a leading part in this.
What matters here is NVIDIA's program of cooperation with game developers (dubbed "The Way it's Meant to be Played"), which is aimed at working closely with developers to make game engines easier to optimize for the architecture of NVIDIA cards. Is that good or not? There is no single answer, and there cannot be one. The end user or gamer who is not into the details does not care at all how a video card manufacturer attains maximum performance. By and large, it does not matter whether it is achieved through tricky optimizations or whether high performance is inherent in the GPU architecture. What really matters is image quality. But here another problem comes up: no matter how energetically NVIDIA works with game developers, the company cannot reach ALL of them, and the rest will resort to writing HLSL code, which, as real games have shown a number of times, executes faster on ATI boards. That is why, in our view, NVIDIA chose a somewhat mistaken policy in this case. There are many examples of that. Take, for instance, Half-Life 2, in which NVIDIA boards demonstrate appallingly low performance and rank on par with ATI's mid-range solutions. Needless to say, Half-Life 2 is, without overstatement, a blueprint for future DirectX 9.0 games, and NVIDIA worked in quite close contact with Valve to optimize the game for the GeForce FX architecture. Of course, we cannot say to what extent the leaked beta version is optimized for NVIDIA boards or whether it is optimized at all, but it is a fact that the GeForce FX 5900 fails ignominiously in all the tests based on the Half-Life 2 beta and lags far behind its ATI rival.
It is also a fact that optimizing games for NVIDIA video cards is in practice a very troublesome job, which once again confirms the earlier conclusion that NVIDIA cannot artificially tune the performance of its cards to "ATI's level".
With the release of new games that make increasingly intensive use of version 2.0 pixel and vertex shaders, NVIDIA boards based on GeForce FX chips will look ever less convincing next to ATI boards (the examples are numerous). In our view, the soundest decision for NVIDIA would be to release a new chip with an architecture pre-optimized for Microsoft HLSL (the issue of floating-point precision still remains open). For now, we can say for sure that all owners of NVIDIA GeForce FX video cards should install the new ForceWare 52.16 driver (for analysis of newer driver versions, including unofficial ones, see our future reviews), since the driver optimizes the compiler itself and does not make optimizations for particular applications, which is confirmed not only by games but by the synthetic benchmarks as well.