| 3DNews Vendor Reference English Resource - All you need to know about your products! |
AMD Radeon HD 3800 Series. Tests of the "low-end" HD3850
Date: 04/12/2007
By: Dmitry Sofronov, Vladimir Romanchenko
Almost half a year has passed since the announcement of R600. To be precise, it has been half a year and one day, and AMD has officially announced a new generation of 3D graphics, the ATI Radeon HD 3800 family. The time for shuffling rumors about RV670XT chips is over, and it is high time we got acquainted with official first-hand information. Moreover, today we present not only the details of the new Radeon HD 3800 architecture and its comparison with the previous Radeon HD 2000 generation, but also an estimate of how a real specimen based on the new chips performs in real applications.
To date, the new Radeon HD 3800 family is represented by two new processors, Radeon HD 3870 and Radeon HD 3850. Don't be confused by the unusual numeric indices of the chips. Starting with this GPU generation, AMD is migrating to a new naming principle: the first digit in the index stands for the generation, the second for the processor family, and the remaining two indicate the performance of the chip. The higher the performance, the greater the two-digit number.
Therefore, the first digit in the name Radeon HD 3870, for instance, stands for the third generation of the Unified Shader architecture (the second generation is the ATI Radeon HD 2900 chips, while the first generation was implemented in the Xenos chip for the Xbox 360). The second digit stands for the family (we can assume that if there is an "8", then a "5", "6", or "3" may appear as well, with memory buses of various widths and/or "cut-down" versions of the core), and the last two digits denote variations within the family.
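The naming scheme described above is simple enough to express in a few lines. The following is only a minimal sketch: the `RadeonModel` class and `parse_model` function are illustrative names invented here, not anything published by AMD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RadeonModel:
    generation: int   # first digit of the index
    family: int       # second digit of the index
    performance: int  # last two digits; higher means faster within the family


def parse_model(index: str) -> RadeonModel:
    """Split a four-digit index such as "3870" according to the scheme above."""
    if len(index) != 4 or not index.isdigit():
        raise ValueError(f"expected a four-digit index, got {index!r}")
    return RadeonModel(int(index[0]), int(index[1]), int(index[2:]))


for name in ("3870", "3850"):
    print(name, parse_model(name))
```

Both cards decode to generation 3 and family 8, differing only in the performance field (70 versus 50).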
And what have the development engineers done? In brief, the key innovations that make the new generation of Radeon HD 3800 chips stand out from the previous-generation R600 are a migration to the finer 55-nm process technology with a simultaneous reduction in the overall number of transistors, and the emergence of support for DirectX 10.1, the PCI Express 2.0 bus, and ATI CrossFireX configurations. In addition, the new chips offer a substantially reworked Unified Video Decoder as part of ATI Avivo HD, as well as the ATI PowerPlay power-saving technology, implemented for the first time in desktop PCs. That is really in brief. To make the key innovations of the Radeon HD 3800 series and its distinctions from the predecessors clearer, we present them in a comparative table.
We can now move directly to a detailed examination of what is new in the Radeon HD 3800 family. First, note that the key architectural specifications of the RV670 graphics processors remain essentially unchanged. The graphics core is built on the same unified shader architecture as R600. It is based on a dispatch processor that distributes threads among 64 superscalar streaming processors, each having 5 independent stream processing units and a branching unit, which altogether makes 320 unified shader processors. These 5-component SPUs (Streaming Processing Units) can process up to 5 Multiply-Add instructions per cycle, and one of the five units can also process more advanced instructions (SIN/COS/LOG). The load of the shader processors with vertex, geometry, and pixel operations is balanced dynamically and automatically by the hardware scheduler.
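The unit count above, and the theoretical throughput it implies, can be reproduced with simple arithmetic. This is a rough sketch under stated assumptions: a Multiply-Add is counted as two floating-point operations (a common convention, not something the review states), every unit is assumed to issue one MADD per cycle, and the 670 MHz clock is the HD3850 figure quoted later in this review.

```python
STREAMING_PROCESSORS = 64  # superscalar streaming processors fed by the dispatch processor
UNITS_PER_PROCESSOR = 5    # stream processing units in each of them
FLOPS_PER_MADD = 2         # assumption: one multiply-add counts as two floating-point operations


def shader_unit_count() -> int:
    """Total number of unified shader processors."""
    return STREAMING_PROCESSORS * UNITS_PER_PROCESSOR


def peak_madd_gflops(core_clock_mhz: float) -> float:
    """Theoretical peak in GFLOPS, assuming every unit issues one MADD per cycle."""
    return shader_unit_count() * FLOPS_PER_MADD * core_clock_mhz / 1000.0


print(shader_unit_count())
print(peak_madd_gflops(670))  # HD3850 nominal core clock
```

This is an upper bound only; real shaders rarely keep all five units of every processor busy on every cycle.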
The analogy with R600 is complete and extends to the texture units, which are in charge of fetching texture and vertex data: there are again four of them, each having four fetching units and four addressing units, making 16 altogether. The number of ROPs is the same as well, also 16. The organization of the two-tier L1 and L2 caches is analogous too. For details of the fundamentals of the architecture, read our article Radeon HD2900XT – hopes that haven't come true, and fundamental innovations; here we go on to the current innovations. To finish with the ATI Radeon HD 3800 core, we have to note that its per-clock performance is declared to be equivalent to that of ATI Radeon HD 2900.
It is worth noting separately that the new core offers a 256-bit memory interface with an internal 512-bit ring bus. Indeed, the previous core had a memory interface twice as wide (512-bit) with an internal 1024-bit ring bus, but this reduction in the internal and external performance of the memory controller has been made up for in RV670 by other changes. The 8-channel 64-bit memory organization is unavailable on this chip, but instead of a 4-channel 64-bit mode it uses an efficient 8-channel 32-bit mode which, combined with higher memory clock speeds in the higher-end card of the new family and improved arbitration logic in the memory controller, more than compensates.
Here we would also like to recall the reduction in the number of transistors: 666 mln in the new RV670 core versus 700 mln in R600. That is about 34 mln fewer, with the functionality preserved and even a number of new features implemented. During the tests, we even suggested that engineers at AMD had decided to get rid of the hardware tessellation unit within the chip.
Basically, giving up hardware tessellation would be justified at this stage, since it makes sense only if game developers support it and if it is listed as a mandatory requirement of Microsoft DirectX. However, the new core does have a programmable tessellation module, so the question of how the number of transistors was reduced while functionality visibly grew remains open.
At least, engineers at AMD assert that these are the consequences of a successful redesign of the architecture. Indeed, in the half a year that has passed since the announcement of R600, engineers at AMD have spent the time efficiently on eliminating architectural bottlenecks, and not only that. Among the key changes, we have already mentioned the migration to the 55-nm process technology. Even if there were no other changes, the move from the previous 80-nm process would be a good step towards reducing the power consumption of the new GPU. AMD also emphasizes that the PowerPlay power-saving technology has been implemented in the RV670 core, but we can regard it as new only for "desktop" graphics chips, since it made its debut earlier in a number of solutions for mobile PCs, like ATI Mobility Radeon HD 2300, HD 2400, and HD 2600. Recall that ATI PowerPlay reduces the clock speed as the load goes down and even "powers off" temporarily idle elements of the chip. Therefore, the substantial reduction in the TDP of Radeon HD 3800 family chips compared with their predecessors should be credited to the migration to the new 55-nm process at TSMC, with its successful fight against leakage currents, as well as to the redesign of the chips and the addition of all the power-saving capabilities to the architecture.
Among the new features of the ATI Radeon HD 3800 family, we should mention support for DirectX 10.1 and, with it, Shader Model 4.1. This first and fairly substantial update to DirectX 10 is anticipated with the arrival of Windows Vista SP1 in about the first quarter of 2008. The list of additional capabilities in DirectX 10.1 is fairly long: there will be advanced features for programming, lighting, anti-aliasing, etc. Video cards based on Radeon HD 3800 are ready for these features, and it is now the turn of the game and application developers.
Another important innovation, which for most of us is still only in the plans, is the migration to the PCI Express 2.0 bus. We should assume that demand for video cards offering the advanced functionality will arise with the arrival of new motherboards that support PCI Express 2.0, in complete analogy with the migration from AGP 8x (which is still going on). In this respect, Radeon HD 3800 chips are ready for such a migration.
More relevant today is the implementation of the more advanced ATI Avivo HD technology with the Unified Video Decoder (UVD) in Radeon HD 3800 family chips. The UVD/UVD+ unit integrated in Radeon HD 3800 chips provides full-featured hardware decoding of HD signals in the H.264 and VC-1 formats (VC-1 was already decoded by the previous UVD version), which matters most for offloading the CPU while playing back modern HD DVD/Blu-ray video.
Along with other traditional output capabilities, video cards based on the new chips support HDMI and HDCP, with full-fledged support for high-resolution displays: protected content can be played on screens up to 2560x1600. Finally, the ATI CrossFireX technology supports two, three, or four video cards of the Radeon HD 3800 family on a single motherboard and up to 8 monitors, with coordinated control of performance, Extended Desktop mode, and other interesting features. As an example of operating 8 monitors, we present a demo video clip recorded from Microsoft Flight Simulator.
To be fair, we should note that it makes sense to talk about ATI CrossFireX in more detail only once all the platform components that support it have appeared. Nevertheless, we should already point to such interesting new features as unlocking the GPU with ATI Catalyst; protection modes and rollback to safe modes; manual setting of clock speeds for the core and the memory; and the auto-configuration feature that enables "safe" overclocking within stable limits. By the way, configuring ATI CrossFireX with ATI Catalyst 7.10 allows handling both DirectX 9 and DirectX 10 applications in the "Compatible AFR" mode.
This is about all the theoretical information we wanted to share before moving on to practical tests of a real video card in real applications. Apart from the re-branding of ATI logos announced today, we should mention the new name Radeon HD 3870 X2, which was mentioned in passing in the new specifications. That lets us hope for new dual-chip graphics cards to appear in the coming year.
As a final touch to this introduction, we present a comparative table of specifications for video cards built on the new chips, deliveries of which were promised to start today.
Radeon HD3850 offers a compact cooling system and takes up one standard slot. The card is equipped with two Dual-Link DVI connectors, each supporting resolutions up to 2560×1600. The video card that arrived at our test lab is an engineering sample, so it came without a box or accessories; only a DVI/HDMI adapter was included in the bundle.
The cooler hides most of the PCB, but we can already note a few things. The needle-shaped heatsink on the right is meant to cool the components of the power supply subsystem. Inside the cooler housing, on the side of the needle-shaped heatsink, there are no openings, since it is cooled passively. However, given the modest power consumption of Radeon HD3850, that is quite sufficient. The central part of the heatsink contains a great number of cooling fins. Since the cooler has a single-slot design, the heated air is not exhausted outside the case but is expelled towards the DVI connectors, that is, from right to left, or upwards.
On the reverse side of the PCB, there are numerous small components, but there are no memory chips, nor even pads for them.
Let me explain why. It turns out that all eight memory chips are on the front side of the PCB, positioned at an angle around the GPU. Evidently, this design was chosen to unify the HD3850/70 series products and reduce their production cost. Since both video cards are based on the same GPU, the desire of engineers at AMD to use a single PCB for both products looks logical. Positioning the memory chips on the same side of the PCB allows them to be cooled effectively by a regular cooler, and the video memory capacity can be increased by using memory chips of greater density. As already stated, the power consumption of Radeon HD3850/70 is relatively small, so the power supply subsystem of the card looks rather modest.
The stock cooler for HD3850 looks pretty simple. The heatsink is a solid metal piece and, contrary to the usual practice, it has no separate central part for the GPU. Nor are there any heat pipes. The memory chips and the components of the power supply subsystem are cooled through thermal pads. As we found out later, the heatsink, despite its color, is made of an aluminum alloy (I scratched it to check). As for the fan, it is absolutely noiseless in the nominal mode since its rotational speed is very low. The noise becomes noticeable once the fan speed is set to about 50% of the maximum.
The RV670 video processor core stands out for its moderate dimensions despite the impressive number of transistors. The fine process technology makes itself felt. The nominal and recommended GPU operating frequency on HD3850 is 670 MHz. The chip was made in the 39th week of 2007 (late September).
The overall capacity of the video memory on board HD3850 is 256 MB, made up of eight GDDR3 chips by Samsung. The access time is 1.1 ns, which is equivalent to a nominal frequency of about 1800 MHz DDR. However, the nominal operating frequency of the video memory is lower, at 1660 MHz DDR. As you can see, there is a good margin for overclocking, especially since memory like this can easily operate at 2000 MHz DDR. Now we'll find out how the performance of HD3850 and its overclocking capability meet our expectations.
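The memory figures above can be checked with a little arithmetic. The sketch below converts the 1.1 ns access time into a rated DDR frequency and estimates peak bandwidth over the 256-bit bus described earlier. The function names are illustrative, and bandwidth is counted in decimal gigabytes (10^9 bytes per second), which is an assumed unit convention rather than something stated in the review.

```python
BUS_WIDTH_BITS = 256  # RV670 memory interface width, as stated in the review


def rated_ddr_mhz(access_time_ns: float) -> float:
    """Rated effective (DDR) frequency for a given access time.

    One clock period equals the access time, and DDR transfers data twice per clock.
    """
    real_clock_mhz = 1000.0 / access_time_ns
    return 2.0 * real_clock_mhz


def bandwidth_gb_s(effective_mhz: float, bus_width_bits: int = BUS_WIDTH_BITS) -> float:
    """Peak bandwidth in GB/s (10^9 bytes per second, an assumed unit convention)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_mhz / 1000.0


print(f"rated: {rated_ddr_mhz(1.1):.0f} MHz DDR")
for label, mhz in (("nominal", 1660), ("overclocked", 2000)):
    print(f"{label}: {mhz} MHz DDR -> {bandwidth_gb_s(mhz):.2f} GB/s")
```

The exact rated figure comes out at roughly 1818 MHz DDR, which the text rounds to 1800, and the peak bandwidth works out to about 53 GB/s at the nominal memory frequency and 64 GB/s at 2000 MHz DDR.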
Preparing for the test
Prior to running the tests, here is some useful information about the specimen being tested.
Here you can see the screenshot from Catalyst Control Center with information on the driver version.
This is a screenshot of the "hardware" tab (not all the parameters fit on a single screen, so we had to use a graphics editor to present the information compactly). If we look closer, we can see an amusing incident: CCC reports that the video card is running on the PCI Express 2.0 bus, although we have never had one at our test lab. Here we can also see the nominal speeds of the GPU and the video memory.
The ATI Overdrive section displays the limit frequencies for the 3D mode that can be set for HD3850 using CCC. The GPU limit is set at 770 MHz, which matches the nominal frequency of the higher-end model, HD3870. The limit frequency for the video memory is set at 1117 MHz real (or 2234 MHz DDR), which is also very close to the nominal video memory frequency of the higher-end "sister". Isn't that an amusing coincidence? Here we can also see the temperature of the video card in the 2D mode, which is 60°C. In the 2D mode, the GPU frequency drops more than twofold, to merely 300 MHz. This is the power-saving technology in action. Basically, the temperature of the GPU under load can also be viewed using CCC, but it is more convenient to do so using the RivaTuner utility, especially since it is already "aware" of HD3850.
As you can see, under load the GPU temperature goes up to almost 90 degrees. That happens because the cooler does not increase its RPM and the video card remains absolutely noiseless. Perhaps this figure is acceptable to engineers at AMD, since it does not affect the stability of the video card at all.
Overclocking
Clearly, better cooling is advisable for better overclocking. Unfortunately, we did not have enough time to search for the maximum frequencies at a comfortable noise level, so we took a different approach. Using the RivaTuner utility, we set the fan speed to 100% and raised the frequencies to 770/2000 MHz. The video card worked fine at these frequencies, confirming its superb overclocking capability. Quite possibly even that is not the limit, but for now we will settle on these frequencies. Now let's see how things stand with performance.
We explore the performance of Radeon HD3850 in both old and new games. The "standard" set of games we use has been complemented with World in Conflict, Crysis Single Player Demo, and NFS Pro Street. And that is not the only change in the test procedure: where it was possible to measure the minimum FPS, we show it with the respective remarks.
As for the rivals of Radeon HD3850, we settled on two video cards, HD2900XT and 8800GT. We did not conduct more extensive tests, again because of the shortage of time, but we promise to fill the gap in our forthcoming reviews. The comparison with HD2900XT is interesting in terms of the effect of the memory bus width, which differs by a factor of two between these video cards. As we found earlier, HD2900XT is not always capable of making the most of its video memory bandwidth, but will cutting down the memory bus width affect HD3850? And how balanced will it turn out to be? The comparison with 8800GT is interesting for another reason. These cards are the two major new products of this autumn. Whatever AMD and NVIDIA do to position them in the eyes of buyers, they will certainly remain direct competitors. We have already seen the results of the overclocked 8800GT by Gigabyte, which is capable of competing even with 8800GTX. Today, we are comparing HD3850 with an 8800GT running at the nominal frequencies.
We ran the tests using the ForceWare 169.04 and Catalyst 7.10 drivers. The results for Radeon HD3850 are displayed on the diagrams in bright red. The results for Radeon HD3850 overclocked to 770/2000 MHz are displayed in orange, and the results for HD2900XT and 8800GT are in dark red and green, respectively.
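To put the overclocked configuration used in the diagrams into relative terms, the following small sketch computes the percentage gain over the nominal 670/1660 MHz frequencies. The dictionary keys and the `gain_percent` helper are illustrative names chosen here.

```python
NOMINAL = {"gpu_mhz": 670, "memory_mhz_ddr": 1660}      # stock HD3850 frequencies
OVERCLOCKED = {"gpu_mhz": 770, "memory_mhz_ddr": 2000}  # frequencies used in the overclocked runs


def gain_percent(nominal: float, overclocked: float) -> float:
    """Relative frequency increase in percent."""
    return (overclocked - nominal) / nominal * 100.0


for key in NOMINAL:
    print(f"{key}: +{gain_percent(NOMINAL[key], OVERCLOCKED[key]):.1f}%")
```

That is roughly a 15% increase for the GPU and a 20% increase for the memory, which is useful to keep in mind when judging how much of that headroom turns into extra FPS in the games below.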
3DMark
In 3DMark, at the nominal speeds HD3850 is on par with 8800GT but is inferior to HD2900XT. At the same time, overclocking does not allow HD3850 to overtake HD2900XT. Does the latter's advantage in memory bus width make itself felt? The lag of HD3850 behind HD2900XT at the rated frequencies is easy to predict. With a similar architecture, the newcomer has a slightly lower GPU frequency and, of course, half the memory bus width. So we will not comment on this fact further.
Elder Scrolls IV - Oblivion
In Oblivion, 8800GT takes a confident lead over both AMD video cards even at the rated frequencies. At the same time, the 256-bit memory bus is not a hindrance. Apparently, the cause is the game's sensitivity to the performance of the shader unit. This guess is indirectly confirmed by the fact that the overclocked HD3850 is superior to HD2900XT.
Need for Speed Carbon
Need for Speed Carbon is also very sensitive to the speed of shader execution, so 8800GT again leaves all the others well behind. At the same time, the results for HD3850 catch up with those for HD2900XT only on overclocking, and the minimum FPS of HD3850 remains a bit worse than that of HD2900XT. On the other hand, this lag is not fatal, and even at 1600×1200 we felt no discomfort during the game.
Serious Sam 2
In Serious Sam II, 8800GT is of course the leader in average FPS, but at low resolutions its minimum FPS is somehow worse than that of the AMD video cards. At 1600×1200 everything falls into place, except for the drop in minimum FPS on the overclocked HD3850.
Quake 4
In Quake 4, AMD's newcomer demonstrates superb results and, when overclocked, even leaves 8800GT behind, something that had not happened before.
Prey
However, in Prey, built on the same engine as Quake 4, the gap in results is in favor of 8800GT. When overclocked, HD3850 reaches the performance of HD2900XT but is unable to catch up with 8800GT.
F.E.A.R.
In F.E.A.R., AMD's newcomer shows a really decent result, especially given that it has merely 256 MB of video memory on board, while in these test modes the game demands much more. We believe this is exactly why the minimum FPS of HD3850 is much lower than that of the other participants in the tests.
Call of Juarez
In the Call of Juarez test, full-screen antialiasing did not work for some reason, despite all our efforts. So we present the results produced in the No AA/AF mode. HD3850 performs quite decently, although it falls somewhat behind as the resolution rises.
Now we move on to tests in new games.
The tests in the games below are more of an express test. The choice of the graphics modes in which we ran the tests was intuitive rather than based on precise calculations. However, even these results can be quite indicative. We are already preparing a detailed investigation of the system requirements and an analysis of the traits of the new games, but for now we are introducing you to the first results. These results should be regarded as preliminary for the simple reason that two of the three games are still at the beta stage.
World in Conflict
World in Conflict has an integrated performance test, and it is these results that are displayed in the diagrams below. Since the game is new and rich in graphics and objects, we did not select the maximum graphics quality mode but stopped at the “High” level in the graphics settings menu. In this mode, FSAA 2× is already enabled.
As you can see, although the settings are not at the maximum, the average FPS is not high. 8800GT and HD3850 are on par, with HD2900XT setting the pace.
With the selected graphics mode, the game is demanding of video memory capacity. Therefore, as the resolution rises, the performance ratio between 8800GT and HD2900XT remains unchanged, since both are equipped with 512 MB of video memory. But HD3850 starts lagging behind them substantially. Like it or not, 256 MB is a bit too little for modern games in quality modes.
NFS Pro Street Demo
We decided to include Need for Speed Pro Street Demo in the list of tests as the future successor to NFS Carbon, and also to add some variety to the shooters. All the graphics settings, except the full-screen antialiasing mode, were set to the maximum. We measured the FPS using the FRAPS utility. The duration of the test was two laps around the circuit. And here, for the first time, the overclocked HD3850 failed. Some time after the start of the game, the “VPU Recover” error message popped up. We did not manage to complete two full laps with the overclocked HD3850, so its results are missing from the diagrams.
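For readers who want to reproduce the average and minimum FPS figures from their own manual runs, here is a minimal sketch. It assumes the input is simply a list of frames rendered in each second of the run; that input format and the sample numbers are made up for illustration and are not FRAPS's actual log layout or results from this review.

```python
from typing import Sequence, Tuple


def summarize_fps(frames_per_second: Sequence[int]) -> Tuple[float, int]:
    """Return (average FPS, minimum FPS) for a run.

    frames_per_second holds the number of frames rendered in each second
    of the run (a hypothetical input format chosen for this sketch).
    """
    if not frames_per_second:
        raise ValueError("the run must contain at least one second of data")
    average = sum(frames_per_second) / len(frames_per_second)
    return average, min(frames_per_second)


# Made-up sample data, not measurements from this review.
sample_run = [42, 45, 39, 31, 44]
average, minimum = summarize_fps(sample_run)
print(f"average: {average:.1f} FPS, minimum: {minimum} FPS")
```

The minimum is the more telling number for playability, since a single slow second is felt in the game even when the average looks comfortable.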
Interestingly, HD3850 and HD2900XT show almost identical results. But 8800GT demonstrates an absolutely abnormal drop in performance that does not depend on the resolution. It is hard to tell the reason for such behavior. Perhaps NVIDIA has not yet optimized its drivers for this game. Maybe the game is a bit raw, since it is still a beta. As we remember, in NFS Carbon the NVIDIA video cards did not show their best at first either, but then everything evened out. So we attribute it all to the beta status of the game and move on to the most exciting part.
Crysis Single Player Demo
Here is how we ran the tests in Crysis. First, we took Crysis Single Player Demo, which was released in October and is the first level of the game. Second, we tested manually using the FRAPS utility. Since the game is quite hard even for powerful video adapters, we decided to switch the graphics settings to Medium. There isn't much practical sense in comparing 10 FPS versus 12 FPS, whereas at the medium graphics settings you can play quite comfortably. We'll be examining the system requirements for Crysis after the release of the official version, but for now we are reporting our run in this Single Player Demo.
This screenshot shows the view from the starting point of the test. On the hillock in the distance, there is a hostile hut with a road passing around it, opening a magnificent view of the sea, palms, and ships. Prior to saving at this point, we "cleared" all the enemies so as not to distract the tester from the run. Then everything is simple: keep your head straight, press the Shift key, and off you go. The run itself is very simple: from the starting point we run up to the hut, around it over the terrace, and then back to the starting point.
Surprisingly, 8800GT at first loses to the AMD video cards in this test. At the same time, HD3850 and HD2900XT run on par, and only as the resolution rises does HD3850 start giving in. Most likely, it is the shortage of video memory that makes itself felt. On the whole, the results can be called good, especially for mid-range products like HD3850 and 8800GT.
Effect of the memory bus
Finally, we present one more graph that shows the effect of the memory bus width in HD3850 and HD2900XT. The architecture of the video processors in these video cards is almost identical, and the same goes for the number of functional units (if we disregard the UVD unit, which is not related to 3D image rendering). Although the frequencies are different, we can make them equal. That's what we did. The methodology remained the same as in the review "512-bit video memory bus width for modern GPU". The GPU frequency on HD3850 was set to 740 MHz, and then we built a graph of FPS versus video memory frequency for Quake 4 at 1280×1024 in two video modes, No AA/AF and 4AA/16AF. We deliberately did not select the highest resolution so that the smaller video memory capacity of HD3850 would not affect the results. You can see the result below.
If we compare the results for HD3850 and HD2900XT in the same modes, there is of course a difference. The lower the video memory frequency, the more pronounced the difference. But at frequencies close to the nominal, the difference in performance between HD3850 and HD2900XT is minor because HD2900XT is unable to leverage the advantage of its 512-bit memory bus. In this sense, HD3850 appears to be more balanced. Judging by the graph, we can expect a performance gain from overclocking both the GPU and the video memory, which should please thrifty overclocking enthusiasts. As for the move from No AA/AF to 4AA/16AF, there are no great changes here. This test shows an almost twofold drop in results for both HD2900XT and HD3850. Actually, this looks logical given that there are no radical differences in the architecture.
Final Words
As this first introduction to HD3850 has shown, this series of products looks highly promising. The lower-end card that we have tested so far is striving to push the current residents of the mid-range sector into a lower price range. With merely a 128-bit memory bus, they will have a very hard time competing against the power of RV670 and its 256-bit memory bus. That applies to both HD2600Pro/XT and 8600GT/GTS. Certainly, all that holds true provided AMD and its partners are able to hold the declared recommended prices and fill the market with a sufficient number of these promising products.
Now let's imagine how an ideal RV670-based video card would look from the regular user's viewpoint. In reality, it would be something in between HD3850 and HD3870. Let me explain why. The higher-end model will be equipped with GDDR4 video memory running at 2.25 GHz. However, as we saw in the example of HD2600XT with various types of video memory, the higher frequency of GDDR4 is offset by longer latencies, so in the end the overall performance is almost the same. Let's take the risk of assuming that we'll see the same situation with HD3850/70. On the other hand, modern games are highly demanding of video memory capacity, so it makes sense to prefer video cards with 512 MB rather than 256 MB of memory. Furthermore, RV670 offers quite good performance and good overclocking capability, so on the ideal card we would like to see a powerful but quiet cooling system like that of HD3870. In the end, we get the following set: an RV670 GPU at about 800 MHz, 512 MB of GDDR3 video memory running at about 2000 MHz DDR, plus a powerful and quiet two-slot cooler. And the price should not be higher than that of HD3870. I wonder how soon such "hybrids" of HD3850/70 will go on sale. So, ladies and gentlemen, place your bets.