Quad-Core Opteron: architecture and roadmaps
A month ago, in mid-August, we reported the news of AMD's announcement
of the new generation of Opteron server processors and the company's
preliminary roadmap for the release of the first Quad-Core Opteron
chips. In fact, the first more or less reliable details of the
quad-core Opteron architecture started appearing much earlier. For
instance, as far back as Computex 2006 in June there was the very first
news that quad-core Opteron processors of stepping F would debut in
approximately the first quarter of 2007, built on the 65-nm and
probably also the 90-nm process technology. However, even after the
official announcement there was still neither full clarity nor much
detail on the architectural traits of Quad-Core Opteron chips.

That is why journalists were impatiently looking forward to AMD's
Moscow event scheduled for 5 September. This time, Pierre Brunswick,
vice president for sales and marketing in Russia and ex-USSR, Eastern
Europe and Turkey, was expected to take part in the press conference
together with Giuseppe Amato, technical director at the sales and
marketing department in Europe, Middle East, and Africa. A technical
expert is always a welcome guest here, and this time our expectations
came true. During the presentation Mr. Amato shared a lot of
interesting and exclusive details on AMD's new chips and initiatives
and answered a number of awkward questions, for which we are grateful
to him. On the whole, the event was held at a respectable technical
level, apart from the heavy marketing asides criticizing the major
competitor in almost every slide. That is why today we can tell you a
lot of new and interesting details, and it is up to you to filter out
the advertising.
During the presentation, Mr. Amato touched upon the following
key topics:
- AMD's next-generation Opteron processors based on the new
platform, with a view to further migration to the quad-core
architecture;
- A comparison of the AMD Opteron and Intel Woodcrest platforms,
and the suitability of specific benchmarking techniques that test
various parameters for estimating real performance;
- Advantages of the AMD Virtualization technology;
- Investment protection for end-users.
That is the order in which the report was delivered.
Architectural traits of new-generation AMD Opteron processors
We start with the most interesting part - the design of the 4-core
AMD processors, which go by the working names Santa Rosa and
Deerhound. We note straight off that to the question "Will 4-core AMD
Opteron processors made on the 90-nm process technology ever be
shipped?" Mr. Amato clearly responded that end-users would receive
65-nm chips only. Therefore, you can forget about the early forecasts
of a probable release of 4-core AMD server processors on the 90-nm
process technology; to all appearances, chips like these will be used
only at early stages for testing purposes.
The part of the report devoted to the architectural features of the
future Quad-Core AMD Opteron processors focused on backward
compatibility with the current generation of Socket F chips, as well
as on novelties that improve the key factor - performance-per-watt.
According to the presented information, despite the increase in the
physical size of the chips and a substantial reorganization of the
internal architecture, the new 4-core processors will stay within the
TDP range typical of the previous 2-core Opteron chips; that is, the
TDP of the new chips is promised to be about 95 W. Along with that,
the new processors will support AMD-V, i.e. the AMD Virtualization
technology.
Among the key technologies implemented in the new 4-core AMD
Opteron processors are:
- Native Quad-Core Design - a "native" quad-core architecture
with four cores on a single die
- Enhanced AMD PowerNow! - extended and improved power-saving
technology that dynamically reduces the power consumption of the
cores by up to 75% in standby mode
- Direct Connect Architecture - eliminates some of the traditional
"bottlenecks" of the x86 architecture through direct connection of
I/O over HyperTransport buses (up to 8 GB/s), real-time communication
between processors, an integrated memory controller that reduces
latency and improves performance, and direct connection to DDR2
memory
- Advanced Process Technology - an improved 65-nm process that
uses SOI (Silicon-on-Insulator); its low leakage currents improve
performance-per-watt and reduce heat emission
- 32-byte instruction fetch
- Improved branch prediction mechanism
- Out-of-order command execution
- Dual-thread control of 128-bit SSE instructions
- Up to four double-precision floating-point operations per
cycle
- Extensions for processing bit groups (LZCNT/POPCNT)
- Handling SSE extensions (EXTRQ/INSERTQ, MOVNTSD/MOVNTSS)
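For readers unfamiliar with the bit-manipulation extensions in the list
above, here is a minimal Python sketch of what POPCNT and LZCNT
compute. It only models the results in software; the function names and
the default 32-bit operand width are assumptions made for illustration
and were not part of AMD's presentation.

```python
def popcnt(value: int, width: int = 32) -> int:
    """Count the set bits of an unsigned operand, as POPCNT does."""
    mask = (1 << width) - 1          # keep only the low `width` bits
    return bin(value & mask).count("1")


def lzcnt(value: int, width: int = 32) -> int:
    """Count leading zero bits; an all-zero operand yields the operand width."""
    mask = (1 << width) - 1
    return width - (value & mask).bit_length()


if __name__ == "__main__":
    print(popcnt(0b1011_0000))  # three bits are set
    print(lzcnt(1))             # 31 leading zeros in a 32-bit operand
    print(lzcnt(0))             # 32, because every bit is zero
```

In hardware each of these is a single instruction, which is why they are attractive for bitmap and hashing code.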
An additional advantage of the new 4-core processors is a balanced
and efficient cache structure: 64 KB of L1 data cache and 64 KB of L1
instruction cache plus 512 KB of L2 cache per core, and finally a
shared L3 cache of 2 MB (Santa Rosa?) or more (4 MB for Deerhound?)
per CPU.
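To put these figures together, the following small sketch adds up the
on-chip cache of one 4-core CPU. The per-core sizes come from the
paragraph above; the two L3 sizes are the article's own question-marked
guesses, and the function name is hypothetical.

```python
CORES = 4
L1_DATA_KB = 64    # per core
L1_INSTR_KB = 64   # per core
L2_KB = 512        # per core


def total_cache_kb(l3_kb: int, cores: int = CORES) -> int:
    """Sum the per-core L1/L2 caches and the shared L3 cache, in KB."""
    per_core_kb = L1_DATA_KB + L1_INSTR_KB + L2_KB
    return cores * per_core_kb + l3_kb


if __name__ == "__main__":
    # L3 sizes are unconfirmed guesses quoted in the article.
    for label, l3_mb in (("2 MB L3", 2), ("4 MB L3", 4)):
        print(f"{label}: {total_cache_kb(l3_mb * 1024)} KB of cache per CPU")
```

With a 2 MB L3 this gives 4608 KB per CPU, and with a 4 MB L3 it gives 6656 KB.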
One of the most interesting and vivid slides of the presentation is
of course the company's roadmap for further generations of products,
showing the specifications not only for the processors but also for
chipsets and platforms in general.
As you can see, the new Quad-Core Opteron processors with L3 cache,
whose release is planned for next year - to be more precise, the
second quarter of 2007 - will also come with support for TCP Offload,
Gigabit Ethernet, Serial Attached SCSI, and Serial ATA II with RAID
support. The following generation of chips, planned for 2008, will
support the Direct Connect Architecture 2.0 (HT 3.0?), a larger cache
and a number of other novelties; it will also support PCI Express 2.0,
10 Gigabit Ethernet controllers, etc.
As for the next wave of innovations meant to make the consumer's
life easier and improve performance, Mr. Amato dwelled on the
specifications of the Torrenza, Trinity, and Raiden technologies. The
Torrenza technology, meant to accelerate data processing, is based on
the Direct Connect Computing technology and will be implemented by
means of the HTX slot and dedicated hardware accelerators.
The Trinity technology, implemented at the hardware level of the
chip, will be in charge of improved system security, virtualization
and better manageability.
Finally, a reduction of the total cost of ownership (TCO) and
extended capabilities of client equipment, including those gained
through virtualization, will be offered by the Raiden technology.
As an additional benefit of the AMD Opteron platform, the presenter
noted that the "life" of the current processor socket for server
chips, the 1207-pin Socket F, is promised to last until 2009, that is,
until AMD decides to implement an integrated memory controller with
support for FB-DIMM in Opteron chips.
Whose processors are faster?
The main idea of the part of the report devoted to proper testing
was roughly this: to produce adequate results when comparing systems,
running a single test is not enough - a series of tests is required.
For instance, a synthetic test can be used to estimate the load on the
memory or the I/O operation of the system; nevertheless, such
benchmarks are unable to emulate real applications. That is why AMD
believes that, to measure real performance, more focus should be
placed on using real applications. Besides, as Mr. Amato stressed,
"artificial acceleration" for a specific architecture is not a rare
occurrence in benchmarking suites.
Then the audience was shown a series of tests: first, those in which
server systems based on the Intel architecture take the lead, and
then those in which systems based on the AMD architecture do. In
doing so, Mr. Amato stressed that the latter benchmarking suites were
built on real applications.
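The point that a single test is not enough can be illustrated with a
short sketch. One common way to summarize a series of benchmarks is the
geometric mean of per-test performance ratios; the presentation did not
say that AMD used it, so treat it as an assumption. All test names and
ratios below are invented placeholders, not measured results.

```python
import math


def geometric_mean(ratios):
    """Geometric mean of positive performance ratios (system A / system B)."""
    ratios = list(ratios)
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical ratios: a value above 1 means system A is faster in that test.
hypothetical_ratios = {"memory_test": 1.25, "io_test": 0.80, "app_test": 1.00}

if __name__ == "__main__":
    best = max(hypothetical_ratios, key=hypothetical_ratios.get)
    print(f"Best single test for A: {best} ({hypothetical_ratios[best]:.2f}x)")
    overall = geometric_mean(hypothetical_ratios.values())
    print(f"Geometric mean over all tests: {overall:.2f}x")
```

Quoting only the most favorable test would suggest a 25% lead for system A, while the series as a whole shows the two placeholder systems on par.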




I am not going to argue with anyone here, but it seems to me that
when a presentation puts the emphasis on directly contrasting one's
own products with the competitor's solutions, one gets the impression
that particular benchmarking suites may have been pre-selected to
demonstrate superiority convincingly. I am by no means hinting at
optimization, but statistics show that there are always benchmarking
suites in which a specific architecture looks stronger and more
confident. Those who are curious about the details may read the Third
Quarter 2006 SPEC JBB2005 Results; I will confine myself to showing
some of the most telling slides of the presentation on this subject.
At the end of the article, you can find a few more links on that.
We also noticed that, while showing the peak performance of server
systems, modern benchmarks leave out the most important part - the
currently popular performance/watt indicator, that is, performance
per unit of power consumed. Therefore, benchmarks are not the only
indicator of performance. Moreover, tests that run fewer than four
threads are not a suitable measure for dual-processor systems.
Nevertheless, after the slogan "don't be misled by the results of
SPECint and SPECjbb", we noticed that a comparison of AMD processors
with Xeon Woodcrest chips at equal clock speeds gives quite
competitive results. Mr. Amato's conclusion was as follows: the TPI
(True Performance Initiative) approach is good, but some arbitrary
tests are impractical.
In terms of performance per watt
To compare performance in terms of power consumed, Mr. Amato showed
a series of slides pitting systems based on the Intel Woodcrest CPU
against systems based on the AMD Opteron 2000 series. I'd like to
stress that these were whole systems and not just CPUs taken alone:
all the accompanying chipset components were counted as well. In that
connection, we were reminded that the north bridge components are an
integral part of the Opteron CPU architecture, which makes the total
TDP of a server system based on AMD chips look more favorable.
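The system-level comparison described above can be sketched in a few
lines. The class, the component names and every number below are
hypothetical placeholders chosen for illustration; they are not figures
from AMD's slides or from any measurement.

```python
from dataclasses import dataclass, field


@dataclass
class ServerSystem:
    name: str
    benchmark_score: float
    # Power drawn by each counted component, in watts.
    component_watts: dict = field(default_factory=dict)

    def total_watts(self) -> float:
        """Power of the whole system, not of the CPUs alone."""
        return sum(self.component_watts.values())

    def performance_per_watt(self) -> float:
        total = self.total_watts()
        if total <= 0:
            raise ValueError("total power must be positive")
        return self.benchmark_score / total


# Placeholder system with the memory controller inside the CPUs.
system_a = ServerSystem("system_a", 1000.0, {"cpus": 190.0, "memory": 60.0})
# Placeholder system with a separate north bridge.
system_b = ServerSystem(
    "system_b", 1100.0, {"cpus": 160.0, "north_bridge": 35.0, "memory": 105.0}
)

if __name__ == "__main__":
    for system in (system_a, system_b):
        print(f"{system.name}: {system.total_watts():.0f} W, "
              f"{system.performance_per_watt():.2f} points per watt")
```

In this made-up case system_b has the higher raw score and the lower CPU power, yet loses on performance per watt once the north bridge and memory are counted, which is the kind of argument the slides were making.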
It was also promised that with the new generation of the AMD CPU
architecture this advantage will grow even further, provided of
course that we disregard the superiority of systems based on Intel
Xeon processors in the benchmarks dubbed "Intel Sponsored Results".
I bow to the lofty style of the marketing people at AMD.
In the end, summing it all up, a conclusion was drawn about systems
based on AMD Opteron, which were referred to as "systems giving an
optimum reduction in TCO through a strategy of consistent use of
common software, steady migration to new generations of chips,
energy-efficient architecture and use of DDR-2 memory...", etc. By the
way, the slide below, showing a roadmap for the introduction of
support for new memory types (in particular, AMD plans FB-DIMM
support for 2008), clearly indicates that Intel is being given the
role of a "clearing locomotive" that has to pave the way.
The virtualization technology (AMD-V)
The virtualization technology proposed by AMD was also presented in
contrast to Intel's development. We can't say this made the AMD
version of the technology any easier to digest, or helped to assess
the advantages of either technology.
Among the advantages of the AMD-V technology that were mentioned
(together with parallel criticism of the Intel architecture) were
security provided by AMD's own hardware implementation, the Device
Exclusion Vector (DEV); performance provided by the Direct Connect
architecture; Tagged TLBs, which reduce the load on the memory
channel while launching a new virtual machine; and Nested Page
Tables, whose introduction in 2007 should speed up switching between
virtual machines.
That is how things stand. Please don't get me wrong - I am not at all
opposed to nice, effective slides showing how a system from one camp
smashes the competitor's system to pieces. I was simply a bit shocked
that during the evening I found out no less about Intel architectures
than about AMD architectures. With time, journalists become
insensitive to materials presented as a contrast to the competitor's
products, but this time it was really ... aggressive.
In the end, AMD has completed development of the Quad-Core Opteron
processors. Quite soon, in the second quarter of 2007, the first
4-core AMD chips manufactured on the 65-nm process technology at
Fab36 in Dresden will start coming off the assembly line. The new
4-core processors will be electrically compatible with the generation
of 2-core 1207-pin Socket F AMD Opteron chips (bearing the 4-digit
marking), which will keep costs to a minimum and protect investments.
Appendix
References: