AMD launches big data-center push vs. Intel, Nvidia

16 November 2021

Powerful new server CPU and a GPU it says tops Nvidia’s best are part of the parade.

AMD has emerged from its long defensive crouch to take the fight directly to Intel and Nvidia, a bold move but one backed by a company that’s been racking up wins lately.

Coming on the heels of a record-setting quarter, AMD announced new EPYC server CPUs, a new line of Instinct-brand GPUs it says are faster than Nvidia’s best, the next generation of its CPU architecture, and a deal with Meta, formerly known as Facebook.

EPYC Milan-X CPU

AMD CEO Lisa Su introduced the EPYC Milan-X processors, an iteration of its third-generation server processors with a 3D-stacked L3 cache called 3D V-Cache. One problem with increasing cache is you get transistor sprawl and the die gets progressively bigger. 3D stacking reduces the physical size while increasing density.

And the EPYC Milan-X chips are cache-dense. They will have up to 768MB of total L3 cache per chip for the 64-core design. So imagine what a dual-socket server will be like with 128 cores and 1.5GB of L3 cache. AMD shared benchmarks that showed up to a 50% performance improvement over previous chips.

The chips will come to market in Q1 2022 in 16-, 32-, and 64-core variants, and they are drop-in compatible with the older Naples- and Rome-era server sockets. All that’s needed is a BIOS upgrade.

Su noted that EPYC is now designed into the data centers of 10 of the world’s largest hyperscalers.

Instinct GPU upgrade

The second big piece of news was the Instinct MI200 series, the second generation of the company’s GPU accelerators for data centers, which use the chipmaker’s CDNA 2 architecture. Like Nvidia, AMD uses its GPU architecture for both gaming and data centers, with alterations for the two markets. CDNA is a compute-focused architecture for data-center and other uses like HPC and AI.

AMD compared the MI200 with Nvidia’s Ampere A100, claiming significant gains in performance and density. Independent benchmarks will have to bear this out, but at least on paper, the MI200 looks like an absolute monster.

The first major change over the MI100, its predecessor, is the use of multi-die packaging. As with its CPUs, AMD has broken the GPU up into chiplets and connected them via a high-speed link. The chip uses an updated Infinity Fabric, with 25Gbps links providing up to 100GBps of bi-directional bandwidth between the GPUs, and there are eight links in the MI200, for 800GBps of bandwidth between the two chiplets.
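The bandwidth figures above are simple multiplication, and a short sketch makes the relationship explicit. The per-link and link-count numbers are the ones quoted in this article; the function and constant names are illustrative only, and the total assumes every link runs at its peak rate at the same time.

```python
# Figures as quoted in the article (AMD's stated MI200 numbers).
PER_LINK_GBPS = 100  # peak bi-directional GBps per Infinity Fabric link
LINK_COUNT = 8       # Infinity Fabric links on the MI200


def aggregate_bandwidth(per_link_gbps: float, link_count: int) -> float:
    """Peak total bandwidth, assuming all links are saturated at once."""
    return per_link_gbps * link_count


total = aggregate_bandwidth(PER_LINK_GBPS, LINK_COUNT)
print(f"{LINK_COUNT} links x {PER_LINK_GBPS}GBps = {total}GBps peak")
```

This is a theoretical peak; sustained throughput in real workloads would be expected to come in lower.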

AMD claims that the Instinct MI200 series provides up to 4.9x higher peak performance for high-performance computing (FP64) workloads than the Ampere A100: 95.7 TFLOPS of peak double-precision matrix floating-point performance, compared with the A100’s 19.5 TFLOPS peak.
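The 4.9x figure follows directly from the two peak numbers AMD cites, as this quick check shows. The values are taken from the article; the names are illustrative, and the comparison is between vendor-stated theoretical peaks, not measured results.

```python
# Peak FP64 figures as quoted in the article (vendor-stated, not measured).
MI200_PEAK_TFLOPS = 95.7  # MI200 double-precision matrix peak
A100_PEAK_TFLOPS = 19.5   # Nvidia A100 peak cited for comparison


def peak_ratio(new_tflops: float, old_tflops: float) -> float:
    """Ratio of two peak-throughput figures."""
    return new_tflops / old_tflops


ratio = peak_ratio(MI200_PEAK_TFLOPS, A100_PEAK_TFLOPS)
print(f"MI200 vs. A100 peak FP64: {ratio:.1f}x")  # prints 4.9x
```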

The MI200, along with EPYC processors, will be used in the upcoming Frontier supercomputer at Oak Ridge National Laboratory, which is expected to be the first U.S. exascale supercomputer when deployed in 2022.

Zen 4 microarchitecture

The Zen microarchitecture is what brought AMD back from the brink. Launched in 2017, Zen is currently in its third generation, and Su disclosed information on generation four.

In keeping with AMD’s trend of using Italian cities for codenames, Zen 4 will debut in the Genoa family in 2022. It will support DDR5 memory and PCI Express Gen 5, both of which are slowly coming to market. It will also support CXL (Compute Express Link), a cache-coherent link between memory and processors that speeds up transfers between main memory and the CPU.

This will be the generation that breaks socket compatibility. The first three generations of Zen were all compatible, so if you owned a server with an older generation chip, you could upgrade CPUs and all you needed was a BIOS update. Genoa lacks that due to architectural changes.

“When introduced, we expect Genoa will be the world’s highest performance processor for general purpose computing. It’s designed to excel across a broad range of data center workloads from enterprise to HPC,” said Su. Genoa will be built by TSMC using a 5nm process and will sport 96 cores.

Su also announced Bergamo with a special version of the core architecture called Zen 4C, optimized for cloud-native applications and featuring 128 high-performance cores. Yes, 128 cores. Bergamo is socket compatible with Genoa and is on track to ship in the first half of 2023.

Meta deal for a Milan-based single-socket server

AMD has racked up some hyperscaler wins in the past and now has added Meta, formerly Facebook, to the list. AMD and Meta worked together to define an open, cloud-scale, single-socket server designed for performance and power efficiency, based on the Milan processor.

Further details will be discussed at the Open Compute Global Summit later this week.