Wednesday, June 1, 2022

Second-Gen Xe-HPC Accelerator to Succeed Ponte Vecchio


With ISC High Performance 2022 taking place this week in Hamburg, Germany, Intel is using the first in-person version of the event in 3 years to offer an update on the state of their high performance/supercomputer silicon plans. The big news out of the show this year is that Intel is naming the successor to the Ponte Vecchio accelerator, which the company is now disclosing as Rialto Bridge.

Rialto Bridge previously appeared on Intel’s roadmaps as “Ponte Vecchio Next”, and Intel’s GPU teams have been pipelining the development of Ponte’s successor even as the first large installation of Ponte itself (the Aurora supercomputer) is still being stood up. As part of the company’s 3 year (ish) roadmap that leads to CPUs and accelerators converging with the Falcon Shores XPU, Rialto Bridge is the part that will, if you’ll pardon the pun, bridge the gap between Ponte and Falcon, offering an evolution of Ponte’s design that makes use of newer technologies and manufacturing processes.

While Intel isn’t offering a fully detailed technical breakdown this early in the process, at a high level the company is talking a bit about specifications, as well as providing a render of the future chip that removes all doubt that it’s a Ponte successor, showing that it’s composed of dozens of tiles/chiplets in the same layout as Ponte. The biggest change that Intel is talking about today is that they’ll be expanding the total number of Xe compute cores from 128 on Ponte to a maximum of 160 on Rialto Bridge – presumably by increasing the number of Xe cores in each compute tile.

Absent any concrete details on the manufacturing side of things, Intel is at least confirming that Rialto will use newer manufacturing nodes for its construction, replacing Ponte’s current mix of TSMC N7 (Link Tile), TSMC N5 (Compute), and Intel 7 (Cache & Base) parts. The Intel 4 process is expected to come online this year, so using that to upgrade the Base and Cache tiles would make sense. Ideally, Intel would also like to jump forward on process nodes for the compute tiles as well, possibly by using this opportunity to move production of those tiles to Intel 4 – though we wouldn’t count out TSMC N4, either.

With that said, at the risk of reading too much into a single render, Rialto has one noticeable difference from Ponte when it comes to the compute cores: whereas Ponte used pairs of compute tiles with a cache tile in between, Rialto at first glance would seem to be using monolithic slabs. This suggests that Intel has opted to integrate the Rambo cache on-die with the compute tiles, and that they’re willing to fab fewer, larger compute tiles. This does lend some credence to the idea that Intel is taking over compute tile production (since they already make the cache tiles), but we’ll have to see just what Intel announces later on.

Interestingly, Intel is also promising more I/O bandwidth for Rialto – though again, this is a very high-level (and unspecific) detail. Ponte is already one of the first products shipping with PCIe 5.0 connectivity, and with PCIe 6.0 hardware still a bit off, this may be more about on-chip bandwidth than off-chip bandwidth, or about the amount of bandwidth available between accelerators using Intel’s Xe Link interconnect.

HBM3 is also a shoo-in for Intel’s next-generation accelerator, given that it’s already going into accelerators shipping this year. HPC accelerators almost live and die based on memory bandwidth, so we expect that it would be the first thing Intel looked at for Rialto. And it would be consistent with Intel’s awkwardly phrased “More GT/s”, since memory transfer rates are often quoted in gigatransfers per second.
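As a rough illustration of why a “GT/s” figure maps onto memory bandwidth, peak bandwidth is simply the bus width in bytes multiplied by the per-pin transfer rate. The sketch below uses the H100’s 5120-bit bus and 3.0 TB/s figure from the comparison table; the function names are illustrative, the derived transfer rate is only what those two numbers imply, and the 6.0 GT/s case is a purely hypothetical input rather than any vendor specification.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_gts: float) -> float:
    """Peak bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * transfer_rate_gts


def implied_transfer_rate_gts(bus_width_bits: int, bandwidth_gbs: float) -> float:
    """Per-pin transfer rate (GT/s) implied by a bus width and a bandwidth figure."""
    return bandwidth_gbs / (bus_width_bits / 8)


# H100 figures from the comparison table: 5120-bit bus, 3.0 TB/s (3000 GB/s).
h100_rate = implied_transfer_rate_gts(5120, 3000.0)
print(f"Implied H100 transfer rate: {h100_rate:.2f} GT/s")

# Hypothetical: the same bus width at a higher transfer rate.
# 6.0 GT/s is an assumed value chosen only to show the linear scaling.
print(f"5120-bit bus at 6.0 GT/s: {peak_bandwidth_gbs(5120, 6.0):.0f} GB/s")
```

At a fixed bus width, bandwidth scales linearly with the transfer rate, which is why a promise of “More GT/s” reads naturally as a promise of faster memory.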

Finally, Intel is stating that Rialto will be based around a newer version of the Open Accelerator Module (OAM) socket specification, which is particularly notable since the next version of OAM has yet to be announced. Absent more details, the biggest differentiating factor appears to be supported power – whereas OAM 1.x allows for modules to draw up to 700 Watts, Intel is talking about doing up to 800 Watts on a Rialto module. This, for better or worse, is consistent with the increase in power consumption for the highest performing versions of the next generation of HPC accelerators, and is a big factor in the shift to liquid and immersion cooling for high-end hardware.

Compute GPU Accelerator Comparison

| AnandTech | Intel | Intel | NVIDIA |
|---|---|---|---|
| Product | Rialto Bridge | Ponte Vecchio | H100 80GB |
| Architecture | Xe-HPC | Xe-HPC | Hopper |
| Transistors | ? | 100 B | 80 B |
| Tiles (inc HBM) | 31? | 47 | 6 + 1 spare |
| Compute Units | 160 | 128 | 132 |
| Matrix Cores | 1280? | 1024 | 528 |
| L2 / L3 | ? | 2 x 204MB | 50MB |
| VRAM Capacity | ? | 128 GB | 80 GB |
| VRAM Type | HBM3? | 8 x HBM2e | 5 x HBM3 |
| VRAM Width | ? | 8192-bit | 5120-bit |
| VRAM Bandwidth | ? | ? | 3.0 TB/s |
| Chip-to-Chip Total BW | ? | 64 x 11.25 GB/s (4×16 90G SERDES) | 18 x 50 GB/s |
| CPU Coherency | Yes | Yes | With NVLink 4 |
| Manufacturing | ? | Intel 7, TSMC N7, TSMC N5 | TSMC N4 |
| Form Factor | OAM 2.0 (800W) | OAM (600W) | SXM4 (400W*) |
| Launch Date | Mid-2023 (Sampling) | 2022 | 2022 |

*Some custom deployments go up to 600W

Overall, Intel is targeting a 30% increase in “application level” performance with Rialto Bridge. At first blush that is not a huge gain, but it’s also for a part that’s coming out around a year after the original Ponte Vecchio. The 25% increase in the number of Xe cores means that most of this performance uplift should be delivered by the additional hardware rather than by clockspeed changes, but since Intel is quoting real-world performance expectations rather than just theoretical throughput, we wouldn’t be too surprised if Rialto’s on-paper specifications were a bit richer still. Intel is also promising that Rialto should be more efficient than Ponte, which at face value is a reasonable claim since performance should be going up faster than power consumption.
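The arithmetic behind those figures can be sketched in a few lines. This is a back-of-the-envelope estimate under two loudly stated assumptions: that performance scales linearly with Xe core count, and that the relevant power comparison is the 700 W OAM 1.x ceiling against the 800 W quoted for a Rialto module. Actual Ponte Vecchio and Rialto Bridge board power may differ (the comparison table lists Ponte’s OAM module at 600W), and a different baseline changes the efficiency result.

```python
# Figures quoted in the article.
PONTE_XE_CORES = 128
RIALTO_XE_CORES = 160
TARGET_PERF_GAIN = 1.30  # Intel's "application level" performance target

# Assumption: compare module power ceilings, not measured power draw.
OAM_1X_MAX_WATTS = 700
RIALTO_MAX_WATTS = 800

# Growth in raw core count.
core_ratio = RIALTO_XE_CORES / PONTE_XE_CORES

# Assumption: linear scaling with core count. Whatever is left over must
# come from clockspeed, memory, or architectural changes.
per_core_gain = TARGET_PERF_GAIN / core_ratio

# Growth in the power ceiling, and the implied change in performance per watt.
power_ratio = RIALTO_MAX_WATTS / OAM_1X_MAX_WATTS
perf_per_watt_gain = TARGET_PERF_GAIN / power_ratio

print(f"Core count ratio:     {core_ratio:.2f}x")
print(f"Implied per-core gain: {per_core_gain:.2f}x")
print(f"Power ceiling ratio:  {power_ratio:.3f}x")
print(f"Implied perf/W gain:  {perf_per_watt_gain:.4f}x")
```

Under these assumptions, the extra cores account for a 1.25x gain, leaving only about 4% to come from everything else, while performance per watt improves by roughly 14% at the module power ceiling.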

Per Intel’s roadmap, the plan is to have Rialto Bridge start sampling in mid-2023. Given Intel’s troubles getting Ponte Vecchio out on time – you still can’t get it unless you’re Aurora – this would be a surprisingly quick turnaround time for Intel. At the same time, since these are pipelined designs with a very strong architectural similarity, ideally Intel will not experience nearly as many teething issues with Rialto as they have with Ponte. But as always, we’ll see what actually happens next year when Intel is closer to delivering their next accelerator.

All Roads Lead to Falcon Shores

With the addition of Rialto Bridge to Intel’s HPC plans, the company’s current silicon roadmap looks like the following:

Both the HBM-equipped Xeon and HPC accelerator lines are set to merge in 2024 with Intel’s first flexible XPU, Falcon Shores. Falcon Shores was first announced at Intel’s winter investor meeting earlier this year, and will be Intel’s first product that takes high-performance CPU and GPU tiles to their logical conclusion by allowing for a configurable number of each tile type. As a result, Falcon Shores encompasses not only mixed CPU/GPU designs, but also (relatively) pure CPU and GPU designs, which is why it’s the successor to both Intel’s HPC CPUs and HPC GPUs.

For today’s event, Intel isn’t offering any further details on Falcon Shores – so the company is still talking about targeting 5x increases in everything from energy efficiency to compute density and memory bandwidth. How they intend to accomplish that, besides relying on their planned packaging and shared memory technologies, remains to be seen. But this update does offer a better picture of where Falcon Shores will fit into Intel’s product roadmaps, by providing a look at how the current HBM-Xeon and Xe-HPC products will merge into it.

Ultimately, Falcon Shores remains Intel’s power play for the HPC industry. The company is betting that being able to deliver a tightly integrated (but still tiled and flexible) experience with a singular API for all will be what gives them an edge in the HPC market, putting them ahead of traditional GPU-based accelerators. And, if they can deliver on these plans, then 2024 is shaping up to be a very interesting year in the high-performance computing industry.
