At the International Supercomputing (ISC 2022) trade show, HPE demonstrated blade systems that will power two exascale supercomputers set to come online this year: Frontier and Aurora. Unfortunately, HPE had to use sophisticated and power-hungry hardware to get unprecedented computing performance. Hence, both machines use liquid cooling, but even massive water blocks cannot hide some interesting design peculiarities the blades feature.
Both the Frontier and Aurora supercomputers are built by HPE using its Cray EX architecture. Although the machines rely on AMD and Intel hardware, respectively, both use high-performance x86 CPUs to run general tasks and GPU-based compute accelerators to run highly parallel supercomputing and AI workloads.
The Frontier supercomputer builds upon HPE’s Cray EX235a nodes powered by two of AMD’s 64-core EPYC ‘Trento’ processors featuring the company’s Zen 3 microarchitecture enhanced with 3D V-Cache and optimized for high clocks. The Frontier blades also come with eight of AMD’s Instinct MI250X accelerators, each featuring 14,080 stream processors and 128GB of HBM2E memory. Each node offers peak FP64/FP32 vector performance of around 383 TFLOPS and peak FP64/FP32 matrix performance of approximately 765 TFLOPS. Both the CPUs and the compute GPUs used by HPE’s Frontier blade share a unified liquid cooling system with two nozzles on the front of the node.
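The aggregate figures quoted above can be roughly reproduced by multiplying per-accelerator peaks by the number of accelerators. The short sketch below does that arithmetic. The per-GPU values (47.9 TFLOPS vector and 95.7 TFLOPS matrix for FP64) are taken from AMD's public MI250X specifications rather than from HPE's demonstration, so treat them as an assumption. The function name is purely illustrative.

```python
# Assumption: per-accelerator peak FP64 figures from AMD's public
# Instinct MI250X specifications, not from HPE's ISC 2022 demo.
MI250X_FP64_VECTOR_TFLOPS = 47.9
MI250X_FP64_MATRIX_TFLOPS = 95.7

# Accelerator count quoted in the article for the Frontier blade.
ACCELERATOR_COUNT = 8


def aggregate_peak_tflops(per_gpu_tflops: float, gpu_count: int) -> float:
    """Theoretical peak only: assumes perfect scaling across accelerators."""
    return per_gpu_tflops * gpu_count


vector_peak = aggregate_peak_tflops(MI250X_FP64_VECTOR_TFLOPS, ACCELERATOR_COUNT)
matrix_peak = aggregate_peak_tflops(MI250X_FP64_MATRIX_TFLOPS, ACCELERATOR_COUNT)

print(f"Peak vector: {vector_peak:.1f} TFLOPS")
print(f"Peak matrix: {matrix_peak:.1f} TFLOPS")
```

These are theoretical peaks that ignore the CPUs' contribution, so sustained performance on real workloads will be lower.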
The Aurora blade is currently called just that, carries an Intel badge, and does not have an HPE Cray EX model number yet, presumably because it still needs some polishing. HPE’s Aurora blades use two Intel Xeon Scalable ‘Sapphire Rapids’ processors with over 40 cores and 64GB of HBM2E memory per socket (in addition to DDR5 memory). The nodes also feature six of Intel’s Ponte Vecchio accelerators, but Intel is quiet about the exact specifications of these beasts that pack over 100 billion transistors each.
One thing that catches the eye on the Aurora blade, which is set to be used in the 2 ExaFLOPS Aurora supercomputer, is the mysterious black boxes with a triangular ‘hot surface’ sign located next to the Sapphire Rapids CPUs and Ponte Vecchio compute GPUs. We do not know what they are, but they could be modular, sophisticated power supply circuitry that allows for more flexibility. After all, back in the day, VRMs were removable, so using them for highly power-hungry components might make some sense even today (assuming that the right voltage tolerances are met), especially with pre-production hardware.
Again, the Aurora blade uses liquid cooling for its CPUs and GPUs, though this cooling system is entirely different from the one used by the Frontier blades. Intriguingly, it looks like the Ponte Vecchio compute GPUs in the Aurora blade use different water blocks than the ones Intel demonstrated a few weeks ago, though we can only wonder about the possible reasons for that.
Interestingly, the DDR5 memory modules the Intel-based blade uses come with quite formidable heat spreaders that look bigger than those used on enthusiast-grade memory modules. Keeping in mind that DDR5 RDIMMs also carry a power management IC and a voltage regulating module, they naturally need better cooling than DDR4 sticks, especially in space-constrained environments like blade servers.