GPUs play an important role in delivering the computational power needed to deploy AI models, particularly large-scale pretrained models. Because GPU inference stacks are platform-specific, AI practitioners currently have little choice when selecting high-performance GPU inference solutions, and the dependencies of complex runtime environments make the code behind these solutions hard to maintain.
To address these industry challenges, Meta AI has developed AITemplate (AIT), a unified open-source system with separate acceleration back ends for both AMD and NVIDIA GPU hardware.
With AITemplate, it is now possible to run performant inference on hardware from both GPU vendors. AITemplate is a Python framework that converts AI models into high-performance C++ GPU template code for faster inference.
As mentioned in the company's blog post, researchers at Meta AI used AITemplate to improve performance by up to 12x on NVIDIA GPUs and 4x on AMD GPUs compared with eager mode in PyTorch. The AITemplate system consists of a front-end layer that performs various graph transformations and a back-end layer that generates C++ kernel templates for the GPU target. The company stated that the vision behind the framework is to support high speed while maintaining simplicity.
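To make the front-end/back-end split concrete, here is a deliberately miniature sketch of the idea, not AITemplate's actual API: a toy "front end" fuses adjacent elementwise ops in a small op graph, and a toy "back end" instantiates a C++ GPU kernel template for the fused node. All names here (`fuse_elementwise`, `emit_kernel`, the kernel template itself) are illustrative assumptions.

```python
# Illustrative sketch of a graph-transform front end plus a C++ template
# back end. This is NOT AITemplate's real API, just the general pattern.

FUSIBLE = {"add", "relu"}  # toy set of fusible elementwise ops

def fuse_elementwise(graph):
    """Front-end pass: merge runs of consecutive fusible ops into one node."""
    fused, buffer = [], []
    for op in graph:
        if op in FUSIBLE:
            buffer.append(op)
        else:
            if buffer:
                fused.append("fused_" + "_".join(buffer))
                buffer = []
            fused.append(op)
    if buffer:
        fused.append("fused_" + "_".join(buffer))
    return fused

# Back-end: a C++ kernel template whose body is filled in per fused op.
KERNEL_TEMPLATE = """\
__global__ void {name}(const float* x, const float* y, float* out, int n) {{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {{ out[i] = {expr}; }}
}}"""

def emit_kernel(fused_op):
    """Back-end pass: generate C++ GPU kernel source for a fused node."""
    bodies = {"fused_add_relu": "fmaxf(x[i] + y[i], 0.0f)"}
    return KERNEL_TEMPLATE.format(name=fused_op,
                                  expr=bodies.get(fused_op, "x[i]"))

graph = ["matmul", "add", "relu"]
fused = fuse_elementwise(graph)   # ['matmul', 'fused_add_relu']
source = emit_kernel(fused[-1])   # C++ source for the fused add+relu kernel
```

Fusing elementwise ops into the surrounding kernel is one of the classic transformations such systems apply: it avoids round-trips to GPU memory between ops, which is where much of the speedup over op-by-op eager execution comes from.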
Moreover, it delivers close to hardware-native Tensor Core (NVIDIA GPU) and Matrix Core (AMD GPU) performance on widely used AI models such as transformers, convolutional neural networks, and diffusers. At present, AITemplate is enabled on NVIDIA's A100 and AMD's MI200 GPU systems, both of which are commonly used in data centers by research facilities, technology companies, cloud computing service providers, and others.
Source: AITemplate optimizations, Meta AI
The blog reads, "AITemplate offers state-of-the-art performance for current and next-gen NVIDIA and AMD GPUs with less system complexity. However, we are only at the beginning of our journey to build a high-performance AI inference engine. We also plan to extend AITemplate to additional hardware systems, such as Apple M-series GPUs, as well as CPUs from other technology providers."