Matrix Multiply Performance of GPUs on Exascale-class HPE/Cray Systems

by Veronica G Melesse Vergara, Eric K Palmer, Wayne D Joubert

Publication Type

Conference Paper

Book Title

Cray User Group 2022 Conference Proceedings

Publication Date

June 2022

Conference Name

Cray User Group CUG 2022

Conference Location

Monterey, California, United States of America

Conference Sponsor

Cray User Group

Conference Date

May 2, 2022 - May 5, 2022

Abstract

The computation of dense matrix-matrix products (GEMMs) is central to many modeling and simulation workloads as well as AI/ML deep learning campaigns. In fact, millions of dollars are spent annually on computing GEMMs, and the demands of large model training are increasing exponentially. Specialized processors such as GPUs are designed to perform well for these operations. However, GEMM performance on GPUs depends on many factors in complex ways, which makes GEMMs difficult to optimize on these processors. In this study, we examine GEMM performance on several leading GPU models from the product lines to be deployed in forthcoming exascale computing systems. We present results illustrating the many factors that can affect the performance of GEMMs on GPUs. We then present data collected from a large number of test runs of an example GEMM operation to show how the GEMM rate depends on the matrix dimensions. Finally, we show results from machine learning-based performance models that use novel feature engineering methods to fit the measured performance, providing a potential basis for GEMM performance tuning and autotuning methods for GPUs. We also give recommendations for achieving high GEMM performance on modern GPUs.
