Publication

Matrix Multiply Performance of GPUs on Exascale-class HPE/Cray Systems

by Veronica G Melesse Vergara, Eric K Palmer, Wayne D Joubert
Publication Type
Conference Paper
Book Title
Cray User Group 2022 Conference Proceedings
Conference Name
Cray User Group CUG 2022
Conference Location
Monterey, California, United States of America
Conference Sponsor
Cray User Group

The computation of dense matrix-matrix products (GEMMs) is central to many modeling and simulation workloads as well as AI/ML deep learning campaigns. Millions of dollars are spent annually on computing GEMMs, and the demands of large model training are increasing exponentially. Specialized processors such as GPUs are designed to perform well for these operations. However, GEMM performance on GPUs can behave in complex ways that depend on many factors, which makes it challenging to optimize. In this study we examine GEMM performance on several leading GPU models taken from the product lines of GPUs to be deployed in forthcoming exascale computing systems. We show results that illustrate the many factors that can affect GEMM performance on GPUs. We then present data collected from a large number of test runs of an example GEMM operation to show how the GEMM rate depends on matrix dimensions. Finally, we show results from machine learning-based performance models that use novel feature engineering methods to fit the measured performance, providing a potential basis for GEMM performance tuning and autotuning methods for GPUs. We also give recommendations for achieving high GEMM performance on modern GPUs.
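To make the notion of "GEMM rate as a function of matrix dimensions" concrete, the following is a minimal sketch, not the authors' benchmark. It assumes the common convention of counting 2·m·n·k floating-point operations for a product of an m×k matrix with a k×n matrix, and it times a pure-Python triple loop on the CPU as a stand-in for a GPU library call. The function names (`gemm_flops`, `gemm_rate_gflops`, `naive_gemm`, `measure`) and the chosen shapes are hypothetical and do not come from the paper.

```python
import time


def gemm_flops(m: int, n: int, k: int) -> int:
    """Operation count for C (m x n) = A (m x k) @ B (k x n).

    Assumes the usual convention of one multiply and one add
    per inner-product term, i.e. 2 * m * n * k operations.
    """
    return 2 * m * n * k


def gemm_rate_gflops(m: int, n: int, k: int, seconds: float) -> float:
    """Achieved rate in GFLOP/s for one GEMM that took `seconds`."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return gemm_flops(m, n, k) / seconds / 1e9


def naive_gemm(a, b):
    """Reference triple-loop product of two lists of lists (CPU stand-in)."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        row_c = c[i]
        for p in range(k):
            a_ip = a[i][p]
            row_b = b[p]
            for j in range(n):
                row_c[j] += a_ip * row_b[j]
    return c


def measure(m: int, n: int, k: int) -> float:
    """Time one naive GEMM of the given shape and return its GFLOP/s."""
    a = [[1.0] * k for _ in range(m)]
    b = [[1.0] * n for _ in range(k)]
    start = time.perf_counter()
    naive_gemm(a, b)
    elapsed = time.perf_counter() - start
    return gemm_rate_gflops(m, n, k, elapsed)


if __name__ == "__main__":
    # Hypothetical shapes, chosen only to show a sweep over (m, n, k).
    for shape in [(32, 32, 32), (64, 64, 64), (64, 16, 256)]:
        print(shape, f"{measure(*shape):.4f} GFLOP/s")
```

In a real study the timed call would be a GPU BLAS routine rather than `naive_gemm`, and each shape would typically be run several times after a warm-up; the rate calculation itself would stay the same.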