Design and analysis of CXL performance models for tightly-coupled heterogeneous computing...

by Anthony M Cabrera, Aaron R Young, Jeffrey S Vetter

Publication Type

Conference Paper

Journal Name

International Workshop on Extreme Heterogeneity Solutions

Book Title

ExHET '22: Proceedings of the 1st International Workshop on Extreme Heterogeneity Solutions

Publication Date

April, 2022

Page Numbers

1 to 6

Publisher Location

New York, United States of America

Conference Name

International Workshop on Extreme Heterogeneity Solutions

Conference Location

Seoul, South Korea

Conference Sponsor

ACM

Conference Date

Apr 2, 2022

Abstract

Truly heterogeneous systems enable partitioned workloads to be mapped to the hardware that nets the best performance. However, current practice requires that inter-device communication between different vendors' hardware use host memory as an intermediary step. To date, there are no widely adopted solutions that allow accelerators to directly transfer data. A new cache-coherent protocol, CXL, aims to facilitate easier, fine-grained sharing between accelerators. In this work we analyze existing methods for designing heterogeneous applications that target GPUs and FPGAs working collaboratively, followed by an exploration to show the benefits of a CXL-enabled system. Specifically, we develop a test application that utilizes both an NVIDIA P100 GPU and a Xilinx U250 FPGA to show current communication limitations. From this application, we capture overall execution time and throughput measurements on the FPGA and GPU. We use these measurements as inputs to novel CXL performance models to show that using CXL caching instead of host memory results in a 1.31X speedup, while a more tightly-coupled pipelined implementation using CXL-enabled hardware would result in a speedup of 1.45X.
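As a rough illustration of the kind of comparison the abstract describes, the sketch below contrasts three ways of moving data between a GPU stage and an FPGA stage: staging through host memory, a direct device-to-device transfer, and an idealized chunked pipeline. It is not the paper's model. The function names, the parameters, and the assumption that stage times divide evenly across chunks are all hypothetical and chosen only to make the arithmetic concrete.

```python
# Hypothetical timing sketch; NOT the performance model from the paper.
# All times are in arbitrary but consistent units (e.g., seconds).

def staged_time(t_gpu, t_fpga, t_to_host, t_from_host):
    """Serial execution with host memory as the intermediary:
    GPU compute, copy to host, copy from host to the FPGA, FPGA compute."""
    return t_gpu + t_to_host + t_from_host + t_fpga


def direct_time(t_gpu, t_fpga, t_direct):
    """Serial execution with one direct device-to-device transfer
    (assumed here to stand in for a coherent, CXL-style transfer)."""
    return t_gpu + t_direct + t_fpga


def pipelined_time(t_gpu, t_fpga, t_direct, n_chunks):
    """Idealized three-stage pipeline over n_chunks equal chunks.
    Total = time to fill the pipeline once + (n_chunks - 1) * slowest stage.
    Assumes stage times split evenly across chunks with no extra overhead."""
    stages = [t_gpu / n_chunks, t_direct / n_chunks, t_fpga / n_chunks]
    return sum(stages) + (n_chunks - 1) * max(stages)


def speedup(baseline, improved):
    """Ratio of baseline time to improved time."""
    return baseline / improved


if __name__ == "__main__":
    # Made-up inputs for illustration only; not measurements from the paper.
    base = staged_time(t_gpu=4.0, t_fpga=4.0, t_to_host=1.0, t_from_host=1.0)
    direct = direct_time(t_gpu=4.0, t_fpga=4.0, t_direct=0.5)
    piped = pipelined_time(t_gpu=4.0, t_fpga=4.0, t_direct=0.5, n_chunks=4)
    print("direct speedup:", speedup(base, direct))
    print("pipelined speedup:", speedup(base, piped))
```

With a single chunk, the pipelined estimate reduces to the direct-transfer estimate, which is a quick sanity check on the formula. The speedups this sketch prints depend entirely on the made-up inputs and should not be compared with the 1.31X and 1.45X figures reported in the abstract.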
