Skip to main content
SHARE
Publication

IRIS: A Performance-Portable Framework for Cross-Platform Heterogeneous Computing

by Jungwon Kim, Seyong Lee, Beau E Johnston, Jeffrey S Vetter
Publication Type
Journal
Journal Name
IEEE Transactions on Parallel and Distributed Systems
Publication Date
Page Numbers
1796 to 1809
Volume
35
Issue
10

From edge to exascale, computer architectures are becoming more heterogeneous and complex. The systems typically have fat nodes, with multicore CPUs and multiple hardware accelerators such as GPUs, FPGAs, and DSPs. This complexity is causing a crisis in programming systems and performance portability. Several programming systems are working to address these challenges, but the increasing architectural diversity is forcing software stacks and applications to be specialized for each architecture. As we show, all of these approaches critically depend on their software framework for discovery, execution, scheduling, and data orchestration. To address this challenge, we believe that a more agile and proactive software framework is essential to increase performance portability and improve user productivity. To this end, we have designed and implemented IRIS: a performance-portable framework for cross-platform heterogeneous computing. IRIS can discover available resources, manage multiple diverse programming platforms (e.g., CUDA, Hexagon, HIP, Level Zero, OpenCL, OpenMP) simultaneously in the same execution, respect data dependencies, orchestrate data movement proactively, and provide for user-configurable scheduling. To simplify data movement, IRIS introduces a shared virtual device memory with relaxed consistency among different heterogeneous devices. IRIS also adds an automatic kernel workload partitioning technique using the polyhedral model so that it can resize kernels for a wide range of devices. Our evaluation on three architectures, ranging from Qualcomm Snapdragon to a Summit supercomputer node, shows that IRIS improves portability across a wide range of diverse heterogeneous architectures with negligible overhead.