Publication

Virtual Topologies for Scalable Resource Management and Contention Attenuation in a Global Address Space Model on the Cray XT...

by Jeffrey S Vetter, Weikuan Yu
Publication Type
Conference Paper
Publication Date
Conference Name
Principles and Practice of Parallel Programming
Conference Location
San Antonio, Texas, United States of America
Conference Date

Global Address Space (GAS) programming models enable a convenient, shared-memory style addressing model, and support completely asynchronous data movement. Their underlying runtime systems face critical challenges in (1) scalably managing resources (such as memory for communication buffers), and (2) gracefully handling unpredictable communication patterns and any associated contention. In this research, we investigate these challenges for a popular GAS runtime library, Aggregate Remote Memory Copy Interface (ARMCI), on large-scale Cray XT5 systems. We represent the management of communication resources as directed graphs, and propose two new scalable virtual topologies, Meshed FCGs (MFCG) and Cubic FCGs (CFCG), for scalable resource management and contention attenuation. To ensure deadlock-free communication in these multi-dimensional topologies, we design and develop lowest-dimension-first forwarding to support fully or partially populated MFCG and CFCG on any number of nodes. We have extensively evaluated the benefits of these virtual topologies on the petascale Jaguar Cray XT5 system at Oak Ridge National Laboratory. Our experimental results demonstrate that MFCG is the most suitable virtual topology because of its benefits in resource management and contention mitigation, and the resulting benefit to scientific applications.
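The forwarding idea in the abstract can be illustrated with a minimal sketch. It is not the ARMCI implementation and rests on assumptions not stated in the abstract: nodes are laid out in a fully populated 2D mesh in row-major order, each node keeps direct connections only to nodes in its own row and column, and a request first corrects dimension 0 (the column index) and then dimension 1 (the row index). The function names `coords`, `node_id`, `neighbors`, and `ldf_route` are hypothetical, and partially populated meshes are not handled.

```python
def coords(node, cols):
    """Map a node id to (x, y); x is dimension 0, y is dimension 1."""
    return node % cols, node // cols


def node_id(x, y, cols):
    """Inverse of coords for a row-major layout."""
    return y * cols + x


def neighbors(node, rows, cols):
    """Nodes assumed to be directly connected: same row or same column."""
    x, y = coords(node, cols)
    same_row = {node_id(i, y, cols) for i in range(cols)}
    same_col = {node_id(x, j, cols) for j in range(rows)}
    return (same_row | same_col) - {node}


def ldf_route(src, dst, cols):
    """Return the hop sequence, correcting the lowest dimension first."""
    sx, sy = coords(src, cols)
    dx, dy = coords(dst, cols)
    path = [src]
    if sx != dx:
        # Hop within the source row to the destination's column.
        path.append(node_id(dx, sy, cols))
    if sy != dy:
        # Hop within that column to the destination's row.
        path.append(node_id(dx, dy, cols))
    return path


if __name__ == "__main__":
    rows, cols = 4, 4
    print(ldf_route(0, 15, cols))
    print(sorted(neighbors(0, rows, cols)))
```

Under these assumptions, each node keeps rows + cols - 2 direct peers instead of the N - 1 peers a fully connected layout needs (6 instead of 15 for a 4 x 4 mesh), at the cost of at most one forwarding hop per request.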