Network-friendly one-sided communication through multinode cooperation on petascale Cray XT5 systems...

by Vinod Tipparaju, Xinyu Que, Weikuan Yu, Jeffrey S Vetter
Publication Type
Conference Paper
Publication Date
Conference Name
IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing
Conference Location
Newport Beach, California, United States of America
Conference Sponsor
Conference Date

One-sided communication is important to enable
asynchronous communication and data movement for
Global Address Space (GAS) programming models.
Such communication is typically realized through direct
messages between initiator and target processes. For
petascale systems with 10,000s of nodes and 100,000s
of cores, these direct messages require dedicated communication
buffers and/or channels, which can lead to
significant scalability challenges for GAS programming
models. In this paper, we describe a network-friendly
communication model, multinode cooperation, to enable
indirect one-sided communication. Compute nodes
work together to handle one-sided requests through (1)
request forwarding in which one node can intercept a
request and forward it to a target node, and (2) request
aggregation in which one node can aggregate many
requests to a target node. We have implemented multinode
cooperation for a popular GAS runtime library,
Aggregate Remote Memory Copy Interface (ARMCI).
Our experimental results on a large-scale Cray XT5
system demonstrate that multinode cooperation is able
to greatly increase the memory scalability by reducing
the number of communication buffers. In addition,
multinode cooperation improves the resiliency of the GAS
runtime system to network contention. Furthermore,
multinode cooperation can benefit the performance of
scientific applications. In one case, it reduces the total
execution time of an NWChem application by 52%.
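
The two mechanisms named in the abstract, request forwarding and request aggregation, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the ARMCI implementation: the abstract does not say how nodes are arranged, so the code assumes a hypothetical rows x cols grid in which each node keeps buffers only for the peers in its own row and column. The function names (direct_buffers, mesh_buffers, route, aggregate) are invented for illustration.

```python
from collections import defaultdict


def direct_buffers(num_nodes):
    """Buffers per node when every node keeps one buffer per remote peer."""
    return num_nodes - 1


def mesh_buffers(rows, cols):
    """Buffers per node when a node only talks to its row and column peers.

    Assumption: nodes form a rows x cols grid (hypothetical layout).
    """
    return (rows - 1) + (cols - 1)


def route(src, dst, cols):
    """Return the hop sequence from src to dst on the assumed grid.

    The request travels along src's row to dst's column, where an
    intermediate node forwards it down the column to dst.
    """
    src_row = src // cols
    dst_col = dst % cols
    via = src_row * cols + dst_col  # node in src's row and dst's column
    path = [src]
    if via != src:
        path.append(via)
    if dst != path[-1]:
        path.append(dst)
    return path


def aggregate(requests, cols):
    """Group requests by (last sender, target).

    Each group stands for one combined message that the last sender
    (the forwarder, or the source itself on a one-hop route) delivers
    to the target instead of one message per request.
    """
    batches = defaultdict(list)
    for src, dst, payload in requests:
        path = route(src, dst, cols)
        sender = path[-2] if len(path) > 1 else src
        batches[(sender, dst)].append((src, payload))
    return dict(batches)


if __name__ == "__main__":
    rows, cols = 4, 4
    print(direct_buffers(rows * cols), mesh_buffers(rows, cols))
    requests = [(0, 15, "put A"), (1, 15, "put B"), (2, 15, "get C")]
    print(aggregate(requests, cols))
```

In this toy layout, 16 nodes in a 4 x 4 grid need 6 buffers per node rather than 15, and the three requests for node 15 reach it as a single batch from node 3. The grid is only one possible arrangement; the trade-off it shows is fewer buffers per node in exchange for an extra hop on some requests.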