Publication

An evaluation of the CORAL interconnects...

Publication Type
Conference Paper
Book Title
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
Page Number
39
Publisher Location
United States of America
Conference Name
International Conference for High Performance Computing, Networking, Storage and Analysis
Conference Location
Denver, Colorado, United States of America
Conference Sponsor
ACM

The US Department of Energy deployed the Summit and Sierra supercomputers with state-of-the-art network interconnect technology in 2018, and both systems entered production in 2019. In this paper, we provide an in-depth assessment of the systems' network interconnects, which are based on Enhanced Data Rate (EDR) 100 Gb/s Mellanox InfiniBand. Both systems use second-generation EDR Host Channel Adapters (HCAs) and switches with several new features, such as Adaptive Routing (AR), switch-based collectives, and HCA-based tag matching. Although based on the same components, Summit's network is "non-blocking" (i.e., a fully provisioned Clos network), whereas Sierra's network has a 2:1 taper between the racks and aggregation switches. We evaluate the two systems' interconnects using traditional communication benchmarks as well as production applications. We find that the new Adaptive Routing dramatically improves performance, but the other new features still need improvement.
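
As a rough illustration of what a 2:1 taper means compared with a non-blocking design, the sketch below computes the aggregate uplink bandwidth leaving a rack. It assumes the taper is read as the ratio of node-facing (downlink) bandwidth to uplink bandwidth. The function name and the count of 36 node-facing links are hypothetical values chosen for the arithmetic, not figures from the paper; only the 100 Gb/s EDR link rate comes from the abstract.

```python
EDR_LINK_GBPS = 100  # EDR InfiniBand link rate stated in the abstract


def uplink_bandwidth_gbps(downlinks: int, taper: float) -> float:
    """Aggregate uplink bandwidth for `downlinks` node-facing EDR links.

    `taper` is the assumed downlink:uplink bandwidth ratio:
    1.0 models a non-blocking network, 2.0 models a 2:1 taper.
    """
    if downlinks <= 0 or taper <= 0:
        raise ValueError("downlinks and taper must be positive")
    return downlinks * EDR_LINK_GBPS / taper


# Hypothetical rack with 36 node-facing links (illustrative only).
non_blocking = uplink_bandwidth_gbps(36, 1.0)
tapered = uplink_bandwidth_gbps(36, 2.0)
print(f"non-blocking uplink: {non_blocking} Gb/s")
print(f"2:1 tapered uplink:  {tapered} Gb/s")
```

Under these assumptions, the tapered rack has half the uplink bandwidth of the non-blocking one, so traffic that leaves the rack can contend for links even when every node-facing link is otherwise idle.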