Publication

Diagnosing Anomalous Network Performance with Confidence...

by Bradley W Settlemyer, Stephen W Hodson, Jeffery A Kuehn, Stephen W Poole
Publication Type
Conference Paper
Publication Date
Page Numbers
612 to 613
Volume
11
Conference Name
CCGrid 2011: The International Symposium on Cluster, Cloud, and Grid Computing
Conference Location
Newport Beach, California, United States of America
Conference Sponsor
IEEE
Conference Date
-

Variability in network performance is a major obstacle to effectively analyzing the throughput of modern high-performance computer systems. High-performance interconnection networks offer excellent best-case network latencies; however, highly parallel applications running on parallel machines typically require consistently high levels of performance to adequately leverage the massive amounts of available computing power. Performance analysts have usually quantified network performance using traditional summary statistics that assume the observational data is sampled from a normal distribution. In our examinations of network performance, we have found that this method of analysis often provides too little information to understand anomalous network performance. In particular, we examine a multimodal performance scenario encountered with an InfiniBand interconnection network, and we explore the performance repeatability on the custom Cray SeaStar2 interconnection network after a set of software and driver updates.
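
As a rough illustration of why normal-theory summary statistics can mislead on multimodal data, the following minimal Python sketch compares the mean and standard deviation of a bimodal latency sample with a coarse histogram of the same sample. The latency values, the 4-microsecond bin width, and the helper names `summarize` and `histogram` are all assumptions invented for this example; they are not measurements or methods from the paper.

```python
import statistics

# Synthetic, made-up latencies in microseconds (NOT measured data):
# a "fast" mode near 2 us and a "slow" mode near 10 us.
fast = [1.9, 2.0, 2.0, 2.1, 2.0, 1.9, 2.1, 2.0]
slow = [9.8, 10.0, 10.1, 9.9]
samples = fast + slow


def summarize(xs):
    """Return the sample mean and sample standard deviation."""
    return statistics.mean(xs), statistics.stdev(xs)


def histogram(xs, bin_width):
    """Count samples per bin, keyed by each bin's lower edge."""
    counts = {}
    for x in xs:
        lower_edge = int(x // bin_width) * bin_width
        counts[lower_edge] = counts.get(lower_edge, 0) + 1
    return dict(sorted(counts.items()))


mean, sd = summarize(samples)
hist = histogram(samples, 4.0)

# Samples that actually fall within 1 us of the reported mean.
near_mean = [x for x in samples if abs(x - mean) <= 1.0]

print(f"mean = {mean:.2f} us, stdev = {sd:.2f} us")
print(f"histogram (bin lower edge -> count): {hist}")
print(f"samples within 1 us of the mean: {len(near_mean)}")
```

In this synthetic sample the mean falls between the two modes, so no observation lies near it, while the histogram makes both modes visible. A single mean and standard deviation would describe a latency that the hypothetical network never actually delivers.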