Publication

A Result Data Offloading Service for HPC Centers...

by Henri Monti, Ali Butt, Sudharshan S Vazhkudai
Publication Type
Conference Paper
Publication Date
Conference Name
Petascale Data Storage Workshop, Supercomputing 2007
Conference Location
Reno, Nevada, United States of America
Conference Date
-

Modern High-Performance Computing applications
are consuming and producing an exponentially increasing
amount of data. This increase has led to a significant
number of resources being dedicated to data staging
in and out of Supercomputing Centers. The typical
approach to staging is a direct transfer of application
data between the center and the application submission
site. Such a direct data transfer approach becomes
problematic, especially for staging-out, as (i) the
data transfer time increases with the size of data, and
may exceed the time allowed by the center's purge policies;
and (ii) the submission site may not be online to
receive the data, thus further increasing the chances for
output data to be purged. In this paper, we argue for
a systematic data staging-out approach that quickly offloads
data from the center to intermediary data-holding nodes,
thus avoiding the peril of a purge and addressing the two
issues mentioned above. The intermediary nodes provide temporary data
storage for the staged-out data and maximize the offload
bandwidth by providing multiple data-flow paths from
the center to the submission site. Our initial investigation
shows such a technique to be effective in addressing
the above two issues and providing better QoS guarantees
for data retrieval.
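
The timing argument in the abstract can be illustrated with a small back-of-the-envelope sketch. It is not the authors' implementation: the `Path` type, the function names, the bandwidth figures, and the purge window are all hypothetical, and the model assumes ideal, independent paths whose bandwidths simply add up. It only shows why a direct transfer may miss a purge deadline while an offload spread across several intermediaries may not.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """A hypothetical data-flow path from the center to one intermediary."""
    name: str
    bandwidth_mb_s: float  # assumed sustained bandwidth, in MB/s


def transfer_time_s(size_mb: float, bandwidth_mb_s: float) -> float:
    """Idealized transfer time: size divided by sustained bandwidth."""
    return size_mb / bandwidth_mb_s


def direct_meets_deadline(size_mb: float, site_bandwidth_mb_s: float,
                          purge_window_s: float, site_online: bool = True) -> bool:
    """Issue (i): the transfer may outlast the purge window.
    Issue (ii): the submission site may be offline, so nothing is received."""
    if not site_online:
        return False
    return transfer_time_s(size_mb, site_bandwidth_mb_s) <= purge_window_s


def split_by_bandwidth(size_mb: float, paths: list[Path]) -> dict[str, float]:
    """Assign each intermediary a share of the data proportional to its
    bandwidth, so that all paths finish at the same time (an assumption of
    this sketch, not a policy stated in the abstract)."""
    total = sum(p.bandwidth_mb_s for p in paths)
    return {p.name: size_mb * p.bandwidth_mb_s / total for p in paths}


def offload_time_s(size_mb: float, paths: list[Path]) -> float:
    """Time to clear the center's storage when all paths run in parallel."""
    total = sum(p.bandwidth_mb_s for p in paths)
    return size_mb / total


if __name__ == "__main__":
    size_mb = 1_000_000.0          # hypothetical 1 TB result set
    purge_window_s = 24 * 3600.0   # hypothetical one-day purge policy
    paths = [Path("node-a", 10.0), Path("node-b", 20.0), Path("node-c", 20.0)]

    print("direct OK:", direct_meets_deadline(size_mb, 10.0, purge_window_s))
    print("offload time (s):", offload_time_s(size_mb, paths))
    print("shares (MB):", split_by_bandwidth(size_mb, paths))
```

Under these made-up numbers the direct transfer needs longer than the purge window, whereas the parallel offload finishes well inside it; real paths would share links and vary over time, so the figures are illustrative only.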