Abstract
We prototype a storage system that provides the access performance of a well-endowed GridFTP deployment (e.g., using a cluster and a parallel file-system) at the modest cost of a single desktop.
To this end, we integrate GridFTP with a combination of dedicated but low-bandwidth (thus cheap) storage nodes and storage scavenged from LAN-connected desktops that participate intermittently in the storage pool. The main advantage of this setup is that it alleviates the server I/O access bottleneck. Additionally, the data access pattern specific to GridFTP, namely that data accesses are mostly sequential, allows for optimizations that result in a high-performance storage system. To provide data durability in the face of intermittent participation of the storage resources, we use an intelligent replication scheme that minimizes the volume of internal transfers that impact the low-bandwidth storage nodes.