Abstract
Highly available data storage systems are critical for many
applications as research and business become more
data-driven. Since metadata management is essential to
system availability, multiple metadata services are used to
improve the availability of distributed storage systems.
Past research has focused on the active/standby model, where
each active service has at least one redundant idle backup.
However, a fail-over may interrupt service and even lose
some service state, depending on the replication
technique used. In addition, the replication
overhead for multiple metadata services can be very high.
The research in this paper targets the symmetric
active/active replication model, which uses multiple
redundant service nodes running in virtual synchrony. In
this model, service node failures do not cause a fail-over
to a backup and there is no disruption of service or loss
of service state. We further discuss a fast delivery
protocol that reduces the latency of the required total
order broadcast. Our prototype implementation shows that
metadata service high availability can be achieved with
an acceptable performance trade-off using our symmetric
active/active metadata service solution.
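
To make the symmetric active/active idea concrete, the following is a minimal sketch and not the prototype described in this paper. It assumes a toy in-process sequencer in place of a real group communication system, and the names MetadataReplica, TotalOrderGroup, and the operation tuples are hypothetical. It shows only the core property: every live replica applies the same operations in the same total order, so a node failure requires no fail-over step and loses no state. It does not model the fast delivery protocol or membership changes.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataReplica:
    """Hypothetical active service node holding a full copy of the metadata."""
    name: str
    alive: bool = True
    state: dict = field(default_factory=dict)
    applied: int = 0  # sequence number of the last applied operation

    def deliver(self, seq, op):
        # Operations must be applied in the agreed total order, without gaps.
        assert seq == self.applied + 1, "gap in total order"
        kind, path, value = op
        if kind == "create":
            self.state[path] = value
        elif kind == "delete":
            self.state.pop(path, None)
        self.applied = seq


class TotalOrderGroup:
    """Toy stand-in for a total order broadcast layer (an assumption):
    assigns one global sequence number per request and delivers the
    request to every live replica."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.next_seq = 1

    def broadcast(self, op):
        live = [r for r in self.replicas if r.alive]
        if not live:
            raise RuntimeError("no live replicas")
        seq = self.next_seq
        self.next_seq += 1
        for replica in live:
            replica.deliver(seq, op)
        return seq


replicas = [MetadataReplica(f"node{i}") for i in range(3)]
group = TotalOrderGroup(replicas)

group.broadcast(("create", "/data/a", {"size": 10}))
replicas[0].alive = False  # simulated node failure; no fail-over is triggered
group.broadcast(("create", "/data/b", {"size": 20}))
group.broadcast(("delete", "/data/a", None))

survivors = [r for r in replicas if r.alive]
```

After the simulated failure, the two surviving replicas keep serving requests and hold identical state, while the failed node simply stops receiving operations. A real deployment would additionally need failure detection and a state transfer for rejoining nodes, which this sketch omits.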