Publication

Tracking Files Using the Kepler Provenance Framework...

by P. Mouallem, Roselyne B Tchoua, Scott A Klasky, Norbert Podhorszki, M. Vouk
Publication Type
Conference Paper
Book Title
Scientific and Statistical Database Management
Publication Date
Page Numbers
273 to 282
Volume
5566
Conference Name
SSDBM'09: 21st International Conference on Scientific and Statistical Database Management
Conference Location
New Orleans, Louisiana, United States of America
Conference Date
-

Workflow Management Systems (WFMS), such as Kepler, are proving to be an
important tool in scientific problem solving. They can automate and manage
complex processes and the huge amounts of data produced by petascale
simulations. Typically, the produced data need to be properly visualized and
analyzed by scientists in order to achieve the desired scientific goals. Both
run-time and post analysis may benefit from, or even require, additional
meta-data – provenance information. One of the challenges in this context is
tracking the data files that can be produced in very large numbers during
stages of the workflow such as visualization. The Kepler provenance framework
collects all or part of the raw information flowing through the workflow
graph. This information then needs to be further parsed to extract the
meta-data of interest, which can be done through add-on tools and algorithms.
We show how to automate the tracking of specific information such as data
file locations.
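
The parsing step described above can be illustrated with a minimal sketch. It is not the paper's implementation and does not use the Kepler provenance schema or API: the record layout `(actor, port, value)`, the sample records, the `FILE_PATH` pattern, and the `track_files` helper are all hypothetical, chosen only to show how file locations might be pulled out of raw recorded values and grouped by the workflow stage that produced them.

```python
import re
from collections import defaultdict

# Hypothetical raw provenance records: (actor name, port name, recorded value).
# This layout is an assumption for illustration, not Kepler's actual schema.
RAW_RECORDS = [
    ("Simulation", "status", "step 100 done"),
    ("Simulation", "output", "/scratch/run42/restart.0100.bp"),
    ("PlotActor", "image", "wrote /scratch/run42/images/density_0100.png"),
    ("PlotActor", "image", "wrote /scratch/run42/images/density_0200.png"),
    ("Archiver", "log", "nothing to archive"),
]

# Assumed convention: a tracked file is an absolute POSIX path that ends
# in a file extension.
FILE_PATH = re.compile(r"(?:/[\w.\-]+)+\.\w+")


def track_files(records):
    """Group every file path found in the recorded values by producing actor."""
    files_by_actor = defaultdict(list)
    for actor, _port, value in records:
        for match in FILE_PATH.finditer(value):
            files_by_actor[actor].append(match.group(0))
    return dict(files_by_actor)


if __name__ == "__main__":
    for actor, paths in track_files(RAW_RECORDS).items():
        print(actor, paths)
```

Actors whose recorded values contain no file path (here, `Archiver`) are simply absent from the result, so the output lists only the stages that actually produced files.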