Publication

Parallel k-Means Clustering for Quantitative Ecoregion Delineation Using Large Data Sets...

by Jitendra Kumar, Richard T Mills, Forrest M Hoffman, William Hargrove
Publication Type
Conference Paper
Journal Name
Procedia Computer Science
Publication Date
Page Numbers
1602 to 1611
Volume
4
Conference Name
International Conference on Computational Science, ICCS 2011
Conference Location
Singapore, Singapore
Conference Date
-

Delineation of geographic ecoregions has long been of interest to
environmental scientists and ecologists as a means of identifying
regions of similar ecological and environmental conditions. Such
classifications are important for predicting suitable species ranges, for
stratification of ecological samples, and to help prioritize habitat
preservation and remediation efforts. Hargrove and Hoffman (1999, 2009)
have developed geographical spatio-temporal clustering
algorithms and codes and have successfully applied them to a variety of
environmental science domains, including ecological regionalization;
environmental monitoring network design; analysis of satellite-, airborne-,
and ground-based remote sensing; and climate model-model and
model-measurement intercomparison. With the advances in state-of-the-art
satellite remote sensing and climate models, observations and model
outputs are available at increasingly high spatial and temporal resolutions.
Long time series of these high resolution datasets are extremely large in
size and growing. Analysis and knowledge extraction from these large
datasets are not just algorithmic and ecological problems, but also pose a
complex computational problem. This paper focuses on the development of
a massively parallel multivariate geographical spatio-temporal clustering
code for analysis of very large datasets using tens of thousands of processors
on one of the fastest supercomputers in the world.
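
The abstract does not describe how the clustering code is parallelized. As a rough illustration only, the sketch below shows one common data-parallel formulation of k-means: the data are split into chunks, each chunk computes per-cluster partial sums and counts against the current centroids, and the partial results are combined to update the centroids. The chunks here are plain Python lists standing in for processes, and the function names (`partial_sums`, `kmeans_step`, `kmeans`), the squared Euclidean distance, and the convergence tolerance are all assumptions made for this example, not details taken from the paper.

```python
# Illustrative sketch only: data-parallel k-means with "processes" simulated
# by list chunks. This is NOT the implementation described in the paper.


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partial_sums(chunk, centroids):
    """One worker's share: assign local points, return per-cluster sums and counts."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point in chunk:
        # Index of the nearest centroid for this point.
        j = min(range(k), key=lambda c: sq_dist(point, centroids[c]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += point[d]
    return sums, counts


def kmeans_step(chunks, centroids):
    """One iteration: combine every chunk's partial results, then update centroids."""
    k, dim = len(centroids), len(centroids[0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    # In a distributed code each chunk would live on its own process and this
    # loop would be a global reduction; here it is a plain serial loop.
    for chunk in chunks:
        sums, counts = partial_sums(chunk, centroids)
        for j in range(k):
            total_counts[j] += counts[j]
            for d in range(dim):
                total_sums[j][d] += sums[j][d]
    # A cluster that received no points keeps its previous centroid.
    return [
        [s / total_counts[j] for s in total_sums[j]]
        if total_counts[j]
        else list(centroids[j])
        for j in range(k)
    ]


def kmeans(chunks, centroids, max_iter=100, tol=1e-9):
    """Iterate until no centroid moves more than `tol` (squared distance)."""
    for _ in range(max_iter):
        new = kmeans_step(chunks, centroids)
        if all(sq_dist(a, b) <= tol for a, b in zip(new, centroids)):
            return new
        centroids = new
    return centroids


# Hypothetical toy data: two chunks of 2-D points and two initial centroids.
chunks = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 10.0), (10.0, 11.0)]]
centroids = kmeans(chunks, [(0.0, 0.0), (10.0, 10.0)])
```

Only the per-cluster sums and counts need to be exchanged between chunks in each iteration, not the points themselves, which is why this formulation is often used when the data are too large to hold in one place.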