Publication

Filling in the gaps: characterizing spatial, environmental, and temporal coverage of open-access biodiversity data...

by Matthew J Troia, Ryan A McManamay
Publication Type
Journal
Journal Name
Ecology and Evolution
Publication Date
Page Numbers
1 to 16
Volume
6
Issue
14

Primary biodiversity data constitute observations of particular species at given
points in time and space. Open-access electronic databases provide unprecedented
access to these data, but their usefulness in characterizing species distributions
and patterns in biodiversity depends on how complete species
inventories are at a given survey location and how uniformly distributed survey
locations are along dimensions of time, space, and environment. Our aim was
to compare completeness and coverage among three open-access databases representing
ten taxonomic groups (amphibians, birds, freshwater bivalves, crayfish,
freshwater fish, fungi, insects, mammals, plants, and reptiles) in the
contiguous United States. We compiled occurrence records from the Global
Biodiversity Information Facility (GBIF), the North American Breeding Bird
Survey (BBS), and federally administered fish surveys (FFS). We aggregated
occurrence records by 0.1° × 0.1° grid cells and computed three completeness
metrics to classify each grid cell as well-surveyed or not. Next, we compared
frequency distributions of surveyed grid cells to background environmental
conditions in a GIS and performed Kolmogorov–Smirnov tests to quantify coverage
through time, along two spatial gradients, and along eight environmental
gradients. The three databases contributed >13.6 million reliable occurrence
records distributed among >190,000 grid cells. The percentage of well-surveyed
grid cells was substantially lower for GBIF (5.2%) than for systematic surveys
(BBS and FFS; 82.5%). Still, the large number of GBIF occurrence records produced
at least 250 well-surveyed grid cells for six of nine taxonomic groups.
Systematic surveys were less biased than GBIF data in spatial and environmental
coverage but more biased in temporal coverage. GBIF coverage also varied
among taxonomic groups, consistent with
commonly recognized geographic, environmental, and institutional sampling
biases. This comprehensive assessment of biodiversity data across the contiguous
United States provides a prioritization scheme to fill in the gaps by contributing
existing occurrence records to the public domain and planning future
surveys.
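
The gridding and coverage comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes occurrence records arrive as (species, latitude, longitude) tuples, bins them into 0.1° cells, and computes a two-sample Kolmogorov–Smirnov D statistic between an environmental variable at surveyed cells and the same variable across all background cells. The function names, the sample records, and the elevation values are hypothetical, and the three completeness metrics are not reproduced because the abstract does not define them.

```python
import math
from bisect import bisect_right
from collections import defaultdict

CELL = 0.1  # grid resolution in degrees, as stated in the abstract


def cell_index(lat, lon, cell=CELL):
    """Return the integer (row, col) index of the grid cell containing a point."""
    return (math.floor(lat / cell), math.floor(lon / cell))


def aggregate(records):
    """Group (species, lat, lon) records into per-cell record counts and species sets."""
    cells = defaultdict(lambda: {"n_records": 0, "species": set()})
    for species, lat, lon in records:
        entry = cells[cell_index(lat, lon)]
        entry["n_records"] += 1
        entry["species"].add(species)
    return dict(cells)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)  # share of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical occurrence records, for illustration only.
records = [
    ("Etheostoma blennioides", 35.27, -84.31),
    ("Etheostoma blennioides", 35.23, -84.36),
    ("Cyprinella galactura", 35.21, -84.33),
    ("Cyprinella galactura", 36.48, -83.12),
]
cells = aggregate(records)

# Hypothetical elevation (m) at surveyed cells versus all background cells.
surveyed_elevation = [100, 200, 300, 400]
background_elevation = [300, 400, 500, 600]
d_stat = ks_statistic(surveyed_elevation, background_elevation)
```

A larger D indicates that surveyed cells sample the environmental gradient less evenly relative to the background. A significance test would additionally require a p-value, which this sketch omits.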