Project

Statistical Evaluation & Classification of Large Data Sets

Project Details

Principal Investigator

Problem Statement

Nuclear forensic data usually include nuclide assays and other data obtained from sample measurements. Data streams frequently contain missing entries caused by errors in experiments, data handling, or other factors that are likely unknown. Methods to handle such missing data include: 1) ignoring all records that have incomplete data, 2) imputation, that is, filling in missing entries using regression or other methods, and 3) specially formulating algorithms to handle records with missing data. While the third approach is usually preferred, it requires additional development effort and special rules for each application. This project is intended to develop algorithms that allow the Quantile Comparisons method of classification to handle records with missing data.
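
The three strategies listed above can be illustrated with a minimal Python sketch. It is not the Quantile Comparisons method and is not taken from the project. The records, the function names, the choice of column-mean imputation (a simpler stand-in for regression), and the rescaled partial distance are all assumptions made for illustration only.

```python
import math

# Hypothetical records: one row per sample, one column per measured quantity.
# None marks a missing entry. The values are invented for illustration.
records = [
    [1.0, 2.0, 3.0],
    [2.0, None, 5.0],
    [3.0, 6.0, None],
    [4.0, 8.0, 9.0],
]


def drop_incomplete(rows):
    """Strategy 1: ignore every record that has at least one missing entry."""
    return [row for row in rows if all(v is not None for v in row)]


def impute_column_means(rows):
    """Strategy 2: fill each missing entry with the mean of the observed
    values in its column (assumes every column has an observed value)."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [
        [means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


def partial_distance(a, b):
    """Strategy 3 (one possible formulation): a Euclidean distance that uses
    only the features observed in both records, rescaled by the fraction of
    features used so that records with gaps remain comparable."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf  # no shared features, so no basis for comparison
    squared = sum((x - y) ** 2 for x, y in pairs)
    return math.sqrt(squared * len(a) / len(pairs))


complete_only = drop_incomplete(records)   # keeps 2 of the 4 records
imputed = impute_column_means(records)     # keeps all 4, with filled values
d01 = partial_distance(records[0], records[1])  # uses columns 0 and 2 only
```

In this toy data set, the first strategy discards half of the records, while the second and third keep all of them. That loss of data is the trade-off that motivates formulating the algorithm itself to handle gaps.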

Technical Approach

• General methods for handling missing data will be investigated, and approaches relevant to nuclear forensics will be identified.
• Algorithm development will be conducted for the Quantile Comparisons method of classification, to identify ways to incorporate points with missing data naturally into the calculations.
• Comparisons of the different approaches will evaluate how well each one incorporates incomplete data points and how effective the resulting classification is.

Benefit

If successful, the project would allow Quantile Comparisons classification to be used in a variety of forensics applications when missing entries appear in raw data streams. This would make it easier to evaluate samples and to address questions of origins, processing history, and intended use.

Contact

Distinguished R&D Staff
Charles F Weber