Publication

Evaluation of Machine Learning Approaches to Estimate Aerosol Mixing State Metrics in Atmospheric Models...

by Zhonghua Zheng, Nicole Riemer, Matthew West, Valentine G Anantharaj
Publication Type
ORNL Report
Publication Date

Aerosol mixing state describes how chemical composition is distributed among the particles of an atmospheric aerosol population. Oversimplified assumptions about aerosol mixing state in atmospheric models can introduce errors in estimates of weather- and climate-relevant aerosol microphysical properties. In principle, a more comprehensive representation of the aerosol mixing state can be achieved with the Particle-resolved Monte Carlo (PartMC) model, but at an added computational cost that may be prohibitive for direct use in operational numerical weather prediction or multi-year climate simulations.

The aim of our research is to explore machine learning (ML) methodologies for estimating aerosol mixing state metrics, which we define here in three different ways: with respect to (a) hygroscopicity; (b) optical properties; and (c) chemical species abundance. We adopted a data-driven approach, leveraging deep learning and statistical learning techniques, to take advantage of massive PartMC model simulations. First, we performed particle-resolved simulations with PartMC to create a series of scenarios covering a range of global environmental conditions. Each scenario consists of aerosol populations and their corresponding mixing state metrics. The gas concentrations, aerosol mass concentrations, environmental variables, and mixing state metrics of each population constitute the datasets for the machine learning implementations.
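As an illustration of what a mixing state metric can look like, the sketch below computes the entropy-based mixing state index χ that is commonly used with particle-resolved output: χ = (D_α − 1) / (D_γ − 1), where D_α is the average per-particle species diversity and D_γ is the bulk diversity. This abstract does not spell out the metric definition, so treating χ as the target is an assumption, and the function name `mixing_state_index` and the input layout are hypothetical. The three variants (a) to (c) could plausibly be obtained by changing how species are grouped into columns before calling the function.

```python
import math


def entropy(fractions):
    """Shannon entropy (natural log) of mass fractions; zero fractions contribute nothing."""
    return -sum(p * math.log(p) for p in fractions if p > 0.0)


def mixing_state_index(masses):
    """Return chi for a population, where masses[i][a] is the mass of species a in particle i.

    Assumed definition (not stated in the abstract): chi = (D_alpha - 1) / (D_gamma - 1).
    chi = 0 means fully externally mixed, chi = 1 means fully internally mixed.
    """
    particle_mass = [sum(row) for row in masses]
    total_mass = sum(particle_mass)
    n_species = len(masses[0])
    species_mass = [sum(row[a] for row in masses) for a in range(n_species)]

    # H_alpha: mass-weighted average of the per-particle entropies.
    h_alpha = sum(
        (m_i / total_mass) * entropy([m / m_i for m in row])
        for row, m_i in zip(masses, particle_mass)
        if m_i > 0.0
    )
    # H_gamma: entropy of the bulk (population-level) species mass fractions.
    h_gamma = entropy([m / total_mass for m in species_mass])

    d_alpha = math.exp(h_alpha)
    d_gamma = math.exp(h_gamma)
    if math.isclose(d_gamma, 1.0):
        raise ValueError("chi is undefined when the bulk contains a single species")
    return (d_alpha - 1.0) / (d_gamma - 1.0)


# Two toy populations with two species each (illustrative values only).
external = [[1.0, 0.0], [0.0, 1.0]]  # each particle is one pure species
internal = [[0.5, 0.5], [0.5, 0.5]]  # every particle matches the bulk composition

chi_external = mixing_state_index(external)
chi_internal = mixing_state_index(internal)
```

In this toy case the externally mixed population sits at the lower end of the index and the internally mixed population at the upper end; real populations fall in between.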

We adopted and evaluated various configurations of machine learning methodologies in this investigation, including deep learning, the Extreme Gradient Boosting (XGBoost) algorithm, and ensemble approaches. After a rigorous model selection process, we identified an appropriate model to derive estimates of aerosol mixing state metrics. We used the computational and data resources of the Oak Ridge Leadership Computing Facility (OLCF) and the ORNL Compute and Data Environment for Science (CADES). NVIDIA DGX-1 hardware was used to prototype the ML models.
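The abstract does not say how candidate models were compared, so the following is only a minimal sketch of one common approach: k-fold cross-validation, with the candidate that has the lowest held-out error selected. Two deliberately simple stand-in regressors (a mean baseline and a one-variable least-squares fit) take the place of the deep learning and XGBoost models, and the data are synthetic values generated purely for illustration. All function and variable names are hypothetical.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(xs, ys):
    """Stand-in candidate 1: always predict the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Stand-in candidate 2: ordinary least squares on a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def cv_rmse(fit, xs, ys, k=5):
    """Root-mean-square error over all held-out points of a k-fold split."""
    sq_errors = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        sq_errors += [(model(xs[i]) - ys[i]) ** 2 for i in held_out]
    return (sum(sq_errors) / len(sq_errors)) ** 0.5


# Synthetic data (NOT from PartMC): one feature and a noisy target clipped to [0, 1].
rng = random.Random(1)
xs = [rng.random() for _ in range(200)]
ys = [min(1.0, max(0.0, 0.2 + 0.6 * x + rng.gauss(0.0, 0.05))) for x in xs]

candidates = {"mean_baseline": fit_mean, "linear": fit_linear}
scores = {name: cv_rmse(fit, xs, ys) for name, fit in candidates.items()}
best = min(scores, key=scores.get)  # candidate with the lowest held-out RMSE
```

The same loop would apply unchanged to heavier candidates: each one only needs a `fit(xs, ys)` function that returns a callable predictor.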

Our approach has allowed us to gain a new understanding of how machine learning methodologies can be applied to improve the representation of aerosol mixing state in atmospheric models and benefit the atmospheric research community. Next, we plan to extend our research and methodology to quantify some of the aerosol-related uncertainties in the E3SM Atmospheric Model (EAM) Version 1.