Combining Machine Learning and Numerical Simulation for Near-Real-Time High-Resolution PM2.5 Concentration Forecast

Forecasting ambient PM2.5 concentrations with full spatiotemporal coverage is key to alerting decision-makers to pollution episodes and preventing harmful public exposure, especially in regions with few ground air monitoring stations. Existing methods rely either on chemical transport models (CTMs), which forecast the spatial distribution of PM2.5 with nontrivial uncertainty, or on statistical algorithms, which forecast PM2.5 concentration time series at air monitoring locations without continuous spatial coverage. In this study, we developed a PM2.5 forecast framework that combines the robust Random Forest algorithm with a publicly accessible global CTM forecast product, NASA's Goddard Earth Observing System "Composition Forecasting" (GEOS-CF), to provide spatiotemporally continuous PM2.5 concentration forecasts for the next 5 days at a 1 km spatial resolution. We conducted the forecast experiment for the Fenwei Plain, a region in Central China, as an example. The forecasts for the first and second forecast days had overall validation R2 of 0.76 and 0.64, respectively; R2 was around 0.5 for the following 3 forecast days. Spatial cross-validation yielded similar validation metrics. Our forecast model, with a validation normalized mean bias close to 0, substantially reduced the large biases in GEOS-CF. The proposed framework requires minimal computational resources compared with running CTMs at urban scales, enabling near-real-time PM2.5 forecasts in resource-restricted environments.
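The abstract reports R2 and normalized mean bias (NMB) without defining them. As a minimal sketch, assuming the commonly used definitions (NMB as the summed forecast-minus-observation difference divided by the summed observations, and R2 as the coefficient of determination), the two metrics could be computed as follows. The function names and the sample values are hypothetical and chosen only for illustration; they are not data or code from the study, and the study may have defined R2 differently (for example, as a squared correlation).

```python
from statistics import mean


def normalized_mean_bias(forecast, observed):
    """Assumed definition: NMB = sum(forecast - observed) / sum(observed)."""
    if len(forecast) != len(observed):
        raise ValueError("forecast and observed must have the same length")
    total_observed = sum(observed)
    if total_observed == 0:
        raise ValueError("sum of observations must be non-zero")
    return sum(f - o for f, o in zip(forecast, observed)) / total_observed


def r_squared(forecast, observed):
    """Assumed definition: coefficient of determination, 1 - SS_res / SS_tot."""
    if len(forecast) != len(observed):
        raise ValueError("forecast and observed must have the same length")
    observed_mean = mean(observed)
    ss_res = sum((o - f) ** 2 for f, o in zip(forecast, observed))
    ss_tot = sum((o - observed_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observations must not all be identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical PM2.5 values in ug/m3, invented for illustration only.
observed = [35.0, 50.0, 80.0, 120.0]
raw_ctm = [50.0, 70.0, 110.0, 170.0]      # stand-in for a biased CTM forecast
corrected = [38.0, 48.0, 83.0, 116.0]     # stand-in for a bias-corrected forecast

for label, values in (("raw CTM", raw_ctm), ("corrected", corrected)):
    nmb = normalized_mean_bias(values, observed)
    r2 = r_squared(values, observed)
    print(f"{label}: NMB = {nmb:.3f}, R2 = {r2:.3f}")
```

Under these definitions, an NMB near 0 means that over- and under-predictions cancel on average, which is a separate property from a high R2; that is why the abstract reports both.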

Publication: AGU Fall Meeting Abstracts
Pub Date: December 2022
Bibcode: 2022AGUFM.A15O1439B
