Statistical Downscaling and Bias Correction of Surface Meteorological Variables on Reanalysis Datasets of IMDAA Using Machine Learning Approaches
Abstract
Initial and boundary conditions derived from global and regional reanalysis datasets introduce biases when numerical mesoscale models hindcast and forecast surface meteorological variables such as surface temperature, surface pressure, relative humidity, and wind speed (above 10 m), among others. Because these biases vary with the topography, spatial resolution, and synoptic conditions of a region, this study uses the NCMRWF Global reanalysis (NGFS) dataset over India, provided by the National Centre for Medium-Range Weather Forecasting, India, at a spatial resolution of 25 kilometers and six-hourly intervals from 1999 to 2021. The data are statistically downscaled to 12 kilometers at six-hourly intervals using machine learning (ML) models and then validated against the 12-kilometer reanalysis from the Indian Monsoon Data Assimilation and Analysis (IMDAA), averaged over six hours. Eight ML regression models, namely Ridge Regression (RR), Random Forest (RF), K-Nearest Neighbors (KNN), AdaBoost Regression (ABR), Generalized Linear Model (GLM), Extremely Randomized Tree (ERT), Deep Neural Network (DNN), and Gradient Boosting Machine (GBM), are used to statistically downscale the surface meteorological data at 55 observation stations located across India. The study focuses on the winter months (October to February), using 2005 to 2019 for model calibration and the winter of 2019 to 2020 for validation. GBM gave the best statistical downscaling performance of the eight models. For surface air temperature, it achieves an R² score of 0.85 and the lowest RMSE (4.02 °C), and it also performs best for the other meteorological variables.
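To make the calibrate-then-validate workflow concrete, here is a minimal sketch of gradient boosting for squared loss, scored with RMSE and R². It is not the study's implementation. The synthetic data, the two predictors (coarse-grid temperature and station elevation), the 6.5 °C/km lapse rate, and all function names are assumptions chosen for illustration. Decision stumps stand in for the deeper trees a production GBM would normally use.

```python
import math
import random


def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Splitting at every distinct value except the largest keeps both sides non-empty.
        for threshold in values[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    return best[1:]


def fit_gbm(X, y, n_rounds=30, learning_rate=0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        j, thr, left_mean, right_mean = fit_stump(X, residuals)
        stumps.append((j, thr, left_mean, right_mean))
        pred = [p + learning_rate * (left_mean if row[j] <= thr else right_mean)
                for row, p in zip(X, pred)]
    return base, learning_rate, stumps


def predict(model, X):
    base, learning_rate, stumps = model
    return [base + learning_rate * sum(lm if row[j] <= thr else rm
                                       for j, thr, lm, rm in stumps)
            for row in X]


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (NOT the NGFS/IMDAA data): coarse-grid temperature in deg C
# and station elevation in km as predictors, fine-scale temperature as the target.
rng = random.Random(42)


def make_samples(n):
    X, y = [], []
    for _ in range(n):
        coarse_temp = rng.uniform(5.0, 30.0)
        elevation_km = rng.uniform(0.0, 3.0)
        X.append((coarse_temp, elevation_km))
        y.append(coarse_temp - 6.5 * elevation_km + rng.gauss(0.0, 0.5))
    return X, y


X_cal, y_cal = make_samples(80)   # stands in for the calibration winters
X_val, y_val = make_samples(40)   # stands in for the held-out validation winter

model = fit_gbm(X_cal, y_cal)
y_hat = predict(model, X_val)

# Baseline: always predict the calibration-period mean.
baseline_rmse = rmse(y_val, [sum(y_cal) / len(y_cal)] * len(y_val))
val_rmse = rmse(y_val, y_hat)
val_r2 = r2_score(y_val, y_hat)
print(f"baseline RMSE={baseline_rmse:.2f}  GBM RMSE={val_rmse:.2f}  R2={val_r2:.2f}")
```

Scoring on a period that was held out of calibration mirrors the split described in the abstract. In practice a library implementation such as scikit-learn's `GradientBoostingRegressor` would likely replace the hand-written booster.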
We applied four bias correction techniques (basic quantile mapping, modified quantile mapping, scaled distribution mapping, and gamma mapping) to the GBM output after downscaling the meteorological variables to 12 km, in order to improve its forecasts. GBM with scaled distribution mapping raises the R² score to 0.91 and lowers the RMSE to 3.07 °C. We conclude that advanced ML models combined with bias correction techniques can improve forecasts of surface meteorological variables over the Indian region.
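As an illustration of the simplest of these techniques, the following sketch applies basic empirical quantile mapping: each model value is replaced by the reference value at the same quantile of the calibration distributions. It is a hedged example, not the study's code, and it does not implement the scaled distribution mapping that performed best. The Gaussian synthetic data, the 100-quantile resolution, the constant-shift treatment of the tails, and the function names are all assumptions.

```python
import bisect
import random
import statistics


def fit_quantile_map(model_cal, obs_cal, n_quantiles=100):
    """Estimate matching quantiles of the model and reference distributions."""
    model_q = statistics.quantiles(model_cal, n=n_quantiles, method="inclusive")
    obs_q = statistics.quantiles(obs_cal, n=n_quantiles, method="inclusive")
    return model_q, obs_q


def apply_quantile_map(mapping, values):
    """Map each value to the reference value at the same quantile."""
    model_q, obs_q = mapping
    corrected = []
    for x in values:
        if x <= model_q[0]:
            # Below the lowest fitted quantile: apply that quantile's shift (assumption).
            corrected.append(x + obs_q[0] - model_q[0])
        elif x >= model_q[-1]:
            # Above the highest fitted quantile: apply that quantile's shift (assumption).
            corrected.append(x + obs_q[-1] - model_q[-1])
        else:
            # Linear interpolation between the two bracketing quantiles.
            i = bisect.bisect_right(model_q, x)
            x0, x1 = model_q[i - 1], model_q[i]
            w = (x - x0) / (x1 - x0)
            corrected.append(obs_q[i - 1] + w * (obs_q[i] - obs_q[i - 1]))
    return corrected


# Synthetic stand-in data (NOT from the study): the "model" is 3 deg C too warm
# and has too much spread compared with the reference.
rng = random.Random(0)
obs_cal = [rng.gauss(20.0, 4.0) for _ in range(2000)]
model_cal = [rng.gauss(23.0, 6.0) for _ in range(2000)]
model_val = [rng.gauss(23.0, 6.0) for _ in range(2000)]

mapping = fit_quantile_map(model_cal, obs_cal)
corrected = apply_quantile_map(mapping, model_val)

obs_mean = statistics.fmean(obs_cal)
raw_bias = statistics.fmean(model_val) - obs_mean
corrected_bias = statistics.fmean(corrected) - obs_mean
print(f"raw bias={raw_bias:.2f}  corrected bias={corrected_bias:.2f}")
```

Because the mapping is fitted on a calibration sample and applied to a separate sample, the example corrects both the mean offset and the excess spread without seeing the values it corrects.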
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2022
- Bibcode: 2022AGUFM.A45A..71A