Future daily PM10 concentrations prediction by combining regression models and feedforward backpropagation models with principal component analysis (PCA)
Predicting future PM10 concentrations is important because it helps local authorities enact preventative measures that reduce the impact of air pollution. This study aims to improve the predictions of Multiple Linear Regression (MLR) and Feedforward Backpropagation (FFBP) models by combining each with principal component analysis (PCA) to predict future (next-day, next two-day and next three-day) PM10 concentrations in Negeri Sembilan, Malaysia. Hourly PM10 observations recorded in Negeri Sembilan from January 2003 to December 2010 were used. Eighty percent of the monitoring records were used to train the models and the remaining twenty percent to validate them. Model performance was evaluated with three accuracy measures, Prediction Accuracy (PA), Coefficient of Determination (R²) and Index of Agreement (IA), and two error measures, Normalized Absolute Error (NAE) and Root Mean Square Error (RMSE). The results show that combining PCA with MLR (PCA-MLR) and with FFBP (PCA-FFBP) improved on the stand-alone MLR and FFBP models at all three prediction horizons. Errors were reduced by as much as 18.1% (PCA-MLR) and 17.68% (PCA-FFBP) for next-day, 19.2% (PCA-MLR) and 22.1% (PCA-FFBP) for next two-day, and 18.7% (PCA-MLR) and 22.79% (PCA-FFBP) for next three-day predictions. Accuracy improved by as much as 12.9% (PCA-MLR) and 13.3% (PCA-FFBP) for next-day, 32.3% (PCA-MLR) and 14.7% (PCA-FFBP) for next two-day, and 46.1% (PCA-MLR) and 19.3% (PCA-FFBP) for next three-day predictions.
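As an illustration only, the PCA-MLR idea can be sketched in a few lines of Python with NumPy: standardize the predictors, project them onto the leading principal components, regress PM10 on the component scores, and score the predictions on a held-out 20% of the records. This sketch is not the study's implementation. The predictor matrix is synthetic placeholder data, the number of retained components, the chronological 80/20 split and the function names are assumptions, and RMSE, NAE and IA use common textbook forms that may differ from the exact definitions used in the study. The FFBP variant is not shown.

```python
import numpy as np


def rmse(obs, pred):
    # Root mean square error (population form, dividing by n: an assumption).
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def nae(obs, pred):
    # Normalized absolute error: total absolute error over total observed value.
    return float(np.sum(np.abs(pred - obs)) / np.sum(obs))


def index_of_agreement(obs, pred):
    # Willmott's index of agreement; lies in [0, 1], 1 means perfect agreement.
    obar = obs.mean()
    denom = np.sum((np.abs(pred - obar) + np.abs(obs - obar)) ** 2)
    return float(1.0 - np.sum((pred - obs) ** 2) / denom)


def fit_pca_mlr(X_train, y_train, n_components):
    # Standardize predictors using training statistics only.
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    Z = (X_train - mean) / std
    # PCA via SVD: the rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components].T  # shape (n_features, n_components)
    scores = Z @ components
    # Ordinary least squares on the component scores, with an intercept.
    A = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return {"mean": mean, "std": std, "components": components, "coef": coef}


def predict_pca_mlr(model, X):
    Z = (X - model["mean"]) / model["std"]
    scores = Z @ model["components"]
    return model["coef"][0] + scores @ model["coef"][1:]


def run_demo(seed=0, n=500, n_components=2):
    # Synthetic stand-in data: five correlated predictors driven by two
    # latent factors, and a positive target. These are NOT the study's data.
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, 2))
    mixing = rng.normal(size=(2, 5))
    X = latent @ mixing + 0.1 * rng.normal(size=(n, 5))
    y = 50.0 + 10.0 * latent[:, 0] - 5.0 * latent[:, 1] + 2.0 * rng.normal(size=n)

    # Chronological 80/20 split (assumed; the split scheme is not specified).
    cut = int(0.8 * n)
    model = fit_pca_mlr(X[:cut], y[:cut], n_components)
    pred = predict_pca_mlr(model, X[cut:])
    obs = y[cut:]
    return {
        "pred": pred,
        "rmse": rmse(obs, pred),
        "nae": nae(obs, pred),
        "ia": index_of_agreement(obs, pred),
    }


if __name__ == "__main__":
    result = run_demo()
    print({k: v for k, v in result.items() if k != "pred"})
```

Regressing on component scores rather than on the raw predictors removes the collinearity among correlated inputs, which is the usual motivation for placing PCA in front of a regression or neural network model. For a multi-day horizon, the same sketch would be fitted against a target shifted by two or three days.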