A two-point machine learning method for the spatial prediction of soil pollution
Abstract
Heavy metal pollution of soil is a worldwide problem. It is affected by many natural and human factors through heterogeneous relationships, so accurate prediction at unobserved locations from a limited number of observations remains a challenge. This study proposes a two-point machine learning method that uses the information in both spatial neighbors and high-dimensional covariates to improve prediction accuracy. The method models the difference between pairs of points, predicts concentration differences between observed and unobserved points, and uses those predicted differences for neighbor selection. This supervised learning method integrates both spatial autocorrelation and property similarity. A case study of soil Pb shows that the method can greatly improve prediction accuracy across different sample sizes; the improvement varies with sample size and tends to decrease as the sample size increases. Compared with ordinary kriging, kriging with external drift, random forest, and random forest-based regression kriging, the average improvements in RMSE are 1.49, 0.95, 0.93 and 0.62, respectively, and in MAE are 1.29, 1.17, 0.87 and 0.65, respectively. In the future, the method may be applied to the spatial prediction of other earth system variables, and the supervised learning method can be adjusted to new applications.
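The pairwise idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the learner, the pair features, or the neighbor-selection rule, so everything here is an assumption made for illustration. The sketch uses a single covariate, a least-squares line through the origin as a stand-in for the supervised learner, and the hypothetical function names fit_pair_model and predict_two_point.

```python
from itertools import permutations
from math import dist


def fit_pair_model(covs, values):
    """Fit (value difference) ~ slope * (covariate difference) over all ordered pairs.

    Stand-in learner (assumption): least squares through the origin.
    """
    num = den = 0.0
    for i, j in permutations(range(len(values)), 2):
        dx = covs[i] - covs[j]
        dy = values[i] - values[j]
        num += dx * dy
        den += dx * dx
    return num / den if den else 0.0


def predict_two_point(slope, coords, covs, values, new_coord, new_cov, k=3):
    """Predict the value at an unobserved point from pairwise differences."""
    candidates = []
    for xy, cov, value in zip(coords, covs, values):
        # Predicted difference between the unobserved point and this observation.
        diff = slope * (new_cov - cov)
        # Each observation yields one estimate: its own value plus the difference.
        candidates.append((abs(diff), dist(new_coord, xy), value + diff))
    # Neighbor selection (assumption): smallest predicted difference first,
    # with spatial distance as the tie-breaker.
    candidates.sort()
    chosen = candidates[:k]
    return sum(estimate for _, _, estimate in chosen) / len(chosen)
```

Calling fit_pair_model on the observed covariates and concentrations gives the pair model, and predict_two_point then averages the estimates from the k observations predicted to be most similar to the target. In a realistic setting the stand-in line would be replaced by a learner that accepts many covariates and spatial features for each pair.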
- Publication: International Journal of Applied Earth Observation and Geoinformation
- Pub Date: April 2022
- DOI: 10.1016/j.jag.2022.102742
- Bibcode: 2022IJAEO.10802742G
- Keywords: Two point machine learning; Spatial prediction; Spatial heterogeneity; Soil heavy metal