Upscaling soil organic carbon measurements at the continental scale by multivariate clustering analysis and machine learning
Abstract
Soil represents a large carbon (C) sink, containing approximately twice as much C as the atmosphere and plants globally. To better understand and model the fate of soil C under a changing climate, we need accurate estimates of soil organic carbon (SOC) stocks under the present climate. However, estimates of SOC stocks at regional and global scales disagree across existing gridded databases, indicating that significant uncertainties remain. To improve SOC estimation in the US, we upscaled site-based SOC measurements to the continental scale using spatial clustering approaches coupled with machine learning models. First, we used multivariate clustering (representativeness analysis) to segment the US at 30 arc second resolution based on environmental covariates (gNATSGO soil properties, WorldClim bioclimatic variables, MODIS biological variables, and physiographic variables). We then trained separate random forest model ensembles for each of the 10 clusters identified, using soil profile measurements from the International Soil Carbon Network (ISCN) at 0-30 cm and 0-100 cm depth (8171 and 5868 for 30 cm and 100 cm, respectively). Using this novel upscaling method, our estimated SOC for 0-30 cm (58.0 ± 2.8 Pg) is larger than the estimates from the Harmonized World Soil Database (HWSD; 53.7 Pg) and SoilGrids (45.7 Pg). Our estimated SOC for 0-100 cm (119.8 ± 7.6 Pg) is slightly higher than HWSD (108.3 Pg) but much lower than SoilGrids (190.4 Pg). Independent validation with soil profile data from the National Ecological Observatory Network (NEON; 597 and 227 for 30 cm and 100 cm, respectively) indicated that our estimates outperformed (R² = 0.42 and 0.40 for 30 cm and 100 cm) HWSD (R² = 0.10 and 0.15) and SoilGrids 2.0 (R² = 0.25 and 0.22) for both layers. Variable importance analysis indicated that climatic and biological variables are important in all clusters, whereas the most influential soil properties differ among clusters. For example, cation exchange capacity and soil erosion factor (kfactor) are important in southeast coastal areas, while soil texture (sand, silt, and clay percentage) and CaCO3 content play an important role in the Central US. This work has the potential to provide a robust SOC estimate at the continental scale that can inform terrestrial C cycle processes in Earth system models.
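The cluster-then-model workflow described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not name the clustering algorithm, so k-means from scikit-learn stands in for the multivariate clustering; a single random forest per cluster stands in for the model ensembles; the function names (`fit_cluster_models`, `predict_soc`) are hypothetical; and the data are synthetic random numbers, not gNATSGO, WorldClim, MODIS, or ISCN values.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler


def fit_cluster_models(grid_covariates, site_covariates, site_soc,
                       n_clusters=10, seed=0):
    """Cluster grid cells by covariates, then fit one random forest per cluster."""
    # Standardize so that no single covariate dominates the distance metric.
    scaler = StandardScaler().fit(grid_covariates)
    # Assumption: k-means stands in for the abstract's multivariate clustering.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    kmeans.fit(scaler.transform(grid_covariates))
    # Assign each measurement site to the cluster of its covariates.
    site_labels = kmeans.predict(scaler.transform(site_covariates))

    models = {}
    for k in range(n_clusters):
        mask = site_labels == k
        if mask.sum() < 5:
            # Too few sites to train on; this sketch simply skips the cluster.
            continue
        rf = RandomForestRegressor(n_estimators=100, random_state=seed)
        models[k] = rf.fit(site_covariates[mask], site_soc[mask])
    return scaler, kmeans, models


def predict_soc(scaler, kmeans, models, grid_covariates):
    """Predict SOC for each grid cell with the model of its own cluster."""
    labels = kmeans.predict(scaler.transform(grid_covariates))
    pred = np.full(len(grid_covariates), np.nan)  # NaN where no model exists
    for k, rf in models.items():
        mask = labels == k
        if mask.any():
            pred[mask] = rf.predict(grid_covariates[mask])
    return pred, labels


# Synthetic stand-in data: 4 covariates, 2000 grid cells, 400 measurement sites.
rng = np.random.default_rng(0)
grid = rng.normal(size=(2000, 4))
sites = rng.normal(size=(400, 4))
soc = 5.0 + 2.0 * sites[:, 0] - sites[:, 1] + rng.normal(scale=0.1, size=400)

scaler, kmeans, models = fit_cluster_models(grid, sites, soc, n_clusters=3)
pred, labels = predict_soc(scaler, kmeans, models, grid)
```

Fitting a separate model per cluster lets each region weight its covariates differently, which is consistent with the finding that the influential soil properties differ among clusters. Each fitted forest exposes `feature_importances_`, which could be compared across clusters in the same spirit.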
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2022
- Bibcode: 2022AGUFM.B15C..06W