Quantifying landscape-water quality nexus and predicting water quality under alternative planning scenarios using machine learning approaches
Abstract
Investigating the relationships between landscape characteristics and stream water quality is important for informing effective water quality management strategies. One difficulty in quantifying the landscape-water quality nexus is that conventional linear models cannot efficiently capture the non-linear behavior of ecosystems. Machine learning algorithms are therefore a promising way to reveal the complicated interactions between landscape characteristics and stream water quality.
The study site is the Texas Gulf Region, with a drainage area of 171,000 square miles. The region contains 1353 water quality monitoring stations, from which 2011 data on Total Suspended Solids (TSS), Dissolved Oxygen (DO), Total Phosphorus (TP), Nitrate (NO3⁻-N), and E. coli concentration were obtained. Landscape attributes and control variables, including land cover, landscape metrics, terrain variables, climatic variables, soil variables, and population variables, were used to predict stream water quality. First, the most influential factors on stream water quality were identified using lasso regression, and the spatially varying relationship between the key landscape factors and stream water quality was quantified through Geographically Weighted Regression (GWR). Second, random forest (RF) regression was used to predict stream water quality with all the relevant factors. Third, the trained RF prediction models were applied to evaluate stream water quality under four alternative planning scenarios in The Woodlands, Texas: sprawled low-density development, compact low-density development, sprawled high-density development, and compact high-density development. The results showed that the key factors affecting TP concentration were the presence of hydrologic soil group D, temperature, population, slope, the total area of shrub, and the percentages of medium-density development, planted, water, and forest areas. The R² of the TP GWR model was 0.54, with better model performance in the coastal suburban areas. RF regression with 34 landscape characteristics predicted TP concentration with an R² of 0.65 on the test set. The scenario analysis indicated that compact high-density development is recommended for protecting a healthy water environment.
- Publication:
- AGU Fall Meeting Abstracts
- Pub Date:
- December 2019
- Bibcode:
- 2019AGUFM.H33L2134W
- Keywords:
- 1847 Modeling; HYDROLOGY;
- 1873 Uncertainty assessment; HYDROLOGY;
- 1906 Computational models, algorithms; INFORMATICS;
- 1942 Machine learning; INFORMATICS
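
The three-step workflow described in the abstract (lasso screening, random forest regression, scenario prediction) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn, uses synthetic data in place of the station records, and every variable name, feature count, and scenario vector is hypothetical. The GWR step is omitted because it needs station coordinates and a dedicated spatial library.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the station table (hypothetical sizes, not the study data).
n_stations, n_features = 500, 10
X = rng.normal(size=(n_stations, n_features))
# Hypothetical response: non-linear in features 0 and 1, linear in feature 2.
y = (np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.5 * X[:, 2]
     + rng.normal(scale=0.1, size=n_stations))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 1: lasso with a cross-validated penalty on standardized predictors.
# Predictors with nonzero coefficients are treated as the influential ones.
lasso = make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0))
lasso.fit(X_train, y_train)
coefs = lasso.named_steps["lassocv"].coef_
selected = np.flatnonzero(coefs)

# Step 2: random forest regression on all predictors, scored by R^2
# on the held-out test set.
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)
r2_test = rf.score(X_test, y_test)

# Step 3: apply the trained forest to planning scenarios. Each scenario is a
# placeholder feature vector; a real analysis would derive these values from
# the land cover and landscape metrics of each plan.
scenario_names = [
    "sprawled_low_density",
    "compact_low_density",
    "sprawled_high_density",
    "compact_high_density",
]
scenarios = {name: rng.normal(size=(1, n_features)) for name in scenario_names}
predictions = {name: float(rf.predict(x)[0]) for name, x in scenarios.items()}

print("Predictors kept by lasso:", selected)
print("Random forest test R^2:", round(r2_test, 3))
for name, value in predictions.items():
    print(name, round(value, 3))
```

In this toy setup the lasso step can only pick up linear signal, while the forest can also fit the squared and sinusoidal terms, which mirrors the motivation given in the abstract for moving beyond linear models.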