Investigating Water Quality Data Using Principal Component Analysis and Granger Causality
Abstract
Water quality information is essential for protecting lives and managing water resources effectively, and obtaining it requires state-of-the-art collection procedures and in-depth analysis skills. The main goal of this work is to assess the efficiency of techniques such as Principal Component Analysis (PCA) and Granger causality for analyzing water quality indicators and recommending appropriate tools for water quality management. Daily water quality data (water temperature, dissolved oxygen, turbidity, specific conductivity) are collected for 10 watersheds across Virginia, the District of Columbia, and Maryland from the United States Geological Survey, together with climate information from the National Oceanic and Atmospheric Administration and watershed characteristics from the United States Department of Agriculture. First, extensive pre-processing is applied to the collected data to remove any cyclic pattern and ensure stationarity. A PCA is then performed to assess linear relationships among the water quality indicators and hydrometeorological variables, such as precipitation and streamflow. Lastly, the Granger causality test is performed to assess whether one variable at time t - lag Granger-causes another variable at time t, that is, whether its past values help predict the other variable. The percentage of times that any of the variables Granger-caused one particular water quality indicator (defined as the Granger Causality Factor, GCF) is calculated for each watershed and investigated as a function of basin size, using three lags (1, 2, and 3 days). The PCA results confirm what was expected: dissolved oxygen and water temperature are highly and negatively correlated, while precipitation, streamflow, and turbidity are positively correlated. The Granger causality analysis shows large variability in GCF among the watersheds. Moreover, GCF is highest in small watersheds, which are characterized by the largest percentage of urban area and soil type C (i.e., slow infiltration). In contrast, GCF is lowest in medium watersheds, where urban areas and soil type C are limited. PCA and Granger causality have the potential to improve our understanding of water quality indicators and the role of major hydroclimatic variables in affecting their behavior, which is crucial for developing effective methods for water quality protection.
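The Granger causality step and the GCF described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes the series have already been deseasonalized and made stationary, uses a standard F-test that compares an autoregression of the target with and without lagged values of the predictor, and takes the GCF to be the percentage of (predictor, lag) tests that are significant at a hypothetical threshold `alpha = 0.05`. The function names, the threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def granger_p_value(target, predictor, lag):
    """p-value of the F-test that `lag` past values of `predictor` add nothing
    to an autoregression of `target` on its own `lag` past values."""
    y = np.asarray(target, dtype=float)
    x = np.asarray(predictor, dtype=float)
    n = len(y) - lag
    # Column k (k = 1..lag) holds the series shifted back by k days.
    y_lags = np.column_stack([y[lag - k:len(y) - k] for k in range(1, lag + 1)])
    x_lags = np.column_stack([x[lag - k:len(x) - k] for k in range(1, lag + 1)])
    y_now = y[lag:]
    const = np.ones((n, 1))
    restricted = np.hstack([const, y_lags])            # target history only
    unrestricted = np.hstack([const, y_lags, x_lags])  # plus predictor history

    def ssr(design):
        # Sum of squared residuals of an ordinary least-squares fit.
        coef, *_ = np.linalg.lstsq(design, y_now, rcond=None)
        resid = y_now - design @ coef
        return float(resid @ resid)

    ssr_r, ssr_u = ssr(restricted), ssr(unrestricted)
    df_resid = n - unrestricted.shape[1]
    f_stat = ((ssr_r - ssr_u) / lag) / (ssr_u / df_resid)
    return float(stats.f.sf(f_stat, lag, df_resid))


def granger_causality_factor(series, target, lags=(1, 2, 3), alpha=0.05):
    """Percentage of (predictor, lag) tests in which a predictor
    Granger-causes `target` (assumed definition, see the text above)."""
    predictors = [name for name in series if name != target]
    p_values = [granger_p_value(series[target], series[name], lag)
                for name in predictors for lag in lags]
    return 100.0 * float(np.mean([p < alpha for p in p_values]))


# Synthetic daily series: turbidity responds to the previous day's precipitation,
# while water temperature is unrelated noise.
rng = np.random.default_rng(0)
n_days = 500
precipitation = rng.normal(size=n_days)
water_temperature = rng.normal(size=n_days)
turbidity = 0.5 * rng.normal(size=n_days)
turbidity[1:] += 0.8 * precipitation[:-1]

series = {
    "precipitation": precipitation,
    "water_temperature": water_temperature,
    "turbidity": turbidity,
}
gcf = granger_causality_factor(series, target="turbidity")
print(f"GCF for turbidity: {gcf:.1f}%")
```

In a real analysis, the synthetic arrays would be replaced by the pre-processed daily records for one watershed, and the GCF would be computed once per water quality indicator before comparing watersheds by basin size.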
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2020
- Bibcode: 2020AGUFMH116.0012Z
- Keywords:
  - 0470 Nutrients and nutrient cycling; BIOGEOSCIENCES
  - 1831 Groundwater quality; HYDROLOGY
  - 1871 Surface water quality; HYDROLOGY
  - 1879 Watershed; HYDROLOGY