Temporal Context Transformers for Multi-Horizon Prediction of Solute Concentration Responses
Abstract
Solute transport is often studied by analyzing the response of solute concentration to changes in discharge. Instead of a descriptive classification of the concentration-discharge relationship, we propose a deep learning model for predictive forecasting of solute concentration responses to changes in discharge. We train multi-horizon forecasting models that accurately predict far into the future, representing complex streamflow processes. We compare the efficacy of different general-purpose machine learning models, such as Long Short-Term Memory networks (LSTM) (Hochreiter & Schmidhuber, 1997) and the Transformer architecture (Vaswani et al., 2017), with a new Temporal Context model. The models are used to automatically analyze high-frequency water-quality data sets obtained at a watershed observatory in New Hampshire (USA). Existing architectures have known issues. LSTMs often fail to incorporate long-term information because of vanishing gradients during training. Transformer models are based on the "attention mechanism", which identifies other data points that exhibit a similar response as the current time point and incorporates their observed responses into the forecasting prediction. The problem is that Transformers only consider individual data points, which can be misleading in time series data from sensor networks. For example, similar discharge values can be associated with different solute concentrations (hysteresis). While multiple layers in an architecture can potentially remedy this, we propose a simpler and more effective approach. We propose the Temporal Context Transformer model, a new deep learning model that represents information from the temporal context (a sequence of time points) in a latent vector space. This representation feeds into the similarity function of the attention mechanism to generate more resilient solute concentration predictions based on the discharge values.
The optimal size of the context, as well as its latent representation, is learned end-to-end to maximize the accuracy of the forecasted solute concentrations. Using separate datasets for training and evaluation, we empirically demonstrate that our temporal context learning approach significantly outperforms the other models across several sites in our dataset.
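The core idea above (attention over latent encodings of temporal context windows rather than over individual data points) can be illustrated with a minimal sketch. This is not the authors' implementation; the window length, the linear encoder `W`, and the function name `context_attention` are all illustrative assumptions, and a single dot-product attention step stands in for the full multi-horizon architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def context_attention(discharge, concentration, window, W):
    """Toy temporal-context attention for one prediction step.

    Each time step is represented by a latent encoding of its preceding
    `window` discharge values (here a simple linear map W, standing in
    for a learned encoder). Similarities between these context encodings,
    rather than between single discharge values, weight the observed
    concentrations at past steps to predict the concentration at the
    final step. This is how a windowed context can disambiguate
    hysteresis: two identical discharge values with different recent
    histories get different latent representations.
    """
    T = len(discharge)
    # Sliding context windows of the discharge series, one per time step.
    contexts = np.stack([discharge[t - window:t] for t in range(window, T)])
    latent = contexts @ W        # (T - window, d) latent context vectors
    query = latent[-1]           # context of the step we want to predict
    keys = latent[:-1]           # contexts of earlier, observed steps
    weights = softmax(keys @ query)          # attention over past steps
    targets = concentration[window:T - 1]    # observed past responses
    return float(weights @ targets)          # convex combination

# Usage on synthetic data (illustrative only):
rng = np.random.default_rng(0)
q = rng.random(20)               # synthetic discharge series
c = rng.random(20)               # synthetic concentration series
W = rng.random((4, 3))           # stand-in for a learned encoder
pred = context_attention(q, c, window=4, W=W)
```

Because the attention weights form a convex combination, the prediction always lies within the range of the observed past concentrations; in the full model, the encoder replacing `W` and the context size are trained end-to-end.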
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2021
- Bibcode: 2021AGUFM.H25A1063K