Massive feature extraction for explaining and foretelling hydroclimatic time series forecastability at the global scale
Statistical analyses and descriptive characterizations are sometimes assumed to offer information on time series forecastability. Despite the scientific interest of such assumptions, the relationships between descriptive time series features (e.g., temporal dependence, entropy, seasonality, trend and nonlinearity features) and actual time series forecastability (quantified by issuing and assessing forecasts for the past) have rarely been studied and quantified in the literature. In this work, we aim to fill this gap by investigating such relationships and the ways in which they can be exploited for understanding hydroclimatic forecastability. To this end, we follow a systematic framework that brings together a variety of concepts and methods, most of them new to hydrology, including 57 descriptive features. We apply this framework to three global datasets. As these datasets comprise over 13 000 monthly temperature, precipitation and river flow time series from several continents and hydroclimatic regimes, they allow us to provide trustworthy characterizations and interpretations of 12-month-ahead hydroclimatic forecastability at the global scale. We find that this forecastability, measured in terms of the Nash-Sutcliffe efficiency, is strongly related to several descriptive features. We further (i) show that, if such descriptive information is available for a time series, we can even foretell the quality of its future forecasts with a considerable degree of confidence, and (ii) rank the features according to their efficiency in inferring and foretelling forecastability. Our experiments also reveal spatial forecastability patterns. We show that a comprehensive interpretation of such patterns is possible through massive feature extraction and feature-based time series clustering.
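For readers unfamiliar with the forecastability measure named above, the following is a minimal sketch of how a Nash-Sutcliffe efficiency could be computed for a 12-month-ahead forecast. It assumes the standard definition of the metric (one minus the ratio of the squared forecast error to the squared deviation of the observations from their mean); the abstract itself does not spell out the formula. The function name, the toy data and the seasonal naive forecast are hypothetical illustrations, not the forecasting methods or datasets of this work.

```python
def nash_sutcliffe_efficiency(observed, forecast):
    """Return the Nash-Sutcliffe efficiency (standard definition, assumed here).

    A value of 1 indicates a perfect forecast; a value of 0 indicates a forecast
    that is only as good as the mean of the observations.
    """
    observed = [float(x) for x in observed]
    forecast = [float(x) for x in forecast]
    if len(observed) != len(forecast) or not observed:
        raise ValueError("observed and forecast must be non-empty and of equal length")

    mean_observed = sum(observed) / len(observed)
    # Sum of squared forecast errors.
    squared_error = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    # Sum of squared deviations of the observations from their mean.
    squared_deviation = sum((o - mean_observed) ** 2 for o in observed)
    if squared_deviation == 0.0:
        raise ValueError("observations are constant; the efficiency is undefined")
    return 1.0 - squared_error / squared_deviation


# Hypothetical toy example: two years of monthly values (not real data).
previous_year = [2.0, 3.0, 6.0, 10.0, 15.0, 19.0, 22.0, 21.0, 17.0, 11.0, 6.0, 3.0]
test_year = [1.0, 4.0, 7.0, 11.0, 14.0, 20.0, 23.0, 20.0, 16.0, 12.0, 5.0, 2.0]

# Seasonal naive 12-month-ahead forecast: repeat the last observed year.
seasonal_naive_forecast = list(previous_year)

nse = nash_sutcliffe_efficiency(test_year, seasonal_naive_forecast)
print(f"NSE of the seasonal naive forecast: {nse:.3f}")
```

Computing such a score for each series in a large collection would yield one forecastability value per series, which could then be related to the descriptive features of that series.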