Integrating K-means Clustering with Kernel Density Estimation for the Development of a Conditional Weather Generation Downscaling Model
Abstract
In previous decades, the climate change caused by global warming increases the occurrence frequency of extreme hydrological events. Water supply shortages caused by extreme events create great challenges for water resource management. To evaluate future climate variations, general circulation models (GCMs) are the most wildly known tools which shows possible weather conditions under pre-defined CO2 emission scenarios announced by IPCC. Because the study area of GCMs is the entire earth, the grid sizes of GCMs are much larger than the basin scale. To overcome the gap, a statistic downscaling technique can transform the regional scale weather factors into basin scale precipitations. The statistic downscaling technique can be divided into three categories include transfer function, weather generator and weather type. The first two categories describe the relationships between the weather factors and precipitations respectively based on deterministic algorithms, such as linear or nonlinear regression and ANN, and stochastic approaches, such as Markov chain theory and statistical distributions. In the weather type, the method has ability to cluster weather factors, which are high dimensional and continuous variables, into weather types, which are limited number of discrete states. In this study, the proposed downscaling model integrates the weather type, using the K-means clustering algorithm, and the weather generator, using the kernel density estimation. The study area is Shihmen basin in northern of Taiwan. In this study, the research process contains two steps, a calibration step and a synthesis step. Three sub-steps were used in the calibration step. First, weather factors, such as pressures, humidities and wind speeds, obtained from NCEP and the precipitations observed from rainfall stations were collected for downscaling. Second, the K-means clustering grouped the weather factors into four weather types. Third, the Markov chain transition matrixes and the conditional probability density function (PDF) of precipitations approximated by the kernel density estimation are calculated respectively for each weather types. In the synthesis step, 100 patterns of synthesis data are generated. First, the weather type of the n-th day are determined by the results of K-means clustering. The associated transition matrix and PDF of the weather type were also determined for the usage of the next sub-step in the synthesis process. Second, the precipitation condition, dry or wet, can be synthesized basing on the transition matrix. If the synthesized condition is dry, the quantity of precipitation is zero; otherwise, the quantity should be further determined in the third sub-step. Third, the quantity of the synthesized precipitation is assigned as the random variable of the PDF defined above. The synthesis efficiency compares the gap of the monthly mean curves and monthly standard deviation curves between the historical precipitation data and the 100 patterns of synthesis data.
- Publication:
-
AGU Fall Meeting Abstracts
- Pub Date:
- December 2011
- Bibcode:
- 2011AGUFMGC23C0964C
- Keywords:
-
- 1637 GLOBAL CHANGE / Regional climate change;
- 1854 HYDROLOGY / Precipitation