Mining User Spatiotemporal Behavior in Geospatial Cyberinfrastructure: Using GEOSS Clearinghouse as an Example
Abstract
Big Data is becoming increasingly important in almost all scientific domains, especially in geoscience, where hundreds to millions of sensors continuously collect data about the Earth (Whitehouse 2012). With this explosive growth of data, various Geospatial Cyberinfrastructure (GCI) (Yang et al. 2010) components have been developed to manage geospatial resources and provide public data access. These GCIs are accessed intensively by different users on a daily basis. However, little research has analyzed the spatiotemporal patterns of user behavior, which could be critical to the management of Big Data and the operation of GCIs (Yang et al. 2011). For example, the spatiotemporal distribution of end users can guide where GCI computing facilities are located, and the spatiotemporal pattern of user queries can inform better indexing and caching mechanisms. In this work, we use the GEOSS Clearinghouse as an example to investigate spatiotemporal patterns of user behavior in GCIs. The results show that user behavior is heterogeneous but follows patterns across space and time. The identified patterns include (1) regions of high access frequency; (2) local interests; (3) periodic accesses and rush hours; and (4) access spikes. Based on these patterns, this presentation reports several solutions to better support the operation of the GEOSS Clearinghouse and other GCIs.

Keywords: Big Data, EarthCube, CyberGIS, Spatiotemporal Thinking and Computing, Data Mining, User Behavior

References:
Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P., & Uthurusamy, R. 1996. Advances in knowledge discovery and data mining.
Whitehouse. 2012. Obama administration unveils 'BIG DATA' initiative: announces $200 million in new R&D investments. Whitehouse. Retrieved from http://www.whitehouse.gov/sites/default/files/microsites/ostp/big_data_press_release_final_2.pdf [Accessed 14 June 2013]
Yang, C., Wu, H., Huang, Q., Li, Z., & Li, J. 2011. Using spatial principles to optimize distributed computing for enabling the physical science discoveries. Proceedings of the National Academy of Sciences, 108(14), 5498-5503. doi:10.1073/pnas.0909315108
Yang, C., Raskin, R., Goodchild, M., & Gahegan, M. 2010. Geospatial Cyberinfrastructure: Past, present and future. Computers, Environment and Urban Systems, 34(4), 264-277. doi:10.1016/j.compenvurbsys.2010.04.001
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2013
- Bibcode: 2013AGUFMIN53B1567X
- Keywords: 1928 INFORMATICS GIS science; 1914 INFORMATICS Data mining