Optimizing an Index with Spatiotemporal Patterns to Support GEOSS Clearinghouse

Date

2013-02-18

Authors

Xia, Jizhe

Abstract

Big Data is becoming increasingly important in many scientific domains, especially in geographic science, where hundreds to millions of sensors continuously collect data about the Earth (Whitehouse News 2012). The data are managed and served through various Geospatial Cyberinfrastructure (GCI) components worldwide, and many GCI components have also been developed to help discover and utilize these geographically dispersed data. In the Internet era, users expect discovery requests to be answered within seconds, and designing an index that can deliver this is a big challenge. For example, the R-tree (Guttman 1984) leverages spatial relationships among features and is widely used in spatial Database Management Systems (DBMSs), and R-tree variants have been proposed to 1) improve data retrieval performance, 2) support temporal indexing, and 3) utilize multiple computers for indexing. However, it remains hard to meet the expectation of responses within seconds because little research has incorporated the spatiotemporal patterns of user queries. Traditionally, user behavior has rarely been considered in a spatial index, and a single index is used to support all users from different regions at different times. I propose a Predefined Multiple Indices Mechanism (PMIM) to support global user queries by predefining different indices for different categories of users who have similar query patterns. An Access Possibility R-tree (APR-tree) is proposed to build an index based on the spatiotemporal patterns of user queries. The new spatiotemporal indexing strategy provides a potential solution for indexing big spatial data and enabling responses within seconds for global users. Using metadata in the GEOSS Clearinghouse as an example, I conducted a series of performance experiments for PMIM implemented with the APR-tree. The experimental results indicate that the new indexing mechanism outperforms a regular R*-tree.
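The idea of keeping one predefined index per category of users can be illustrated with a minimal Python sketch. It is not the PMIM or APR-tree implementation described in the abstract: the category key (region plus a day/night slot), the `MultiIndex` class, and the use of a frequency-ordered list in place of a real R-tree are all assumptions made purely for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    rid: str
    bbox: tuple  # (min_x, min_y, max_x, max_y)


def intersects(a, b):
    """True if two axis-aligned bounding boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def user_category(region, hour):
    """Hypothetical category key: user region plus a coarse time-of-day slot."""
    return (region, "day" if 6 <= hour < 18 else "night")


class MultiIndex:
    """One access-ordered candidate list per user category (stand-in for a tree)."""

    def __init__(self, records):
        self.records = list(records)
        self.hits = defaultdict(Counter)  # category -> record id -> access count
        self.order = {}                   # category -> records, most accessed first

    def log_access(self, category, rid):
        self.hits[category][rid] += 1

    def rebuild(self):
        # Predefine one ordering per category from its own access history.
        for category, counts in self.hits.items():
            self.order[category] = sorted(
                self.records, key=lambda r: -counts[r.rid]
            )

    def query(self, category, window, limit=10):
        # Unknown categories fall back to the single shared ordering.
        candidates = self.order.get(category, self.records)
        out = []
        for rec in candidates:
            if intersects(rec.bbox, window):
                out.append(rec.rid)
                if len(out) == limit:
                    break
        return out
```

In this sketch, a query from a category with recorded history scans frequently accessed records first, so a small `limit` can be satisfied early, while a category without history falls back to the shared default ordering.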

Keywords

Spatiotemporal, R-tree, Index, Big data, User behavior
