Optimizing Access to Big Earth Observation Data with Spatiotemporal Patterns -- An Example with the GEOSS Clearinghouse




Xia, Jizhe




Big Data is becoming increasingly important in almost all scientific domains, especially in geographical studies, where millions to billions of sensors continuously collect data about the Earth. Recognizing the importance of managing Big Earth observation Data, the Group on Earth Observations selected the Global Earth Observation System of Systems (GEOSS) Clearinghouse (CLH) to harvest, manage, and share Earth observation metadata. Building a CLH that supports global operation is very challenging, because the CLH must effectively manage and index Big Earth observation Data, provide accurate data service evaluation, and run these services on rapidly provisioned computing resources at different space and time locations to support dynamic global user access. Although various optimization mechanisms (e.g., index, workload balancing, service model, cache) have been proposed, few approaches optimize Earth observation data access using the spatiotemporal patterns of data utilization. This dissertation investigates a variety of spatiotemporal optimizations to better support Big Earth observation Data access, using the CLH as an example. Specifically, the objectives are the following: (1) develop a new indexing mechanism to accelerate Big Data access. The new indexing mechanism integrates spatiotemporal user access patterns into traditional index structures. Experimental results showed that the new index yields a 9-20% performance gain in data access compared with a classic R*-tree index; (2) develop a new service performance model to improve service evaluation accuracy. The new service model collects globally distributed service information with cloud services and volunteers, and integrates spatiotemporal service characteristics to provide evaluations to end users at different space-time locations. The proposed spatiotemporal service model yields a 3-18% improvement in accuracy, thereby helping end users choose better services for data access; and (3) develop a cloud computing adoption framework to better support global user access and spiking access. The cloud framework automatically provisions and delivers computing resources for different data access tasks according to their spatiotemporal computing workloads, and globally deploys system instances to different regions. Experimental results showed that the cloud framework gives the CLH a performance gain of about 10 seconds for global and spiking user access. The significance of this research is that it provides a potential solution for optimizing access to Big Earth observation Data using spatiotemporal data utilization patterns, thereby better supporting various Big Data related studies with faster data access.



Geographic information science and geodesy, Computer science, Geography, Big Data, Cloud Computing, GEOSS, Global Access, Spatial Index, Spatiotemporal Optimization