Title: Optimizing and accelerating space-time Ripley's K function based on Apache Spark for distributed spatiotemporal point pattern analysis

Abstract: As point of interest (POI) datasets with fine-grained spatial and temporal attributes become increasingly available, the space-time Ripley's K function has come to be regarded as a powerful approach for analyzing spatiotemporal point processes. However, the space-time Ripley's K function is computationally intensive because of its point-wise distance comparisons, edge correction, and simulations for significance testing. Parallel computing technologies such as OpenMP, MPI and CUDA have been leveraged to accelerate the K function, and related experiments have demonstrated substantial acceleration. Nevertheless, previous works have not extended the optimization of Ripley's K function from the spatial dimension to the space-time dimension. Without sophisticated spatiotemporal query and partitioning mechanisms, the extra computational overhead can be problematic. Moreover, these studies were limited by the restricted scalability and relatively high programming cost of the parallel frameworks, which impeded their application to large POI datasets and to variations of Ripley's K function. This paper presents a distributed computing method that accelerates the space-time Ripley's K function on the state-of-the-art distributed computing framework Apache Spark; four strategies are adopted to simplify the calculation procedures and to accelerate the distributed computation. Based on the optimized method, a prototype web-based visual analytics framework has been developed. Experiments demonstrate the feasibility and time efficiency of the proposed method, as well as its value in promoting applications of the space-time Ripley's K function in ecology, geography, sociology, economics, urban transportation and other fields.
Comments: 35 pages, 23 figures, Future Generation Computer Systems
Subjects: Computation (stat.CO); Computational Geometry (cs.CG); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Software Engineering (cs.SE)
Journal reference: Future Generation Computer Systems, 2020
DOI: 10.1016/j.future.2019.11.036
Cite as: arXiv:1912.04753 [stat.CO]
  (or arXiv:1912.04753v1 [stat.CO] for this version)
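
To illustrate why the abstract calls the point-wise distance comparisons computationally intensive, the following is a minimal sketch of a naive space-time K estimate in plain Python. It is not the paper's Spark-based method: the function name `space_time_k`, the `(x, y, time)` tuple layout, and the normalization by n squared are assumptions made for illustration, and edge correction and significance simulations are deliberately left out.

```python
import math


def space_time_k(points, s, t, area, duration):
    """Naive space-time Ripley's K estimate, without edge correction.

    points   : list of (x, y, time) tuples (hypothetical layout)
    s, t     : spatial and temporal distance thresholds
    area     : area of the study region
    duration : length of the study period
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    count = 0
    # Every ordered pair (i, j) with i != j is compared: O(n^2) work.
    for i in range(n):
        xi, yi, ti = points[i]
        for j in range(n):
            if i == j:
                continue
            xj, yj, tj = points[j]
            close_in_space = math.hypot(xi - xj, yi - yj) <= s
            close_in_time = abs(ti - tj) <= t
            if close_in_space and close_in_time:
                count += 1

    # Assumed normalization: area * duration / n^2 (no edge weights).
    return area * duration * count / (n * n)


if __name__ == "__main__":
    pts = [(0, 0, 0), (1, 0, 0), (0, 0, 5), (10, 10, 10)]
    print(space_time_k(pts, s=1.5, t=1, area=100, duration=10))
```

The double loop visits every ordered pair of points for each choice of `s` and `t`, and a significance test would repeat the whole computation for every simulated pattern, which is the cost that motivates spatiotemporal partitioning and distributed execution.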

Submission history

From: Zhipeng Gui
[v1] Tue, 10 Dec 2019 15:15:37 GMT (1995kb)