Title: Relationship-aware Multivariate Sampling Strategy for Scientific Simulation Data

Abstract: As the computational power of supercomputers increases, the size of the data produced by scientific simulations is growing rapidly. To reduce the storage footprint and facilitate scalable post-hoc analyses of such scientific data sets, various data reduction/summarization methods have been proposed over the years. Several families of sampling algorithms reduce high-resolution scientific data while preserving the data properties required for subsequent analyses. However, most of these sampling algorithms are designed for univariate data and cater to post-hoc analyses of single variables. In this work, we propose a multivariate sampling strategy that preserves the original variable relationships and enables different multivariate analyses directly on the sampled data. Our proposed strategy uses principal component analysis to capture the variance of multivariate data and can be built on top of any existing state-of-the-art sampling algorithm for single variables. In addition, we propose variants of different data partitioning schemes (regular and irregular) to efficiently model the local multivariate relationships. Using two real-world multivariate data sets, we demonstrate the efficacy of our proposed multivariate sampling strategy with respect to its data reduction capabilities as well as the ease of performing efficient post-hoc multivariate analyses.
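The abstract describes the approach only at a high level, so the following is a minimal illustrative sketch rather than the authors' implementation. It assumes that each partition of the data is reduced to its scores along the first principal component, and that those scores are then passed to a univariate sampler. The histogram-based sampler, the contiguous 1D "regular" partitioning, and all function names (`pc1_scores`, `rarity_weighted_sample`, `sample_multivariate`) are hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def pc1_scores(data):
    """Project (n_points, n_vars) data onto its first principal component."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]


def rarity_weighted_sample(values, fraction, bins=32, rng=None):
    """Hypothetical univariate sampler: points in sparsely populated
    histogram bins are more likely to be kept. Returns selected indices."""
    rng = np.random.default_rng(rng)
    edges = np.histogram_bin_edges(values, bins=bins)
    # Assign each value to a bin using the interior edges only.
    bin_idx = np.digitize(values, edges[1:-1])
    counts = np.bincount(bin_idx, minlength=bins)
    weights = 1.0 / counts[bin_idx]
    weights /= weights.sum()
    n_keep = max(1, int(round(fraction * values.size)))
    return rng.choice(values.size, size=n_keep, replace=False, p=weights)


def sample_multivariate(data, fraction, n_blocks=4, rng=None):
    """Assumed workflow: split the points into contiguous blocks (a 1D
    stand-in for a regular partitioning), run PCA per block, and sample
    each block on its PC1 scores. Returns sorted global indices."""
    rng = np.random.default_rng(rng)
    kept = []
    for block in np.array_split(np.arange(data.shape[0]), n_blocks):
        scores = pc1_scores(data[block])
        local = rarity_weighted_sample(scores, fraction, rng=rng)
        kept.append(block[local])
    return np.sort(np.concatenate(kept))


if __name__ == "__main__":
    gen = np.random.default_rng(0)
    x = gen.normal(size=1000)
    # Three correlated synthetic variables (not data from the paper).
    data = np.column_stack([x, 2 * x + gen.normal(scale=0.1, size=1000), -x])
    idx = sample_multivariate(data, fraction=0.1, rng=1)
    print(idx.size, "of", data.shape[0], "points kept")
```

Because all variables at a selected point are kept together, the sampled rows `data[idx]` can be used directly for multivariate analyses such as correlation estimates; how well relationships are preserved depends on the sampler and partitioning actually used.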
Comments: To appear as an IEEE VIS 2020 short paper
Subjects: Machine Learning (cs.LG); Graphics (cs.GR); Human-Computer Interaction (cs.HC); Machine Learning (stat.ML)
DOI: 10.1109/VIS47514.2020.00015
Report number: 2020 IEEE Visualization Conference (VIS)
Cite as: arXiv:2008.13306 [cs.LG]
  (or arXiv:2008.13306v1 [cs.LG] for this version)

Submission history

From: Subhashis Hazarika
[v1] Mon, 31 Aug 2020 00:52:17 GMT (4088kb,D)