Title: Kernel Two-Sample Tests in High Dimension: Interplay Between Moment Discrepancy and Dimension-and-Sample Orders

Abstract: Motivated by the increasing use of kernel-based metrics for high-dimensional and large-scale data, we study the asymptotic behavior of kernel two-sample tests when the dimension and the sample sizes both diverge to infinity. We focus on the maximum mean discrepancy (MMD) with an isotropic kernel, which includes MMD with the Gaussian kernel, MMD with the Laplace kernel, and the energy distance as special cases. We derive asymptotic expansions of the kernel two-sample statistics and use them to establish central limit theorems (CLTs) under the null hypothesis and under both local and fixed alternatives. The new non-null CLT results allow us to perform an asymptotically exact power analysis, which reveals a delicate interplay between the moment discrepancy that the kernel two-sample tests can detect and the dimension-and-sample orders. The asymptotic theory is further corroborated through numerical studies.
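For readers unfamiliar with the statistic the abstract refers to, the following is a minimal sketch of the standard unbiased estimator of the squared MMD with a Gaussian kernel. It is not taken from the paper: the function names, the bandwidth choice `sqrt(d)`, the sample sizes, and the mean-shift alternative are all illustrative assumptions, and the paper's own normalization and test procedure may differ.

```python
import math
import random


def gaussian_kernel(u, v, bandwidth):
    """Gaussian (isotropic) kernel exp(-||u - v||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth):
    """Unbiased U-statistic estimate of the squared MMD between two samples.

    Within-sample sums skip the diagonal terms (i == j); the cross-sample
    sum uses all pairs.
    """
    n, m = len(xs), len(ys)
    sum_xx = sum(gaussian_kernel(xs[i], xs[j], bandwidth)
                 for i in range(n) for j in range(n) if i != j)
    sum_yy = sum(gaussian_kernel(ys[i], ys[j], bandwidth)
                 for i in range(m) for j in range(m) if i != j)
    sum_xy = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return (sum_xx / (n * (n - 1))
            + sum_yy / (m * (m - 1))
            - 2.0 * sum_xy / (n * m))


def draw_sample(rng, size, dim, shift=0.0):
    """Draw `size` points from N(shift, I_dim); an illustrative toy model."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(size)]


# Illustrative settings (assumptions, not values from the paper).
rng = random.Random(0)
dim, size = 20, 30
bandwidth = math.sqrt(dim)  # assumed heuristic: bandwidth grows like sqrt(d)

x = draw_sample(rng, size, dim)
y = draw_sample(rng, size, dim)             # same distribution as x
z = draw_sample(rng, size, dim, shift=1.0)  # mean-shifted alternative

null_stat = mmd2_unbiased(x, y, bandwidth)
shift_stat = mmd2_unbiased(x, z, bandwidth)
print("null:", null_stat, "shifted:", shift_stat)
```

Under the null the estimate fluctuates around zero and can be slightly negative, which is a known property of the unbiased estimator. Under the mean shift it is clearly positive.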
Comments: Revised version (results refined in Section 3.4, contributions highlighted, more discussions provided, additional simulations conducted, and minor changes made throughout)
Subjects: Statistics Theory (math.ST); Machine Learning (stat.ML)
DOI: 10.1093/biomet/asac049
Cite as: arXiv:2201.00073 [math.ST]
  (or arXiv:2201.00073v2 [math.ST] for this version)

Submission history

From: Jian Yan
[v1] Fri, 31 Dec 2021 23:12:44 GMT (74kb,D)
[v2] Thu, 12 May 2022 03:29:07 GMT (122kb)