Title: Generalized k-Means in GLMs with Applications to the Outbreak of COVID-19 in the United States

Abstract: Generalized $k$-means can be combined with any similarity or dissimilarity measure for clustering. Taking the dissimilarity measure to be the well-known likelihood ratio or $F$-statistic, this work proposes a method based on generalized $k$-means for grouping statistical models. When the number of clusters $k$ is given, the method is formulated in terms of hypothesis tests between statistical models. When $k$ is unknown, the method can be combined with the generalized information criterion (GIC) to select the best $k$ automatically. The article investigates AIC and BIC as special cases. Theoretical and simulation results show that the number of clusters can be identified by BIC but not by AIC. The resulting method for GLMs is used to group state-level time series patterns of the COVID-19 outbreak in the United States. A further study shows that the statistical models of different clusters differ significantly from one another, which confirms the result given by the proposed method.
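As a rough illustration of the idea in the abstract, the following minimal sketch clusters count series with a likelihood-ratio dissimilarity and picks $k$ by BIC. It is not the paper's implementation. It assumes the simplest possible GLM (an intercept-only Poisson model, one rate per cluster), a deterministic start based on sorted unit means, and the textbook form BIC = -2 log-likelihood + (number of parameters) x log(number of observations). All function names and the toy data are hypothetical.

```python
import math

def poisson_loglik(counts, rate):
    """Poisson log-likelihood at a common rate, omitting the log(y!) constant."""
    rate = max(rate, 1e-12)  # guard against log(0) for all-zero series
    return sum(y * math.log(rate) - rate for y in counts)

def lr_dissimilarity(counts, rate):
    """Likelihood-ratio statistic: unit-specific MLE versus the cluster rate."""
    own_rate = sum(counts) / len(counts)
    return 2.0 * (poisson_loglik(counts, own_rate) - poisson_loglik(counts, rate))

def generalized_kmeans(units, k, n_iter=100):
    """Alternate between LR-based assignment and refitting each cluster's MLE."""
    m = len(units)
    means = sorted(sum(u) / len(u) for u in units)
    # Assumed deterministic start: midpoints of k equal blocks of the sorted means.
    rates = [means[(2 * c + 1) * m // (2 * k)] for c in range(k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: lr_dissimilarity(u, rates[c]))
                      for u in units]
        if new_labels == labels:
            break  # assignments are stable, so the rates are too
        labels = new_labels
        for c in range(k):
            pooled = [y for u, lab in zip(units, labels) if lab == c for y in u]
            if pooled:  # an empty cluster keeps its previous rate
                rates[c] = sum(pooled) / len(pooled)
    return labels, rates

def bic(units, labels, rates):
    """Assumed BIC: one rate parameter per cluster, n = total observations."""
    loglik = sum(poisson_loglik(u, rates[lab]) for u, lab in zip(units, labels))
    n_obs = sum(len(u) for u in units)
    return -2.0 * loglik + len(rates) * math.log(n_obs)

def select_k(units, k_max):
    """Fit k = 1..k_max and return the k with the smallest BIC, plus all scores."""
    scores = {}
    for k in range(1, k_max + 1):
        labels, rates = generalized_kmeans(units, k)
        scores[k] = bic(units, labels, rates)
    return min(scores, key=scores.get), scores

# Hypothetical toy data: three low-count series and three high-count series.
units = [
    [4, 6, 5, 5], [5, 4, 6, 6], [6, 5, 4, 4],
    [48, 52, 50, 50], [51, 49, 52, 50], [49, 50, 48, 51],
]
best_k, scores = select_k(units, k_max=4)
labels, rates = generalized_kmeans(units, best_k)
print(best_k, labels, rates)
```

In this sketch the omitted log(y!) term is the same for every $k$, so it does not affect the comparison of BIC values. Replacing `math.log(n_obs)` with the constant 2 would give the corresponding AIC-style penalty.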
Subjects: Methodology (stat.ME)
MSC classes: 62H30, 62J12
Cite as: arXiv:2008.03838 [stat.ME]
  (or arXiv:2008.03838v1 [stat.ME] for this version)

Submission history

From: Tonglin Zhang
[v1] Sun, 9 Aug 2020 23:31:31 GMT (178kb)