Title: The Bayesian Sorting Hat: A Decision-Theoretic Approach to Size-Constrained Clustering

Abstract: Size-constrained clustering (SCC) refers to the dual problem of using observations to determine latent cluster structure while at the same time assigning observations to the unknown clusters subject to an analyst-defined constraint on cluster sizes. While several approaches have been proposed, SCC remains a difficult problem due to the combinatorial dependency between observations introduced by the size constraints. Here we reformulate SCC as a decision problem and introduce a novel loss function to capture various types of size constraints. As opposed to prior work, our approach is uniquely suited to situations in which size constraints reflect an external limitation or desire rather than an internal feature of the data generation process. To demonstrate our approach, we develop a Bayesian mixture model for clustering respondents using both simulated and real categorical survey data. Our motivation for developing this decision-theoretic approach to SCC was to determine optimal team assignments for a Harry Potter-themed scavenger hunt based on categorical survey data from participants.
Subjects: Applications (stat.AP); Methodology (stat.ME)
Cite as: arXiv:1710.06047 [stat.AP]
  (or arXiv:1710.06047v1 [stat.AP] for this version)

Submission history

From: Justin Silverman
[v1] Tue, 17 Oct 2017 01:32:28 GMT (641kb,AD)
