Title: Modelling intransitivity in pairwise comparisons with application to baseball data

Abstract: Most commonly used ranking systems assume some level of underlying transitivity. If transitivity exists in a system, then information about pairwise comparisons can be translated to other linked pairs. For example, if A typically beats B and B typically beats C, this could inform us about the expected outcome between A and C. We show that in the seminal Bradley-Terry model, knowing the probabilities of A beating B and B beating C completely defines the probability of A beating C, with these probabilities determined by the individual skill levels of A, B and C. Users of this model tend not to investigate the validity of this transitivity assumption, nor whether some skill levels are not statistically significantly different from each other; ignoring the latter leads to false conclusions about rankings. We provide a novel extension to the Bradley-Terry model that accounts for both of these features: the intransitive relationships between pairs of objects are handled through interaction terms specific to each pair; and by partitioning the $n$ skills into $A+1\leq n$ distinct clusters, any differences in the objects' skills become significant, given appropriate $A$. With $n$ competitors there are $n(n-1)/2$ interactions, so even in multiple round-robin competitions this gives too many parameters to estimate efficiently. We therefore separately cluster the $n(n-1)/2$ intransitivity values into $K$ clusters, giving $A$ and $K$ estimable values for the skills and intransitivities respectively, typically with $A+K<n$. Using a Bayesian hierarchical model, $(A,K)$ are treated as unknown, and inference is conducted via a reversible jump Markov chain Monte Carlo (RJMCMC) algorithm. The model is shown to have an improved out-of-sample fit both on simulated data and when applied to American League baseball data.
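
The transitivity claim in the abstract can be illustrated with a minimal sketch. It assumes the standard logit form of the Bradley-Terry model, in which the log-odds of i beating j equal the difference of their log-scale skills; under that form the log-odds for A against C are the sum of those for A against B and B against C. The function names are hypothetical, and the additive pair-specific term `theta_ij` is only an assumed illustration of an "interaction term", not necessarily the parameterisation used in the paper.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bt_prob(skill_i, skill_j):
    """Standard Bradley-Terry probability that i beats j (log-scale skills)."""
    return inv_logit(skill_i - skill_j)


def implied_p_ac(p_ab, p_bc):
    """P(A beats C) forced by Bradley-Terry once P(A beats B) and P(B beats C) are known."""
    return inv_logit(logit(p_ab) + logit(p_bc))


def bt_prob_with_interaction(skill_i, skill_j, theta_ij):
    """Assumed illustrative extension: a pair-specific term shifts the log-odds."""
    return inv_logit(skill_i - skill_j + theta_ij)


if __name__ == "__main__":
    # Hypothetical skills for A, B and C, chosen only for illustration.
    a, b, c = 1.0, 0.5, 0.0
    print(bt_prob(a, c), implied_p_ac(bt_prob(a, b), bt_prob(b, c)))
    # A non-zero pair-specific term breaks the forced relationship.
    print(bt_prob_with_interaction(a, c, -1.5))
```

With a zero interaction term the two printed probabilities in the first line agree, which is the transitivity constraint; a sufficiently negative `theta_ij` lets C be favoured over A even though A is favoured over B and B over C.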
Comments: 26 pages, 7 figures, 2 tables in the main text. 17 pages in the supplementary material
Subjects: Methodology (stat.ME); Applications (stat.AP); Computation (stat.CO)
Cite as: arXiv:2103.12094 [stat.ME]
  (or arXiv:2103.12094v1 [stat.ME] for this version)

Submission history

From: Harry Spearing
[v1] Mon, 22 Mar 2021 18:00:19 GMT (768kb,D)

Link back to: arXiv, form interface, contact.