Title: A Weighted Correlation Index for Rankings with Ties

Abstract: Understanding the correlation between two different scores for the same set of items is a common problem in information retrieval, and the most commonly used statistic that quantifies this correlation is Kendall's $\tau$. However, the standard definition fails to capture that discordances between items with high rank are more important than those between items with low rank. Recently, a new measure of correlation based on average precision has been proposed to solve this problem, but like many alternative proposals in the literature it assumes that there are no ties in the scores. This is a major deficiency in a number of contexts, in particular when comparing centrality scores on large graphs, as the obvious baseline, indegree, has a very large number of ties in web and social graphs. We propose to extend Kendall's definition in a natural way to take into account weights in the presence of ties. We prove a number of interesting mathematical properties of our generalization and describe an $O(n\log n)$ algorithm for its computation. We also validate the usefulness of our weighted measure of correlation using experimental data.
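
The abstract does not spell out the definition, so the following is only an illustrative sketch of what a weighted, tie-aware Kendall-style index can look like. It assumes one common formulation: a weighted sum of sign agreements over all pairs, normalized by the same sum computed for each score vector against itself. The function names (`weighted_inner`, `weighted_tau`, `hyperbolic_weight`) and the additive hyperbolic weighting are hypothetical choices made for illustration, not taken from the paper, and the code is a naive $O(n^2)$ loop rather than the $O(n\log n)$ algorithm the abstract mentions.

```python
from itertools import combinations
from math import sqrt


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def weighted_inner(r, s, weight):
    """Sum sign(r_i - r_j) * sign(s_i - s_j) * weight(i, j) over pairs i < j.

    A pair tied in either vector contributes 0, so ties need no special case.
    """
    return sum(
        sign(r[i] - r[j]) * sign(s[i] - s[j]) * weight(i, j)
        for i, j in combinations(range(len(r)), 2)
    )


def weighted_tau(r, s, weight=lambda i, j: 1.0):
    """Quadratic-time weighted Kendall-style correlation (illustrative only).

    With the default unit weight this reduces to the tie-corrected
    Kendall tau (often called tau-b).
    """
    norm = sqrt(weighted_inner(r, r, weight) * weighted_inner(s, s, weight))
    if norm == 0:
        raise ValueError("correlation is undefined when a score vector is constant")
    return weighted_inner(r, s, weight) / norm


def hyperbolic_weight(rank):
    """Hypothetical weighting: pairs involving top-ranked items count more.

    rank[i] is the 0-based position of item i in a reference ranking
    (0 = most important); the choice of reference ranking is an assumption.
    """
    return lambda i, j: 1.0 / (rank[i] + 1) + 1.0 / (rank[j] + 1)


if __name__ == "__main__":
    reference = [3, 2, 1]          # item 0 scores highest
    swap_top = [2, 3, 1]           # disagreement between the two top items
    swap_bottom = [3, 1, 2]        # disagreement between the two bottom items
    w = hyperbolic_weight([0, 1, 2])
    print(weighted_tau(reference, swap_top, w))
    print(weighted_tau(reference, swap_bottom, w))
```

Under these assumptions, a swap between the two top-ranked items lowers the index more than a swap between the two bottom-ranked items, which is the behaviour the abstract asks of a weighted measure; with unit weights both swaps are penalized equally.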
Subjects: Social and Information Networks (cs.SI); Information Retrieval (cs.IR)
Cite as: arXiv:1404.3325 [cs.SI]
  (or arXiv:1404.3325v3 [cs.SI] for this version)

Submission history

From: Sebastiano Vigna
[v1] Sat, 12 Apr 2014 23:20:34 GMT (102kb,D)
[v2] Mon, 28 Apr 2014 06:43:40 GMT (102kb,D)
[v3] Fri, 31 Oct 2014 08:30:56 GMT (102kb,D)
