Title: Exact Exponent in Optimal Rates for Crowdsourcing

Abstract: In many machine learning applications, crowdsourcing has become the primary means of label collection. In this paper, we study the optimal error rate for aggregating labels provided by a set of non-expert workers. Under the classic Dawid-Skene model, we establish matching upper and lower bounds with an exact exponent $mI(\pi)$, in which $m$ is the number of workers and $I(\pi)$ is the average Chernoff information that characterizes the workers' collective ability. Such an exact characterization of the error exponent allows us to state a precise sample size requirement $m>\frac{1}{I(\pi)}\log\frac{1}{\epsilon}$ in order to achieve a misclassification error of $\epsilon$. In addition, our results imply the optimality of various EM algorithms for crowdsourcing initialized by consistent estimators.
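
As a rough illustration of the sample size requirement $m>\frac{1}{I(\pi)}\log\frac{1}{\epsilon}$, the sketch below reads the inequality literally and returns the smallest integer $m$ that satisfies it. The function name `min_workers` and the numeric values are hypothetical, the value of $I(\pi)$ is taken as a given input rather than computed from a Dawid-Skene model, and the natural logarithm is assumed. Any lower-order terms in the paper's bounds are ignored, so this is back-of-the-envelope arithmetic and not the paper's procedure.

```python
import math


def min_workers(chernoff_info: float, eps: float) -> int:
    """Smallest integer m with m > (1 / I) * log(1 / eps).

    chernoff_info: assumed value of I(pi), supplied by the caller (must be > 0).
    eps: target misclassification error, strictly between 0 and 1.
    """
    if chernoff_info <= 0:
        raise ValueError("chernoff_info must be positive")
    if not 0 < eps < 1:
        raise ValueError("eps must lie strictly between 0 and 1")
    # Right-hand side of the inequality, using the natural logarithm.
    threshold = math.log(1.0 / eps) / chernoff_info
    # The inequality is strict, so take the next integer above the threshold.
    return math.floor(threshold) + 1


if __name__ == "__main__":
    # Hypothetical I(pi) = 0.1; show how m grows as eps shrinks.
    for eps in (0.1, 0.01, 0.001):
        print(f"eps={eps}: m >= {min_workers(0.1, eps)}")
```

Because the requirement depends on $\epsilon$ only through $\log\frac{1}{\epsilon}$, cutting the target error by a factor of ten adds a fixed number of workers, roughly $\log(10)/I(\pi)$, instead of multiplying the count.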
Comments: To appear in the Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016
Subjects: Machine Learning (stat.ML); Statistics Theory (math.ST)
Cite as: arXiv:1605.07696 [stat.ML]
  (or arXiv:1605.07696v2 [stat.ML] for this version)

Submission history

From: Yu Lu
[v1] Wed, 25 May 2016 01:16:06 GMT (16kb)
[v2] Thu, 26 May 2016 00:43:49 GMT (16kb)
