Title: Improving Human-Labeled Data through Dynamic Automatic Conflict Resolution

Abstract: This paper develops and implements a scalable methodology for (a) estimating the noisiness of labels produced by a typical crowdsourcing semantic annotation task, and (b) reducing the resulting error of the labeling process by as much as 20-30% in comparison to other common labeling strategies. Importantly, this new approach to the labeling process, which we name Dynamic Automatic Conflict Resolution (DACR), does not require a ground truth dataset and is instead based on inter-project annotation inconsistencies. This makes DACR not only more accurate but also available to a broad range of labeling tasks. In what follows we present results from a text classification task performed at scale for a commercial personal assistant, and evaluate the inherent ambiguity uncovered by this annotation strategy as compared to other common labeling strategies.
Comments: Conference paper at COLING 2020
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2012.04169 [cs.CL]
  (or arXiv:2012.04169v1 [cs.CL] for this version)

Submission history

From: Qiwei Sun
[v1] Tue, 8 Dec 2020 02:22:09 GMT (142kb,D)