Computer Science > Computer Vision and Pattern Recognition
Title: Learning Visual Classifiers using Human-centric Annotations
(Submitted on 22 Dec 2015 (this version), latest version 12 Apr 2016 (v2))
Abstract: Many datasets contain human-centric annotations that are the result of humans applying their own subjective judgements on what to describe and what to ignore. Examples include image tags and keywords found on photo sharing sites, or in datasets containing image captions. In this paper, we explore the use of human-centric annotations for learning image classifiers. Due to human reporting bias, these annotations miss a significant amount of the information present in an image. We propose an algorithm to decouple the human reporting bias from the correct visually grounded labels. Our algorithm provides results that are highly interpretable for reporting "what's in the image" versus "what's worth saying." We show improvements over traditional learning algorithms for both image classification and image captioning, and evaluate the algorithm's efficacy along a variety of metrics and datasets, including MS COCO and Yahoo Flickr 100M.
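As a rough illustration of what "decoupling" reporting bias from visual grounding could mean, the sketch below treats the presence of a concept in the image as a latent variable and marginalizes over it to obtain the probability that a human annotator mentions the concept. The abstract does not give the model, so this factorization, the function name `human_label_probability`, and all numbers are assumptions made for illustration only, not the authors' algorithm.

```python
def human_label_probability(p_present, p_report_if_present, p_report_if_absent=0.0):
    """Probability that a human annotation mentions a concept.

    Hypothetical factorization (an assumption, not taken from the paper):
      p_present            -- P(concept is visually present | image)
      p_report_if_present  -- P(annotator mentions it | present)
      p_report_if_absent   -- P(annotator mentions it | absent)
    The latent "visually present" variable is summed out.
    """
    for p in (p_present, p_report_if_present, p_report_if_absent):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return (p_present * p_report_if_present
            + (1.0 - p_present) * p_report_if_absent)


# Made-up numbers: the concept is very likely in the image ("what's in the
# image"), but annotators rarely bother to mention it ("what's worth saying").
p_present = 0.9
p_report_if_present = 0.2
p_human = human_label_probability(p_present, p_report_if_present)
print(f"P(visually present) = {p_present:.2f}")
print(f"P(human mentions it) = {p_human:.2f}")
```

Under these made-up numbers, a classifier trained directly on the human annotations would be pushed toward the much lower mention probability, while the separate `p_present` term keeps the visually grounded estimate available for inspection.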
Submission history
From: Ishan Misra
[v1] Tue, 22 Dec 2015 07:28:06 GMT (5337kb,D)
[v2] Tue, 12 Apr 2016 19:58:29 GMT (2324kb,D)