Title: Rethinking Localization Map: Towards Accurate Object Perception with Self-Enhancement Maps

Abstract: Recently, remarkable progress has been made in weakly supervised object localization (WSOL) toward improving object localization maps. The common practice for evaluating these maps is indirect and coarse: tight bounding boxes are fitted around high-activation regions, and intersection-over-union (IoU) scores are computed between the predicted and ground-truth boxes. This measurement reflects the quality of localization maps to some extent, but we argue that the maps should be measured directly and at a finer granularity, i.e., by comparing them pixel-wise with ground-truth object masks. To enable this direct evaluation, we annotate pixel-level object masks on the ILSVRC validation set and propose IoU-Threshold curves for evaluating the real quality of localization maps. Beyond the amended evaluation metric and annotated object masks, this work also introduces a novel self-enhancement method to harvest accurate object localization maps and object boundaries with only category labels as supervision. We propose a two-stage approach that generates the localization maps by simply comparing the similarity of point-wise features between the high-activation pixels and the remaining pixels. Based on the predicted localization maps, we explore estimating object boundaries on a very large dataset. A hard-negative suppression loss is proposed for obtaining fine boundaries. We conduct extensive experiments on the ILSVRC and CUB benchmarks. In particular, the proposed Self-Enhancement Maps achieve state-of-the-art localization accuracy of 54.88% on ILSVRC. The code and the annotated masks are released at this https URL
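
The pixel-wise evaluation described in the abstract can be illustrated with a minimal sketch of an IoU-Threshold curve. This is not the authors' released code: the function name `iou_threshold_curve`, the `>=` binarization rule, the assumption that map values lie in [0, 1], and the toy 3x3 data are all hypothetical choices made for illustration.

```python
def iou_threshold_curve(loc_map, gt_mask, thresholds):
    """Return a list of (threshold, IoU) pairs.

    loc_map: 2-D list of activation scores (assumed to lie in [0, 1]).
    gt_mask: 2-D list of 0/1 ground-truth labels with the same shape.
    Binarizing with `>=` is an assumption, not a rule taken from the paper.
    """
    flat_map = [v for row in loc_map for v in row]
    flat_gt = [bool(v) for row in gt_mask for v in row]
    curve = []
    for t in thresholds:
        pred = [v >= t for v in flat_map]  # binarize the map at threshold t
        inter = sum(p and g for p, g in zip(pred, flat_gt))
        union = sum(p or g for p, g in zip(pred, flat_gt))
        iou = inter / union if union else 0.0  # guard against empty union
        curve.append((t, iou))
    return curve


# Hypothetical toy example: a 3x3 map and a 3-pixel object mask.
loc_map = [[0.9, 0.6, 0.1],
           [0.8, 0.4, 0.0],
           [0.2, 0.1, 0.0]]
gt_mask = [[1, 1, 0],
           [1, 0, 0],
           [0, 0, 0]]
thresholds = [0.1, 0.3, 0.5, 0.7]

curve = iou_threshold_curve(loc_map, gt_mask, thresholds)
for t, iou in curve:
    print(f"threshold={t:.1f}  IoU={iou:.3f}")
```

On this toy input the IoU peaks at the 0.5 threshold, where the binarized map matches the mask exactly, and falls off on either side. Sweeping the threshold in this way shows how sensitive a map's quality is to the chosen cut-off, which a single bounding-box IoU does not reveal.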
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2006.05220 [cs.CV]
  (or arXiv:2006.05220v2 [cs.CV] for this version)

Submission history

From: Xiaolin Zhang
[v1] Tue, 9 Jun 2020 12:35:55 GMT (6914kb,D)
[v2] Sat, 13 Jun 2020 04:13:23 GMT (6886kb,D)