Rethinking Localization Map: Towards Accurate Object Perception with Self-Enhancement Maps

Zhang, Xiaolin; Wei, Yunchao; Yang, Yi; Wu, Fei

(Submitted on 9 Jun 2020 (v1), last revised 13 Jun 2020 (this version, v2))

Abstract: Recently, remarkable progress has been made in weakly supervised object localization (WSOL) in improving object localization maps. These maps are commonly evaluated in an indirect and coarse way, i.e., by obtaining tight bounding boxes that cover high-activation regions and calculating intersection-over-union (IoU) scores between the predicted and ground-truth boxes. This measurement can evaluate the ability of localization maps to some extent, but we argue that the maps should be measured directly and at a finer granularity, i.e., by comparing the maps with the ground-truth object masks pixel by pixel. To enable this direct evaluation, we annotate pixel-level object masks on the ILSVRC validation set. We propose to use IoU-Threshold curves for evaluating the real quality of localization maps. Beyond the amended evaluation metric and annotated object masks, this work also introduces a novel self-enhancement method to harvest accurate object localization maps and object boundaries with only category labels as supervision. We propose a two-stage approach to generate the localization maps by simply comparing the similarity of point-wise features between the high-activation pixels and the remaining pixels. Based on the predicted localization maps, we explore estimating object boundaries on a very large dataset. A hard-negative suppression loss is proposed for obtaining fine boundaries. We conduct extensive experiments on the ILSVRC and CUB benchmarks. In particular, the proposed Self-Enhancement Maps achieve the state-of-the-art localization accuracy of 54.88% on ILSVRC. The code and the annotated masks are released at this https URL
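
To make the pixel-wise evaluation idea concrete, the following is a minimal sketch of how an IoU-Threshold curve could be computed for a single image. It is an illustration under stated assumptions, not the paper's released code: the function names `mask_iou` and `iou_threshold_curve`, the evenly spaced thresholds, the `>=` binarization rule, and the assumption that the localization map is already normalised to [0, 1] are all hypothetical choices.

```python
from typing import List, Sequence, Tuple


def mask_iou(pred: Sequence[Sequence[object]], gt: Sequence[Sequence[object]]) -> float:
    """Pixel-wise IoU between two binary masks of equal shape."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            p, g = bool(p), bool(g)
            inter += int(p and g)
            union += int(p or g)
    # Convention chosen here (an assumption): two empty masks give IoU 0.
    return inter / union if union else 0.0


def iou_threshold_curve(
    loc_map: Sequence[Sequence[float]],
    gt_mask: Sequence[Sequence[object]],
    num_thresholds: int = 11,
) -> List[Tuple[float, float]]:
    """Return (threshold, IoU) pairs for thresholds evenly spaced in [0, 1].

    Assumes loc_map values are already normalised to [0, 1] and that
    num_thresholds is at least 2.
    """
    curve = []
    for i in range(num_thresholds):
        t = i / (num_thresholds - 1)
        # Binarize the map at threshold t (assumed rule: keep pixels >= t).
        binary = [[v >= t for v in row] for row in loc_map]
        curve.append((t, mask_iou(binary, gt_mask)))
    return curve


# Toy 2x3 example: a localization map and its ground-truth object mask.
toy_map = [[0.9, 0.6, 0.1],
           [0.8, 0.4, 0.0]]
toy_mask = [[1, 1, 0],
            [1, 0, 0]]
for threshold, iou in iou_threshold_curve(toy_map, toy_mask, num_thresholds=3):
    print(f"threshold={threshold:.1f}  IoU={iou:.2f}")
```

Sweeping the threshold shows how sensitive a map is to the binarization point: a map that only matches the mask within a narrow threshold band yields a sharply peaked curve, whereas a map that separates object from background cleanly stays high over a wide range.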

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2006.05220 [cs.CV]
	(or arXiv:2006.05220v2 [cs.CV] for this version)

Submission history

From: Xiaolin Zhang
[v1] Tue, 9 Jun 2020 12:35:55 GMT (6914kb,D)
[v2] Sat, 13 Jun 2020 04:13:23 GMT (6886kb,D)
