HODOR: High-level Object Descriptors for Object Re-segmentation in Video Learned from Static Images

Athar, Ali; Luiten, Jonathon; Hermans, Alexander; Ramanan, Deva; Leibe, Bastian

doi:10.1109/CVPR52688.2022.00303

(Submitted on 16 Dec 2021 (v1), last revised 15 Jul 2022 (this version, v2))

Abstract: Existing state-of-the-art methods for Video Object Segmentation (VOS) learn low-level pixel-to-pixel correspondences between frames to propagate object masks across video. This requires a large amount of densely annotated video data, which is costly to annotate, and largely redundant since frames within a video are highly correlated. In light of this, we propose HODOR: a novel method that tackles VOS by effectively leveraging annotated static images for understanding object appearance and scene context. We encode object instances and scene information from an image frame into robust high-level descriptors which can then be used to re-segment those objects in different frames. As a result, HODOR achieves state-of-the-art performance on the DAVIS and YouTube-VOS benchmarks compared to existing methods trained without video annotations. Without any architectural modification, HODOR can also learn from video context around single annotated video frames by utilizing cyclic consistency, whereas other methods rely on dense, temporally consistent annotations. Source code is available at: this https URL
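The idea of encoding each object into a descriptor and using it to re-segment another frame can be illustrated with a toy sketch. This is not the HODOR architecture: it assumes per-pixel feature vectors are already available, swaps in plain masked average pooling to build one descriptor per mask label, and labels each pixel of a new frame by its highest dot-product similarity. The names `pool_descriptors` and `resegment` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List

Vector = List[float]
FeatureMap = List[List[Vector]]  # H x W grid of C-dimensional feature vectors
Mask = List[List[int]]           # H x W grid of labels, 0 = background


def pool_descriptors(features: FeatureMap, mask: Mask) -> Dict[int, Vector]:
    """Average the feature vectors under each mask label into one descriptor.

    Assumption: masked average pooling stands in for a learned encoder.
    """
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = {}
    for feat_row, mask_row in zip(features, mask):
        for vec, label in zip(feat_row, mask_row):
            if label not in sums:
                sums[label] = [0.0] * len(vec)
                counts[label] = 0
            sums[label] = [s + v for s, v in zip(sums[label], vec)]
            counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def resegment(features: FeatureMap, descriptors: Dict[int, Vector]) -> Mask:
    """Give each pixel the label of the descriptor with the largest dot product."""

    def dot(a: Vector, b: Vector) -> float:
        return sum(x * y for x, y in zip(a, b))

    return [
        [max(descriptors, key=lambda k: dot(descriptors[k], vec)) for vec in row]
        for row in features
    ]


# Toy data: in the reference frame the object (label 1) fills the top row.
ref_features = [[[1.0, 0.0], [0.9, 0.1]],
                [[0.0, 1.0], [0.1, 0.9]]]
ref_mask = [[1, 1],
            [0, 0]]

# In the target frame the object has moved to the bottom row.
tgt_features = [[[0.0, 1.0], [0.2, 0.8]],
                [[1.0, 0.0], [0.8, 0.2]]]

descriptors = pool_descriptors(ref_features, ref_mask)
tgt_mask = resegment(tgt_features, descriptors)
print(tgt_mask)  # [[0, 0], [1, 1]]
```

In this sketch the background is treated as one more descriptor (label 0), so every pixel is assigned to either an object or the scene. The two functions could also be chained from frame to frame, pooling new descriptors from each predicted mask before moving on.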

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
ACM classes:	I.4.6; I.4.8; I.4.10
DOI:	10.1109/CVPR52688.2022.00303
Cite as:	arXiv:2112.09131 [cs.CV]
	(or arXiv:2112.09131v2 [cs.CV] for this version)

Submission history

From: Ali Athar
[v1] Thu, 16 Dec 2021 18:59:53 GMT (16813kb,D)
[v2] Fri, 15 Jul 2022 13:15:16 GMT (13436kb,D)
