Object Detection from Video Tubelets with Convolutional Neural Networks

Kang, Kai; Ouyang, Wanli; Li, Hongsheng; Wang, Xiaogang

doi:10.1109/CVPR.2016.95

(Submitted on 14 Apr 2016)

Abstract: Deep Convolutional Neural Networks (CNNs) have shown impressive performance in various vision tasks such as image classification, object detection and semantic segmentation. For object detection, particularly in still images, performance increased significantly over the last year thanks to powerful deep networks (e.g. GoogLeNet) and detection frameworks (e.g. Regions with CNN features (R-CNN)). The recently introduced ImageNet task on object detection from video (VID) brings the object detection task into the video domain, in which objects' locations in each frame must be annotated with bounding boxes. In this work, we introduce a complete framework for the VID task based on still-image object detection and general object tracking. The relations between these two components and their contributions to the VID task are thoroughly studied and evaluated. In addition, a temporal convolution network is proposed to incorporate temporal information and regularize the detection results, and it proves effective for the task.

Comments:	Accepted in CVPR 2016 as a Spotlight paper
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Journal reference:	Computer Vision and Pattern Recognition (CVPR), 2016 IEEE Conference on (pp. 817-825)
DOI:	10.1109/CVPR.2016.95
Cite as:	arXiv:1604.04053 [cs.CV]
	(or arXiv:1604.04053v1 [cs.CV] for this version)

Submission history

From: Kai Kang
[v1] Thu, 14 Apr 2016 07:22:44 GMT (4283kb,D)
