Temporal Fusion Network for Temporal Action Localization: Submission to ActivityNet Challenge 2020 (Task E)

Qing, Zhiwu; Wang, Xiang; Sang, Yongpeng; Gao, Changxin; Zhang, Shiwei; Sang, Nong

(Submitted on 13 Jun 2020)

Abstract: This technical report analyzes the temporal action localization method we used in the HACS competition, hosted as part of the ActivityNet Challenge 2020. The goal of the task is to locate the start and end times of each action in an untrimmed video and to predict its category. First, we use video-level features to train multiple video-level action classification models, which give us the category of the action in the video. Second, we focus on generating high-quality temporal proposals. For this purpose, we apply BMN to generate a large number of proposals and obtain a high recall rate. We then refine these proposals with a cascade-structured network called Refine Network, which predicts a position offset and a new IoU for each proposal under the supervision of the ground truth. To make the proposals more accurate, we use a bidirectional LSTM, Non-local blocks, and a Transformer to capture temporal relationships between the local features of each proposal and the global features of the video. Finally, by fusing the results of multiple models, our method obtains an mAP of 40.55% on the validation set and 40.53% on the test set, and achieves Rank 1 in this challenge.
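
The refinement step described in the abstract relies on two quantities: the temporal IoU between a proposal and a ground-truth segment, and an offset applied to the proposal boundaries. The following is a minimal sketch of both, not the authors' implementation. The function names `temporal_iou` and `refine_proposal` are hypothetical, and the additive offset parameterization is an assumption, since the abstract does not state how the offsets are encoded.

```python
def temporal_iou(proposal, ground_truth):
    """Temporal IoU between two (start, end) segments, e.g. in seconds."""
    p_start, p_end = proposal
    g_start, g_end = ground_truth
    # Overlap length is zero when the segments are disjoint.
    intersection = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    union = (p_end - p_start) + (g_end - g_start) - intersection
    return intersection / union if union > 0 else 0.0


def refine_proposal(proposal, start_offset, end_offset):
    """Shift both boundaries by predicted offsets.

    Assumption: offsets are additive and in the same unit as the
    boundaries; the report's actual parameterization may differ.
    """
    start, end = proposal
    new_start = max(0.0, start + start_offset)
    new_end = end + end_offset
    # Keep the segment well ordered if the offsets cross over.
    if new_end < new_start:
        new_start, new_end = new_end, new_start
    return (new_start, new_end)


if __name__ == "__main__":
    coarse = (10.0, 20.0)        # illustrative proposal
    truth = (12.0, 22.0)         # illustrative ground-truth segment
    refined = refine_proposal(coarse, 2.0, 2.0)
    print(temporal_iou(coarse, truth), temporal_iou(refined, truth))
```

In this illustrative case the coarse proposal overlaps the ground truth for 8 of 12 seconds in the union, and the shifted proposal matches the ground truth exactly; a refinement network of the kind the abstract describes would be trained to predict such offsets together with the resulting IoU.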

Comments:	To appear at the CVPR 2020 HACS Workshop (Rank 1st)
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2006.07520 [cs.CV]
	(or arXiv:2006.07520v1 [cs.CV] for this version)

Submission history

From: Qing Zhiwu
[v1] Sat, 13 Jun 2020 00:33:00 GMT (133kb,D)
