Relation Modeling in Spatio-Temporal Action Localization

Feng, Yutong; Jiang, Jianwen; Huang, Ziyuan; Qing, Zhiwu; Wang, Xiang; Zhang, Shiwei; Tang, Mingqian; Gao, Yue

(Submitted on 15 Jun 2021 (v1), last revised 16 Jun 2021 (this version, v2))

Abstract: This paper presents our solution to the AVA-Kinetics Crossover Challenge of the ActivityNet workshop at CVPR 2021. Our solution uses multiple types of relation modeling methods for spatio-temporal action detection and adopts a training strategy that integrates them in end-to-end training over the two large-scale video datasets. Learning with a memory bank and finetuning for the long-tailed distribution are also investigated to further improve performance. In this paper, we detail the implementation of our solution and provide experimental results and corresponding discussions. Our final solution achieves 40.67 mAP on the test set of AVA-Kinetics.

Comments:	CVPR 2021 ActivityNet Workshop Report
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2106.08061 [cs.CV]
	(or arXiv:2106.08061v2 [cs.CV] for this version)

Submission history

From: Yutong Feng
[v1] Tue, 15 Jun 2021 11:40:18 GMT (1127kb,D)
[v2] Wed, 16 Jun 2021 07:00:12 GMT (1127kb,D)
