OadTR: Online Action Detection with Transformers

Wang, Xiang; Zhang, Shiwei; Qing, Zhiwu; Shao, Yuanjie; Zuo, Zhengrong; Gao, Changxin; Sang, Nong

(Submitted on 21 Jun 2021)

Abstract: Most recent approaches to online action detection apply Recurrent Neural Networks (RNNs) to capture long-range temporal structure. However, RNNs suffer from non-parallelism and vanishing gradients, which makes them hard to optimize. In this paper, we propose a new encoder-decoder framework based on Transformers, named OadTR, to tackle these problems. The encoder, augmented with a task token, captures the relationships and global interactions between historical observations. The decoder extracts auxiliary information by aggregating anticipated future clip representations. OadTR can therefore recognize current actions by encoding historical information and predicting future context simultaneously. We extensively evaluate the proposed OadTR on three challenging datasets: HDD, TVSeries, and THUMOS14. The experimental results show that OadTR achieves higher training and inference speeds than current RNN-based approaches and significantly outperforms state-of-the-art methods in terms of both mAP and mcAP. Code is available at this https URL
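
The data flow described in the abstract can be illustrated with a toy sketch in pure Python. This is not the authors' implementation: it uses single-head attention without learned projections, layer normalization, or feed-forward layers. The function names (`attend`, `oadtr_like_forward`), the tensor sizes, the average pooling of the anticipated clips, and the concatenation fed to the classifier are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention (single head, no learned projections)."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def oadtr_like_forward(history, task_token, future_queries, classifier):
    """Toy forward pass: encode history, anticipate future, classify current clip.

    history:        T feature vectors for the observed clips
    task_token:     one extra vector appended to the encoder input
    future_queries: one query vector per anticipated future clip
    classifier:     weight rows, one per class, over [task ; pooled future]
    """
    # Encoder: self-attention over the history plus the task token.
    tokens = history + [task_token]
    encoded = attend(tokens, tokens, tokens)
    task_out = encoded[-1]  # global summary of the observed clips

    # Decoder: each future query cross-attends to the encoder output.
    future = attend(future_queries, encoded, encoded)

    # Aggregate the anticipated clips by average pooling (an assumption).
    pooled = [sum(col) / len(future) for col in zip(*future)]

    # Classify the current action from the concatenated representation.
    joint = task_out + pooled
    logits = [sum(w * x for w, x in zip(row, joint)) for row in classifier]
    return softmax(logits), future


random.seed(0)
DIM, T, N_FUTURE, N_CLASSES = 8, 16, 4, 3  # hypothetical sizes


def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]


history = [rand_vec(DIM) for _ in range(T)]
task_token = rand_vec(DIM)
future_queries = [rand_vec(DIM) for _ in range(N_FUTURE)]
classifier = [rand_vec(2 * DIM) for _ in range(N_CLASSES)]

probs, future = oadtr_like_forward(history, task_token, future_queries, classifier)
print(len(probs), len(future), len(future[0]))  # prints: 3 4 8
```

Unlike an RNN, which must step through the clips one at a time, the attention in this sketch computes each token's output independently of the others, so all positions can be processed in parallel. That is the property the abstract contrasts with RNN-based approaches.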

Comments:	Code is available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2106.11149 [cs.CV]
	(or arXiv:2106.11149v1 [cs.CV] for this version)

Submission history

From: Xiang Wang
[v1] Mon, 21 Jun 2021 14:39:35 GMT (1476kb,D)
