Learning Dynamical Human-Joint Affinity for 3D Pose Estimation in Videos

Zhang, Junhao; Wang, Yali; Zhou, Zhipeng; Luan, Tianyu; Wang, Zhe; Qiao, Yu

doi:10.1109/TIP.2021.3109517



(Submitted on 15 Sep 2021)

Abstract: Graph Convolution Networks (GCNs) have been successfully used for 3D human pose estimation in videos. However, they are often built on a fixed human-joint affinity defined by the human skeleton. This may reduce the capacity of a GCN to adapt to complex spatio-temporal pose variations in videos. To alleviate this problem, we propose a novel Dynamical Graph Network (DG-Net), which dynamically identifies human-joint affinity and estimates 3D pose by adaptively learning spatial/temporal joint relations from videos. Unlike traditional graph convolution, we introduce Dynamical Spatial/Temporal Graph convolution (DSG/DTG) to discover spatial/temporal human-joint affinity for each video exemplar, depending on the spatial distance/temporal movement similarity between human joints in that video. Hence, DSG and DTG can effectively identify which joints are spatially closer and/or have consistent motion, reducing depth ambiguity and/or motion uncertainty when lifting 2D pose to 3D pose. We conduct extensive experiments on three popular benchmarks, Human3.6M, HumanEva-I, and MPI-INF-3DHP, where DG-Net outperforms a number of recent SOTA approaches with fewer input frames and a smaller model size.
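
The idea of a per-video, data-dependent joint affinity can be illustrated with a small NumPy sketch. This is not the DG-Net implementation: the abstract does not give the exact DSG/DTG formulation, so the function names (`spatial_affinity`, `temporal_affinity`, `graph_conv`), the softmax normalization, the `temperature` parameter, and the use of cosine similarity between joint displacements are all assumptions chosen only to show how an affinity matrix might be computed from the input pose instead of being fixed by the skeleton.

```python
import numpy as np

def spatial_affinity(joints_2d, temperature=1.0):
    """Hypothetical spatial affinity for one frame.

    joints_2d: array of shape (J, 2) with 2D joint coordinates.
    Returns a (J, J) matrix whose rows sum to 1; joints that are
    closer in space receive larger weights (softmax over negative distance).
    """
    diff = joints_2d[:, None, :] - joints_2d[None, :, :]  # (J, J, 2)
    dist = np.linalg.norm(diff, axis=-1)                   # (J, J)
    logits = -dist / temperature
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def temporal_affinity(joints_prev, joints_next, eps=1e-8):
    """Hypothetical temporal affinity between two frames.

    Joints whose displacement vectors point in similar directions
    (high cosine similarity) receive larger weights.
    """
    motion = joints_next - joints_prev                     # (J, 2)
    norms = np.linalg.norm(motion, axis=1, keepdims=True)
    unit = motion / (norms + eps)
    sim = unit @ unit.T                                    # (J, J), values in [-1, 1]
    weights = np.exp(sim - sim.max(axis=1, keepdims=True))
    return weights / weights.sum(axis=1, keepdims=True)

def graph_conv(features, affinity, weight):
    """One graph-convolution step: aggregate over joints, then project."""
    return affinity @ features @ weight

# Toy example with three joints on a line (made-up coordinates).
frame_a = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
frame_b = frame_a + np.array([[1.0, 0.0], [2.0, 0.0], [-1.0, 0.0]])

A_spatial = spatial_affinity(frame_a)
A_temporal = temporal_affinity(frame_a, frame_b)

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))                # random projection, illustration only
out = graph_conv(frame_a, A_spatial, W)    # shape (3, 4)
```

In this sketch the affinity is recomputed from each input, so joint 0 attends more to the nearby joint 1 than to the distant joint 2, and more to joints moving in the same direction than to joints moving the opposite way. A skeleton-based GCN would instead use the same adjacency matrix for every video.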

Comments:	Accepted by IEEE Transactions on Image Processing
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
DOI:	10.1109/TIP.2021.3109517
Cite as:	arXiv:2109.07353 [cs.CV]
	(or arXiv:2109.07353v1 [cs.CV] for this version)

Submission history

From: Junhao Zhang [view email]
[v1] Wed, 15 Sep 2021 15:06:19 GMT (13968kb,D)
