CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds

Weng, Yijia; Wang, He; Zhou, Qiang; Qin, Yuzhe; Duan, Yueqi; Fan, Qingnan; Chen, Baoquan; Su, Hao; Guibas, Leonidas J.

(Submitted on 8 Apr 2021 (v1), last revised 21 Oct 2021 (this version, v2))

Abstract: In this work, we tackle the problem of category-level online pose tracking of objects from point cloud sequences. For the first time, we propose a unified framework that can handle 9DoF pose tracking for novel rigid object instances as well as per-part pose tracking for articulated objects from known categories. Here the 9DoF pose, comprising 6D pose and 3D size, is equivalent to a 3D amodal bounding box representation with free 6D pose. Given the depth point cloud at the current frame and the estimated pose from the last frame, our novel end-to-end pipeline learns to accurately update the pose. Our pipeline is composed of three modules: 1) a pose canonicalization module that normalizes the pose of the input depth point cloud; 2) RotationNet, a module that directly regresses small interframe delta rotations; and 3) CoordinateNet, a module that predicts the normalized coordinates and segmentation, enabling analytical computation of the 3D size and translation. Leveraging the small pose regime in the pose-canonicalized point clouds, our method integrates the best of both worlds by combining dense coordinate prediction and direct rotation regression, thus yielding an end-to-end differentiable pipeline optimized for 9DoF pose accuracy (without using non-differentiable RANSAC). Our extensive experiments demonstrate that our method achieves new state-of-the-art performance on category-level rigid object pose (NOCS-REAL275) and articulated object pose benchmarks (SAPIEN, BMVS) while running at the fastest speed, about 12 FPS.
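
The geometry behind the three modules can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes a simplified pose model x = s R y + t with a scalar scale s, treats the RotationNet output (R_delta) and the CoordinateNet output (normalized coordinates y) as given inputs, and uses a closed-form least-squares fit as one possible way to obtain scale and translation analytically. All function names are hypothetical.

```python
import numpy as np

def rotation_z(angle):
    """Rotation matrix about the z-axis (helper for building example poses)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def canonicalize(points, R_prev, t_prev, s_prev):
    """Undo the previous-frame pose: z = R_prev^T (x - t_prev) / s_prev.

    points is an (N, 3) array of row vectors, so R_prev^T v is written v @ R_prev.
    """
    return (points - t_prev) @ R_prev / s_prev

def solve_scale_translation(z, y, R_delta):
    """Least-squares fit of z ~ s_delta * R_delta y + t_delta.

    z: (N, 3) canonicalized observed points.
    y: (N, 3) normalized coordinates (assumed to come from a coordinate network).
    R_delta: (3, 3) small residual rotation (assumed to come from a rotation network).
    """
    q = y @ R_delta.T                      # rotate normalized coordinates
    z_c = z - z.mean(axis=0)               # centering removes the translation
    q_c = q - q.mean(axis=0)
    s_delta = float((z_c * q_c).sum() / (q_c ** 2).sum())
    t_delta = z.mean(axis=0) - s_delta * q.mean(axis=0)
    return s_delta, t_delta

def compose_pose(R_prev, t_prev, s_prev, R_delta, t_delta, s_delta):
    """Map the residual pose in the canonical frame back to the camera frame."""
    R = R_prev @ R_delta
    s = s_prev * s_delta
    t = t_prev + s_prev * (R_prev @ t_delta)
    return R, t, s
```

Under these assumptions, one tracking step would call canonicalize with the last frame's pose, obtain R_delta and y from the two networks, then call solve_scale_translation and compose_pose to get the updated pose. Because the previous pose is close to the current one, the residual rotation seen after canonicalization is small, which is the regime the abstract says direct rotation regression exploits.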

Comments:	ICCV 2021 (Oral). Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2104.03437 [cs.CV]
	(or arXiv:2104.03437v2 [cs.CV] for this version)

Submission history

From: Yijia Weng
[v1] Thu, 8 Apr 2021 00:14:58 GMT (45034kb,D)
[v2] Thu, 21 Oct 2021 09:49:46 GMT (7298kb,D)
