We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CV

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Computer Vision and Pattern Recognition

Title: Correct and Certify: A New Approach to Self-Supervised 3D-Object Perception

Abstract: We consider an object pose estimation and model fitting problem, where - given a partial point cloud of an object - the goal is to estimate the object pose by fitting a CAD model to the sensor data. We solve this problem by combining (i) a semantic keypoint-based pose estimation model, (ii) a novel self-supervised training approach, and (iii) a certification procedure, that not only verifies whether the output produced by the model is correct or not, but also flags uniqueness of the produced solution. The semantic keypoint detector model is initially trained in simulation and does not perform well on real-data due to the domain gap. Our self-supervised training procedure uses a corrector and a certification module to improve the detector. The corrector module corrects the detected keypoints to compensate for the domain gap, and is implemented as a declarative layer, for which we develop a simple differentiation rule. The certification module declares whether the corrected output produced by the model is certifiable (i.e. correct) or not. At each iteration, the approach optimizes over the loss induced only by the certifiable input-output pairs. As training progresses, we see that the fraction of outputs that are certifiable increases, eventually reaching near $100\%$ in many cases. We also introduce the notion of strong certifiability wherein the model can determine if the predicted object model fit is unique or not. The detected semantic keypoints help us implement this in the forward pass. We conduct extensive experiments to evaluate the performance of the corrector, the certification, and the proposed self-supervised training using the ShapeNet and YCB datasets, and show the proposed approach achieves performance comparable to fully supervised baselines while not requiring pose or keypoint supervision on real data.
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
Cite as: arXiv:2206.11215 [cs.CV]
  (or arXiv:2206.11215v1 [cs.CV] for this version)

Submission history

From: Rajat Talak [view email]
[v1] Wed, 22 Jun 2022 17:06:39 GMT (2859kb,D)
[v2] Thu, 23 Jun 2022 16:56:49 GMT (2812kb,D)
[v3] Tue, 24 Jan 2023 19:34:40 GMT (4102kb,D)
[v4] Fri, 28 Apr 2023 19:47:34 GMT (5285kb,D)

Link back to: arXiv, form interface, contact.