Is Mapping Necessary for Realistic PointGoal Navigation?

Partsey, Ruslan; Wijmans, Erik; Yokoyama, Naoki; Dobosevych, Oles; Batra, Dhruv; Maksymets, Oleksandr

(Submitted on 2 Jun 2022 (v1), last revised 7 Jun 2022 (this version, v2))

Abstract: Can an autonomous agent navigate in a new environment without building an explicit map?
For the task of PointGoal navigation ('Go to $\Delta x$, $\Delta y$') under idealized settings (no RGB-D or actuation noise, perfect GPS+Compass), the answer is a clear 'yes': map-less neural models composed of task-agnostic components (CNNs and RNNs) trained with large-scale reinforcement learning achieve 100% Success on a standard dataset (Gibson). However, for PointNav in a realistic setting (RGB-D and actuation noise, no GPS+Compass), this is an open question, and it is the one we tackle in this paper. The strongest published result for this task is 71.7% Success.
First, we identify the main (perhaps only) cause of the drop in performance: the absence of GPS+Compass. An agent with perfect GPS+Compass faced with RGB-D sensing and actuation noise achieves 99.8% Success (Gibson-v2 val). This suggests that (to paraphrase a meme) robust visual odometry is all we need for realistic PointNav; if we can achieve that, we can ignore the sensing and actuation noise.
With that as our operating hypothesis, we scale the dataset and model size, and develop human-annotation-free data-augmentation techniques to train models for visual odometry. We advance the state of the art on the Habitat Realistic PointNav Challenge from 71% to 94% Success (+23 points, 31% relative) and from 53% to 74% SPL (+21 points, 40% relative). While our approach does not saturate or 'solve' this dataset, this strong improvement, combined with promising zero-shot sim2real transfer (to a LoCoBot), provides evidence consistent with the hypothesis that explicit mapping may not be necessary for navigation, even in a realistic setting.
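
To illustrate why visual odometry could stand in for GPS+Compass, the sketch below shows one way an agent might keep its relative goal ($\Delta x$, $\Delta y$) up to date from per-step egomotion estimates alone. It is a minimal illustration and not the authors' implementation. The function names `update_goal` and `track_goal`, the planar (x forward, y left) coordinate convention, and the step values in the usage lines are all assumptions made for this example. In the paper's setting the egomotion would come from a learned visual odometry model; here it is simply passed in.

```python
import math


def update_goal(goal, egomotion):
    """Re-express a relative goal in the agent's frame after one step.

    Assumed convention (hypothetical, for illustration only):
      goal      = (gx, gy), metres, in the agent's frame BEFORE the step,
                  with x pointing forward and y pointing left.
      egomotion = (dx, dy, dtheta): estimated translation of the agent in
                  the old frame and its counterclockwise rotation in radians.
    """
    gx, gy = goal
    dx, dy, dtheta = egomotion
    # Shift the origin to the agent's new position.
    sx, sy = gx - dx, gy - dy
    # Rotate by -dtheta to express the vector in the agent's new heading.
    c, s = math.cos(dtheta), math.sin(dtheta)
    return (c * sx + s * sy, -s * sx + c * sy)


def track_goal(goal, egomotions):
    """Fold a sequence of egomotion estimates into the relative goal."""
    for step in egomotions:
        goal = update_goal(goal, step)
    return goal


# Usage with illustrative values: turn left 90 degrees, then move 1 m forward.
steps = [(0.0, 0.0, math.pi / 2), (1.0, 0.0, 0.0)]
final_goal = track_goal((1.0, 1.0), steps)
distance_left = math.hypot(*final_goal)
```

Because each update composes with the previous one, errors in the egomotion estimates accumulate over an episode, which is one reason the accuracy of the odometry model matters so much in this setting.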

Comments:	Corrected typos in the Abstract
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2206.00997 [cs.CV]
	(or arXiv:2206.00997v2 [cs.CV] for this version)

Submission history

From: Ruslan Partsey [view email]
[v1] Thu, 2 Jun 2022 11:37:27 GMT (47713kb,D)
[v2] Tue, 7 Jun 2022 08:19:33 GMT (47713kb,D)
