University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization

Zheng, Zhedong; Wei, Yunchao; Yang, Yi

(Submitted on 27 Feb 2020 (v1), last revised 16 Aug 2020 (this version, v2))

Abstract: We consider the problem of cross-view geo-localization. The primary challenge of this task is to learn features that are robust to large viewpoint changes. Existing benchmarks can help, but are limited in the number of viewpoints. They usually provide image pairs covering only two viewpoints, e.g., satellite and ground, which may compromise feature learning. In this paper, we argue that, besides phone cameras and satellites, drones could serve as a third platform for the geo-localization problem. In contrast to traditional ground-view images, drone-view images meet fewer obstacles, e.g., trees, and can provide a comprehensive view when flying around the target place. To verify the effectiveness of the drone platform, we introduce a new multi-view multi-source benchmark for drone-based geo-localization, named University-1652. University-1652 contains data from three platforms, i.e., synthetic drones, satellites and ground cameras, of 1,652 university buildings around the world. To our knowledge, University-1652 is the first drone-based geo-localization dataset and enables two new tasks, i.e., drone-view target localization and drone navigation. As the name implies, drone-view target localization aims to predict the location of the target place from drone-view images. Conversely, given a satellite-view query image, drone navigation aims to drive the drone to the area of interest in the query. We use this dataset to analyze a variety of off-the-shelf CNN features and propose a strong CNN baseline on this challenging dataset. The experiments show that University-1652 helps the model learn viewpoint-invariant features and also has good generalization ability in real-world scenarios.
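
The two tasks named in the abstract can both be read as cross-view retrieval: features from one platform serve as queries and are ranked against a gallery of features from another platform. The sketch below is only an illustration under that assumption and is not the authors' baseline. The function names (`cosine`, `rank_gallery`, `recall_at_k`), the toy three-dimensional feature vectors, and the building IDs are all hypothetical; in practice the features would come from a trained CNN.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_gallery(query_feat, gallery):
    """Return gallery building IDs sorted from most to least similar to the query.

    gallery is a list of (building_id, feature) pairs.
    """
    scored = sorted(gallery, key=lambda item: cosine(query_feat, item[1]), reverse=True)
    return [building_id for building_id, _ in scored]


def recall_at_k(queries, gallery, k=1):
    """Fraction of queries whose true building ID appears in the top-k ranked results."""
    hits = sum(1 for qid, qfeat in queries if qid in rank_gallery(qfeat, gallery)[:k])
    return hits / len(queries)


# Hypothetical toy features; real features would be CNN embeddings.
satellite_feats = [
    ("building_a", [1.0, 0.0, 0.2]),
    ("building_b", [0.0, 1.0, 0.1]),
    ("building_c", [0.1, 0.1, 1.0]),
]
drone_feats = [
    ("building_a", [0.9, 0.1, 0.3]),
    ("building_b", [0.2, 0.8, 0.0]),
    ("building_c", [0.6, 0.0, 0.5]),
]

# Drone-view target localization: drone query, satellite gallery.
localization_r1 = recall_at_k(drone_feats, satellite_feats, k=1)

# Drone navigation: satellite query, drone gallery.
navigation_top1 = rank_gallery(satellite_feats[0][1], drone_feats)[0]
```

Swapping which platform supplies the queries and which supplies the gallery is the only difference between the two tasks in this sketch. Recall@K is used here as one common retrieval metric, which is an assumption of the example rather than a statement about the paper's evaluation protocol.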

Comments:	accepted by ACM Multimedia 2020
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2002.12186 [cs.CV]
	(or arXiv:2002.12186v2 [cs.CV] for this version)

Submission history

From: Zhedong Zheng
[v1] Thu, 27 Feb 2020 15:24:15 GMT (6599kb,D)
[v2] Sun, 16 Aug 2020 00:07:39 GMT (22950kb,D)
