THE Benchmark: Transferable Representation Learning for Monocular Height Estimation

Rapid generation of 3D city models is crucial for many applications. Monocular height estimation is one of the most efficient and timely ways to obtain large-scale geometric information. However, existing works focus primarily on training and testing models on unbiased datasets, a setting that does not align well with real-world applications. We therefore propose a new benchmark dataset to study the transferability of height estimation models in a cross-dataset setting. To this end, we first design and construct a large-scale benchmark dataset for cross-dataset transfer learning on the height estimation task. It comprises a newly proposed large-scale synthetic dataset, a newly collected real-world dataset, and four existing datasets from different cities. Next, we design a new experimental protocol, few-shot cross-dataset transfer. Furthermore, we propose a scale-deformable convolution module that enhances the window-based Transformer to handle the scale-variation problem in height estimation. Experimental results demonstrate the effectiveness of the proposed methods in both the traditional and cross-dataset transfer settings. The datasets and code are publicly available at https://mediatum.ub.tum.de/1662763 and https://thebenchmarkh.github.io/.
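
The few-shot cross-dataset transfer setting can be pictured with a minimal sketch. The abstract does not specify the protocol's details, so everything here is an assumption made for illustration: the function names `few_shot_split` and `build_transfer_plan`, the dataset names, the random sampling of `k` target samples, and the three-stage plan (pre-train on source datasets, fine-tune on a few target samples, evaluate on the rest). It is not the authors' implementation.

```python
import random
from typing import Dict, List, Tuple


def few_shot_split(samples: List[str], k: int, seed: int = 0) -> Tuple[List[str], List[str]]:
    """Split a target dataset into k fine-tuning samples and the remaining test samples.

    Hypothetical helper: random sampling with a fixed seed is an assumption,
    not something stated in the abstract.
    """
    if not 0 <= k <= len(samples):
        raise ValueError("k must be between 0 and the number of samples")
    shuffled = list(samples)           # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    return shuffled[:k], shuffled[k:]


def build_transfer_plan(
    datasets: Dict[str, List[str]], target: str, k: int, seed: int = 0
) -> Dict[str, List[str]]:
    """Assemble an illustrative cross-dataset plan for one held-out target dataset."""
    if target not in datasets:
        raise KeyError(f"unknown target dataset: {target}")
    # All datasets other than the target act as sources for pre-training.
    pretrain = [s for name, items in datasets.items() if name != target for s in items]
    finetune, evaluate = few_shot_split(datasets[target], k, seed)
    return {"pretrain": pretrain, "finetune": finetune, "evaluate": evaluate}


# Placeholder dataset names and sample identifiers, used only for this example.
datasets = {
    "synthetic": [f"syn_{i}" for i in range(6)],
    "city_a": [f"a_{i}" for i in range(5)],
    "city_b": [f"b_{i}" for i in range(4)],
}
plan = build_transfer_plan(datasets, target="city_b", k=2)
```

In this sketch the few target samples used for fine-tuning never appear in the evaluation set, which is the property any few-shot transfer evaluation needs regardless of how the samples are chosen.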
