Depth Map Super-Resolution by Deep Multi-Scale Guidance

Tak-Wai Hui1, Chen Change Loy1,2, and Xiaoou Tang1,2

1Department of Information Engineering, The Chinese University of Hong Kong,

2Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences.

The 14th European Conference on Computer Vision (ECCV) 2016, Amsterdam, The Netherlands

[PDF] [Supplementary Material] [Poster]

Abstract

Depth boundaries often lose sharpness when upsampled from low-resolution (LR) depth maps, especially at large upscaling factors. We present a new method for depth map super-resolution in which a high-resolution (HR) depth map is inferred from an LR depth map and an additional HR intensity image of the same scene. We propose a Multi-Scale Guided convolutional network (MSG-Net) for depth map super-resolution. MSG-Net complements LR depth features with HR intensity features using a multi-scale fusion strategy. This multi-scale guidance allows the network to better adapt to the upsampling of both fine- and large-scale structures: the rich hierarchy of HR intensity features at different levels progressively resolves the ambiguity in depth map upsampling. Moreover, we employ a high-frequency-domain training method that not only reduces training time but also facilitates the fusion of depth and intensity features. With multi-scale guidance, MSG-Net achieves state-of-the-art performance for depth map upsampling.

Contributions

(1) We propose a new framework for depth map upsampling that complements an LR depth map with the corresponding HR intensity image, using a convolutional neural network with a multi-scale guidance architecture (MSG-Net). To the best of our knowledge, this is the first CNN-based approach to use such multi-scale intensity guidance.

(2) With the introduction of the multi-scale upsampling architecture, our compact single-image upsampling network (MS-Net), which uses no guidance from the HR intensity image, already outperforms most state-of-the-art methods that require such guidance.

(3) We discuss detailed steps to enable both MSG-Net and MS-Net to perform image-wise upsampling and end-to-end training.

Challenges

1. Fine structures in the LR image are either lost or severely distorted (depending on the scaling factor), because they cannot be fully represented at the limited spatial resolution.

Fig. 1. Ambiguity in upsampling a depth map. (a) Color image. (b) Ground truth. (c) (Enlarged) LR depth map downsampled by a factor of 8. Upsampling results: (d) SRCNN [1], (e) our solution, which avoids the ambiguity problem.

2. Textures in the intensity image are often over-transferred to the depth map.

Fig. 2. Texture over-transfer in depth map refinement and upsampling with intensity guidance. (a) Color image. (b) Ground truth. (c) Refinement of (b) using (a) by guided filtering [2] (r = 4, ε = 0.01²). Results of using (a) to guide the 2× upsampling of (b): (d) Ferstl et al. [3], (e) our solution.
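To make Challenge 2 concrete, the guided filter [2] used in Fig. 2(c) can be sketched as below. This is a minimal NumPy/SciPy version assuming a simple box mean; the function and variable names are ours, not from the paper or [2].

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, r=4, eps=0.01**2):
    """Guided filter of He et al. [2]: smooth p while following the
    structure of the guidance image I, using box means over a
    (2r+1) x (2r+1) window."""
    size = 2 * r + 1
    mean = lambda x: uniform_filter(x, size)
    mean_I, mean_p = mean(I), mean(p)
    cov_Ip = mean(I * p) - mean_I * mean_p
    var_I = mean(I * I) - mean_I * mean_I
    a = cov_Ip / (var_I + eps)   # coefficient on I: any nonzero a copies
    b = mean_p - a * mean_I      # intensity structure into the output
    return mean(a) * I + mean(b)

rng = np.random.default_rng(0)
intensity = rng.random((64, 64))      # textured guidance image
flat_depth = np.full((64, 64), 0.5)   # perfectly flat depth region

out = guided_filter(intensity, flat_depth)
```

The output is a locally affine function of the guidance, `mean(a) * I + mean(b)`. Whenever `a` is nonzero, intensity texture is imprinted on the depth output, which is the over-transfer artifact shown in Fig. 2(c).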

Multi-Scale Guidance Network (MSG-Net)

We propose a new framework that addresses depth map upsampling by complementing an LR depth map with the corresponding HR intensity image, using a convolutional neural network with a multi-scale guidance architecture. Flat intensity patches (regardless of the intensity values they possess) contribute little to depth super-resolution. Therefore, we complement depth features with the associated intensity features in the high-frequency domain.
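The high-frequency-domain idea can be illustrated as follows. This is a minimal sketch assuming a Gaussian low-pass filter and spline interpolation as stand-ins (the paper's exact filtering and network layers are not reproduced), and the `high_freq` helper and the identity placeholder for the network output are our own hypothetical choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def high_freq(img, sigma=2.0):
    """Split an image into high- and low-frequency parts with a Gaussian
    low-pass filter (an illustrative choice, not the paper's exact filter)."""
    low = gaussian_filter(img, sigma)
    return img - low, low

rng = np.random.default_rng(0)
lr_depth = rng.random((8, 8))        # toy LR depth map
hr_intensity = rng.random((32, 32))  # toy HR intensity image (4x upscaling)

# Coarse HR estimate: spline interpolation of the LR depth.
coarse = zoom(lr_depth, 4, order=3)

# Training operates on high-frequency components only.
depth_hf, depth_lf = high_freq(coarse)
guide_hf, _ = high_freq(hr_intensity)

# A (hypothetical) network would map (depth_hf, guide_hf) to a refined
# residual; here the identity stands in for it.  The final HR depth adds
# the residual back onto the low-frequency part of the coarse estimate.
refined_hf = depth_hf
hr_depth = depth_lf + refined_hf
```

Because flat regions carry no high-frequency signal, the guidance branch only contributes where the intensity image has structure, which matches the motivation above.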

A Special Case: Single-Image Super-Resolution Using Multi-Scale Network (MS-Net)

With the intensity-guidance branch and fusion stages removed, MSG-Net reduces to a compact multi-scale network (MS-Net) for single-image super-resolution.

References

[1] Dong, C., Loy, C.C., He, K., Tang, X.: Image super-resolution using deep convolutional networks. IEEE TPAMI 38(2), 295–307 (2016)

[2] He, K., Sun, J., Tang, X.: Guided image filtering. IEEE TPAMI 35(6), 1397–1409 (2013)

[3] Ferstl, D., Reinbacher, C., Ranftl, R., Rüther, M., Bischof, H.: Image guided depth upsampling using anisotropic total generalized variation. ICCV, pp. 993–1000 (2013)

Citation

@inproceedings{hui16,
 author = {Tak-Wai Hui and Chen Change Loy and Xiaoou Tang},
 title = {Depth Map Super-Resolution by Deep Multi-Scale Guidance},
 booktitle = {European Conference on Computer Vision (ECCV)},
 pages = {353--369},
 year = {2016}}