LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation

Tak-Wai Hui, Xiaoou Tang, and Chen Change Loy

CUHK-SenseTime Joint Lab, The Chinese University of Hong Kong.

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018, Spotlight Presentation

(Left) Examples demonstrating the effectiveness of the proposed components in LiteFlowNet: i) feature warping, ii) cascaded flow inference, and iii) flow regularization. Enabled components are indicated in bold black font. (Right) Comparison between LiteFlowNet and FlowNet2.

Abstract

FlowNet2, the state-of-the-art convolutional neural network (CNN) for optical flow estimation, requires over 160M parameters to achieve accurate flow estimation. In this paper, we present an alternative network that attains performance on par with FlowNet2 on the challenging Sintel final pass and KITTI benchmarks, while being 30 times smaller in model size and 1.36 times faster in running speed. This is made possible by drilling down to architectural details that might have been missed in current frameworks: (1) We present a more effective flow inference approach at each pyramid level through a lightweight cascaded network. It not only improves flow estimation accuracy through early correction, but also permits seamless incorporation of descriptor matching into our network. (2) We present a novel flow regularization layer to ameliorate the issue of outliers and vague flow boundaries by using a feature-driven local convolution. (3) Our network adopts an effective structure for pyramidal feature extraction and embraces feature warping rather than image warping as practiced in FlowNet2.

Network Architecture

The network structure of LiteFlowNet. For ease of presentation, only a 3-level design is shown. NetE yields multi-scale flow fields, each of which is generated by a cascaded flow inference module M:S (in blue, comprising a descriptor matching unit M and a sub-pixel refinement unit S) and a regularization module R (in green). The flow inference and regularization modules correspond to the data fidelity and regularization terms in conventional energy minimization methods, respectively.
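As a rough illustration of the per-level data flow just described, the following Python sketch outlines one NetE level; it is not the authors' implementation, and the names nete_level, warp, matching_unit, subpixel_unit, and regularizer are illustrative placeholders for the actual layers.

def nete_level(f1, f2, flow_up, warp, matching_unit, subpixel_unit, regularizer):
    """One pyramid level: cascaded flow inference M:S, then regularization R.
    f1, f2 are pyramidal features of the two images; flow_up is the coarser-level
    flow upsampled to the current resolution."""
    # M: descriptor matching on the second feature map warped by the coarser flow.
    flow_m = flow_up + matching_unit(f1, warp(f2, flow_up))
    # S: sub-pixel refinement at the same level corrects the matching result early.
    flow_s = flow_m + subpixel_unit(f1, warp(f2, flow_m), flow_m)
    # R: feature-driven regularization smooths the refined flow field.
    return regularizer(f1, flow_s)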

Data fidelity. NetC is a CNN-based feature descriptor that transforms each image into a pyramid of multi-scale, high-dimensional features. Module M:S in NetE minimizes the feature-space distance between the high-level features of the two images. A feature warping (f-warp) layer and cascaded flow inference are proposed for the data fidelity term.
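A minimal sketch of an f-warp layer, written in PyTorch for illustration (the official code is not reproduced here): the second image's features are bilinearly sampled at positions displaced by the current flow estimate. The function name feature_warp is an assumption made for this example.

import torch
import torch.nn.functional as F

def feature_warp(feat, flow):
    """Warp a feature map feat (N, C, H, W) by a flow field flow (N, 2, H, W)
    given in pixels (channel 0: horizontal, channel 1: vertical)."""
    n, _, h, w = feat.shape
    # Base sampling grid of absolute pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=feat.dtype, device=feat.device),
        torch.arange(w, dtype=feat.dtype, device=feat.device),
        indexing="ij")
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0)        # (1, 2, H, W)
    coords = grid + flow                                    # displaced coordinates
    # Normalize coordinates to [-1, 1] as expected by grid_sample.
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    grid_norm = torch.stack((coords_x, coords_y), dim=3)    # (N, H, W, 2)
    return F.grid_sample(feat, grid_norm, mode="bilinear",
                         padding_mode="border", align_corners=True)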

Regularization. The flow field estimated by the data fidelity term alone is vulnerable to outliers. Module R in NetE regularizes the flow field by adapting its regularization kernel to the pyramidal features generated by NetC, the intermediate flow field from the data term, and an occlusion probability map. A feature-driven local convolution (f-lcon) layer is proposed for flow regularization.
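Below is a minimal PyTorch sketch of a feature-driven local convolution. It assumes the per-pixel k x k filters (kernels, shaped N x k*k x H x W) have already been predicted from the pyramidal features, the intermediate flow, and the occlusion probability map by a small network that is not shown; the softmax normalization and the name flcon are choices made for this example, not necessarily the authors' exact formulation.

import torch
import torch.nn.functional as F

def flcon(flow, kernels, k=3):
    """Apply a per-pixel, softmax-normalized k x k filter to each flow channel.
    flow: (N, 2, H, W); kernels: (N, k*k, H, W) predicted per pixel."""
    n, c, h, w = flow.shape
    weights = torch.softmax(kernels, dim=1)        # each per-pixel filter sums to 1
    out = []
    for ch in range(c):
        # Gather the k x k neighbourhood around every pixel of one flow channel.
        patches = F.unfold(flow[:, ch:ch + 1], k, padding=k // 2)   # (N, k*k, H*W)
        patches = patches.view(n, k * k, h, w)
        out.append((weights * patches).sum(dim=1, keepdim=True))
    return torch.cat(out, dim=1)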

Demo Video

Download

Citation

@inproceedings{hui18liteflownet,
 author = {Tak-Wai Hui and Xiaoou Tang and Chen Change Loy},
 title = {LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation},
 booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
 month = {June},
 year = {2018}}