Facial Landmark Detection by Deep Multi-task Learning

Zhanpeng Zhang, Ping Luo, Chen Change Loy, Xiaoou Tang

Department of Information Engineering, The Chinese University of Hong Kong

Matlab version of the TCDCN face alignment tool and the MAFL dataset are available here (07/01/2016).

TCDCN face alignment tool added. The executable file can be downloaded from here (13/12/2014).

Live demo added. The executable file can be downloaded from here (28/10/2014).

Multi-Task Facial Landmark (MTFL) dataset added.

Introduction

Facial landmark detection for face alignment has long been impeded by the problems of occlusion and pose variation. Instead of treating the detection task as a single and independent problem, we investigate the possibility of improving detection robustness through multi-task learning. Specifically, we wish to optimize facial landmark detection together with heterogeneous but subtly correlated tasks, e.g., head pose estimation and facial attribute inference. This is non-trivial since different tasks have different learning difficulties and convergence rates. To address this problem, we formulate a novel tasks-constrained deep model, with task-wise early stopping to facilitate learning convergence. Extensive evaluations show that the proposed tasks-constrained learning (i) outperforms existing methods, especially in dealing with faces with severe occlusion and pose variation, and (ii) drastically reduces model complexity compared to the state-of-the-art method based on a cascaded deep model.
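The idea of training the main task together with auxiliary tasks that are halted individually can be sketched as follows. This is a minimal illustration, not the model described in the paper: the class `AuxTask`, the function `total_loss`, the per-task `weight`, and the patience-based stopping rule are all assumptions made for this example, and the stopping criterion actually used by the method may differ.

```python
from dataclasses import dataclass


@dataclass
class AuxTask:
    """Bookkeeping for one auxiliary task (e.g. head pose, smiling)."""
    name: str
    weight: float = 1.0            # hypothetical importance coefficient
    patience: int = 3              # hypothetical: checks allowed without improvement
    best_val: float = float("inf")
    stale: int = 0
    active: bool = True

    def update(self, val_loss: float) -> None:
        """Halt this task once its validation loss stops improving."""
        if not self.active:
            return
        if val_loss < self.best_val:
            self.best_val = val_loss
            self.stale = 0
        else:
            self.stale += 1
            if self.stale >= self.patience:
                self.active = False


def total_loss(landmark_loss: float, aux_losses: dict, tasks: list) -> float:
    """Main-task loss plus the weighted losses of still-active auxiliary tasks."""
    return landmark_loss + sum(
        t.weight * aux_losses[t.name] for t in tasks if t.active
    )
```

In a training loop, `update` would be called with each auxiliary task's validation loss at regular intervals; once a task is halted it no longer contributes to `total_loss`, so the remaining optimization concentrates on landmark detection and the tasks that are still useful.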

We also extend this method to handle more landmark points (68 points instead of 5 major facial points) without redesigning the deep model or incurring a significant increase in run-time cost. This is made possible by transferring the learned 5-point model to the desired facial landmark configuration, through model fine-tuning with dense landmark annotations. Our new model achieves state-of-the-art results on the 300-W benchmark dataset (mean error of 9.15% on the challenging IBUG subset).
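A mean error expressed as a percentage is typically a point-to-point landmark error divided by a normalizing distance. This page does not define the metric, so the sketch below assumes a common convention in which the average Euclidean error over all landmarks is divided by an inter-ocular distance supplied by the caller; the function name `normalized_mean_error` and its parameters are hypothetical.

```python
import math


def normalized_mean_error(pred, truth, norm_dist):
    """Average landmark error divided by a normalizing distance.

    pred, truth: equal-length sequences of (x, y) points.
    norm_dist:   normalizing distance, assumed here to be inter-ocular.
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("pred and truth must be non-empty and equally long")
    if norm_dist <= 0:
        raise ValueError("norm_dist must be positive")
    total = sum(math.dist(p, t) for p, t in zip(pred, truth))
    return total / (len(pred) * norm_dist)
```

Multiplying the returned value by 100 gives a percentage; for a dataset-level figure, the per-image values would be averaged over all test images.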

Dataset and Code

1. Live demo: [download]

   The executable file for the live demo. Please see the readme.txt in the downloaded package.
2. Multi-Task Facial Landmark (MTFL) dataset: [download]

   This dataset contains 12,995 face images, which are annotated with (1) five facial landmarks and (2) the attributes of gender, smiling, wearing glasses, and head pose.
3. Multi-Attribute Facial Landmark (MAFL) dataset: [download]

   This dataset contains 20,000 face images, which are annotated with (1) five facial landmarks and (2) 40 facial attributes.
4. TCDCN face alignment tool:

   It takes a face image as input and outputs the locations of 68 facial landmarks. Win32 Binary [download] Matlab [download]

Citation

Zhanpeng Zhang, Ping Luo, Chen Change Loy, Xiaoou Tang. Facial Landmark Detection by Deep Multi-task Learning, in Proceedings of the European Conference on Computer Vision (ECCV), 2014.

Zhanpeng Zhang, Ping Luo, Chen Change Loy, Xiaoou Tang. Learning Deep Representation for Face Alignment with Auxiliary Attributes, to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).