The performance of a single-image monocular pedestrian detector has achieved the state-of-the-art level. However, it has been observed that when applied to a specific scene, its performance may drop significantly due to mismatch between the source dataset used to train the detector and samples in the target scene.
To tackle this issue, we propose an automatically adaptive pedestrian detector that re-train itself from samples collected from the new traffic scene. As more target samples are included in the training set, the mismatch problem will eventually be solved.
New samples from the target scene are collected by setting an easy detection criterion. Their labels are estimated by a basket of factors, including motion, appearance and semantic region. The combination of these factors is the confidence score, a soft-label probability value.
We propose a unified objective function based on linear SVM. The new formulation has two-fold highlights. First, it incorporates the confidence score instead of traditional hard labels. Second, it jointly optimizes the slack penalty and the label propagation functions. Finally, a new scene specific model is obtained from the optmization.
Datasets used in this work are available below:
Liblinear Prior A Liblinear extension to solve soft-label SVMs.
If you use our codes or dataset, it is recommended to cite the following papers: