All results are reported on the ImageNet VID validation set.

| Method | mAP (%) | Speed (fps) |
|---|---|---|
| Per-frame baseline (Faster R-CNN) | 74.5 | 7.1 |
| Sparse detection (uniform) + Interpolation | 72.4 | 66 |
| Sparse detection (adaptive) + Interpolation | 73.9 | 58 |
| ST-Lattice (denser) | 79.6 | 20 |
| ST-Lattice (sparser) | 79.0 | 62 |

@inproceedings{STLattice2018CVPR,
  author    = {Chen, Kai and Wang, Jiaqi and Yang, Shuo and Zhang, Xingcheng and Xiong, Yuanjun and Loy, Chen Change and Lin, Dahua},
  title     = {Optimizing Video Object Detection via a Scale-Time Lattice},
  booktitle = {CVPR},
  year      = {2018}
}