All results are reported on the ImageNet VID validation set.

| Method | mAP (%) | Speed (fps) |
|---|---|---|
| Per-frame baseline (Faster R-CNN) | 74.5 | 7.1 |
| Sparse detection (uniform) + Interpolation | 72.4 | 66 |
| Sparse detection (adaptive) + Interpolation | 73.9 | 58 |
| ST-Lattice (denser) | 79.6 | 20 |
| ST-Lattice (sparser) | 79.0 | 62 |
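
To make the speed/accuracy trade-off in the table easier to read, the short sketch below expresses each row as an mAP difference and a speedup factor relative to the per-frame baseline. The numbers are copied from the table; the `results` dictionary and the `relative_to_baseline` helper are illustrative names only and are not part of the released code.

```python
# (mAP, fps) pairs copied from the results table above.
results = {
    "Per-frame baseline": (74.5, 7.1),
    "Sparse detection (uniform) + Interpolation": (72.4, 66.0),
    "Sparse detection (adaptive) + Interpolation": (73.9, 58.0),
    "ST-Lattice (denser)": (79.6, 20.0),
    "ST-Lattice (sparser)": (79.0, 62.0),
}


def relative_to_baseline(results, baseline="Per-frame baseline"):
    """Return {method: (mAP difference, speedup factor)} against the baseline row.

    Hypothetical helper for illustration; both values are rounded to one decimal.
    """
    base_map, base_fps = results[baseline]
    return {
        name: (round(m_ap - base_map, 1), round(fps / base_fps, 1))
        for name, (m_ap, fps) in results.items()
    }


if __name__ == "__main__":
    for name, (d_map, speedup) in relative_to_baseline(results).items():
        print(f"{name}: {d_map:+.1f} mAP, {speedup:.1f}x speed")
```

With these figures, the sparser ST-Lattice setting comes out at about 8.7x the baseline speed with a 4.5 mAP gain, while uniform sparse detection with interpolation is about 9.3x faster but loses 2.1 mAP.
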

    @inproceedings{STLattice2018CVPR,
      author    = {Chen, Kai and Wang, Jiaqi and Yang, Shuo and Zhang, Xingcheng and Xiong, Yuanjun and Loy, Chen Change and Lin, Dahua},
      title     = {Optimizing Video Object Detection via a Scale-Time Lattice},
      booktitle = {CVPR},
      year      = {2018}
    }