Datasets
CUHK Face Sketch FERET Database (CUFSF) is for research on face sketch synthesis and face sketch recognition. It includes 1,194 persons from the FERET database [8]. For each person, there are a face photo with lighting variation and a sketch with shape exaggeration drawn by an artist when viewing this photo.
The dataset is intended for research purposes only and as such cannot be used commercially. In addition, reference must be made to the following publications when this dataset is used in any academic and research reports.
Details
CUHK Face Sketch database (CUFS) is for research on face sketch synthesis and face sketch recognition. It includes 188 faces from the Chinese University of Hong Kong (CUHK) student database, 123 faces from the AR database [1], and 295 faces from the XM2VTS database [2]. There are 606 faces in total. For each face, there is a sketch drawn by an artist based on a photo taken in a frontal pose, under normal lighting condition, and with a neutral expression.
1. A. M. Martinez, and R. Benavente, “The AR Face Database,” CVC Technical Report #24, June 1998.
2. K. Messer, J. Matas, J. Kittler, J. Luettin, and G. Maitre, “XM2VTSDB: the Extended of M2VTS Database,” in Proceedings of International Conference on Audio- and Video-Based Person Authentication, pp. 72-77, 1999.
The dataset is intended for research purposes only and as such cannot be used commercially. In addition, reference must be made to the following publications when this dataset is used in any academic and research reports.
Details
This dataset is presented in our CVPR 2013 paper,
Y. Sun, X. Wang, and X. Tang. Deep Convolutional Network Cascade for Facial Point Detection. In Proceedings
of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013
The dataset is intended for research purposes only and as such cannot be used commercially. In addition, reference must be made to the following publications when this dataset is used in any academic and research reports.
DetailsSearch results of 120 queries collected from Google Images and Bing Images with subcategory annotations
The dataset is intended for research purposes only and as such cannot be used commercially. In addition, reference must be made to the following publications when this dataset is used in any academic and research reports.
Details
MIT traffic dataset includes a traffic video sequence of 90 minutes long. It is recorded by a stationary camera.
It is used for Human Detection and Activity Analysis.
Details
CUHK Square dataset includes a traffic video sequence of 60 minutes long.
It is used for Human Detection.
Details
CUHK occlusion dataset is includes 1063 images with occluded pedestrians.
It is used for Human Detection with occlusion handling in crowded scenes.
Details
Grand central station dataset includes a video with 50,010 frames and compressed into 1.1 GB AVI file by ffmpeg.
It is used for Scene Understanding and Crowd Analysis.
Details
MIT trajectory dataset (single camera) includes 40,453 trajectories obtained in a single camera view from a parking lot scene within five days.
It is used for Activity Analysis and Semantic Region Modeling.
Details
MIT trajectory dataset (multi-camera) includes four camera views. Trajectories in different camera views have been synchronized.
It is used for Activity Analysis and Scene Modeling.
Details
This dataset contains 971 identities from two disjoint camera views. Each identity has two samples per camera view.
It is used for Person Re-identification.
Details
The Comprehensive Cars (CompCars) dataset contains data from two scenarios, including images from web-nature and surveillance-nature. The web-nature data contains 163 car makes with 1,716 car models. There are a total of 136,726 images capturing the entire cars and 27,618 images capturing the car parts. The full car images are labeled with bounding boxes and viewpoints. Each car model is labeled with five attributes. The dataset is well prepared for the following computer vision tasks: Fine-grained classification, Attribute prediction, Car model verification.
Details
WIDER is a dataset for complex event recognition from static images. It contains 61 event categories and around 50574 images annotated with event class labels. We provide a split of 50% for training and 50% for testing.
Details