Multimedia Laboratory

CUHK Image Cropping Dataset

This dataset was used in this paper, which presents an approach for automatic image cropping.

Details

CUHK Face Sketch FERET Database (CUFSF)

CUHK Face Sketch FERET Database (CUFSF) is for research on face sketch synthesis and face sketch recognition. It includes 1,194 persons from the FERET database [8]. For each person, there are a face photo with lighting variation and a sketch with shape exaggeration drawn by an artist when viewing this photo.

The dataset is intended for research purposes only and as such cannot be used commercially. In addition, reference must be made to the following publications when this dataset is used in any academic and research reports.

Details

CUHK Face Sketch Database (CUFS)

CUHK Face Sketch database (CUFS) is for research on face sketch synthesis and face sketch recognition. It includes 188 faces from the Chinese University of Hong Kong (CUHK) student database, 123 faces from the AR database [1], and 295 faces from the XM2VTS database [2]. There are 606 faces in total. For each face, there is a sketch drawn by an artist based on a photo taken in a frontal pose, under normal lighting condition, and with a neutral expression.
1. A. M. Martinez, and R. Benavente, “The AR Face Database,” CVC Technical Report #24, June 1998.
2. K. Messer, J. Matas, J. Kittler, J. Luettin, and G. Maitre, “XM2VTSDB: the Extended of M2VTS Database,” in Proceedings of International Conference on Audio- and Video-Based Person Authentication, pp. 72-77, 1999.

Details

CUHK Face Alignment Database

This dataset is presented in our CVPR 2013 paper,
Y. Sun, X. Wang, and X. Tang. Deep Convolutional Network Cascade for Facial Point Detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013

Details

Image Search and Recognition

Search results of 120 queries collected from Google Images and Bing Images with subcategory annotations

Details

MIT Traffic Dataset

MIT traffic dataset includes a traffic video sequence of 90 minutes long. It is recorded by a stationary camera.
It is used for Human Detection and Activity Analysis.

Details

CUHK Square Dataset

CUHK Square dataset includes a traffic video sequence of 60 minutes long.
It is used for Human Detection.

Details

CUHK Occlusion Dataset

CUHK occlusion dataset is includes 1063 images with occluded pedestrians.
It is used for Human Detection with occlusion handling in crowded scenes.

Details

Grand Central Station Dataset

Grand central station dataset includes a video with 50,010 frames and compressed into 1.1 GB AVI file by ffmpeg.
It is used for Scene Understanding and Crowd Analysis.

Details

MIT Trajectory Dataset (Single Camera)

MIT trajectory dataset (single camera) includes 40,453 trajectories obtained in a single camera view from a parking lot scene within five days.
It is used for Activity Analysis and Semantic Region Modeling.

Details

MIT Trajectory Dataset (Multi-Camera)

MIT trajectory dataset (multi-camera) includes four camera views. Trajectories in different camera views have been synchronized.
It is used for Activity Analysis and Scene Modeling.

Details

CUHK Person Re-Identification Dataset

This dataset contains 971 identities from two disjoint camera views. Each identity has two samples per camera view.
It is used for Person Re-identification.

Details

The Comprehensive Cars (CompCars) dataset

The Comprehensive Cars (CompCars) dataset contains data from two scenarios, including images from web-nature and surveillance-nature. The web-nature data contains 163 car makes with 1,716 car models. There are a total of 136,726 images capturing the entire cars and 27,618 images capturing the car parts. The full car images are labeled with bounding boxes and viewpoints. Each car model is labeled with five attributes. The dataset is well prepared for the following computer vision tasks: Fine-grained classification, Attribute prediction, Car model verification.

Details

Web Image Dataset for Event Recognition (WIDER)

WIDER is a dataset for complex event recognition from static images. It contains 61 event categories and around 50574 images annotated with event class labels. We provide a split of 50% for training and 50% for testing.

Details