The dataset for evaluating human pose estimation in video sequences, introduced in our CVPR'14 paper Mixing Body-Part Sequences for Human Pose Estimation, is available on the project page.
The EVVE (EVent VidEo) dataset used in the paper "Event retrieval in large video collections with circulant temporal encoding" (CVPR 2013).
Track annotations for the dataset used in the paper "Unsupervised Metric Learning for Face Identification in TV Video" (ICCV 2011).
Actom annotations for the datasets used in the paper "Actom Sequence Models for Efficient Action Detection" (CVPR 2011).
This data set extends the Labeled Faces in the Wild data set. It consists of news documents composed of images and captions; we used it for face naming and for learning face recognition systems with weak supervision in our ECCV 2010 paper and a submitted IJCV paper. It is fully annotated with associations between faces in the images and names in the captions.
A labeled data set collected using an image search engine. It contains 71,478 images and text meta-data in XML format retrieved by 353 text queries, together with a relevance label for each image. This data set was used in our CVPR'10 paper Improving web-image search results using query-relative classifiers.
Web images collected from Flickr for 20 object categories and 20 combinations of object categories. This data set was used in Ranking user-annotated images for multiple query terms, published at BMVC 2009.
INRIA features for the COREL 5K, IAPR TC-12, ESP GAME, PASCAL VOC 2007 and MIR Flickr data sets, as used in the ICCV 2009 paper on image auto-annotation and keyword-based retrieval and the CVPR 2010 paper on multimodal semi-supervised learning.
Holidays dataset collected by Hervé Jégou et al. to test image search methods.
Hollywood Human Actions. A video dataset focusing on realistic human actions. Short video samples were retrieved from various popular movies and annotated both manually and automatically. The dataset was used in Learning Realistic Human Actions from Movies, published at CVPR'08 (oral). The covered set of human actions includes answering a phone, getting out of a car, handshaking, hugging, kissing, sitting down, sitting up and standing up.
INRIA Annotations for Graz-02. A follow-up on the popular natural-scene object category dataset prepared at Graz University of Technology. Original dataset images were re-annotated by a team of human annotators led by Marcin Marszalek, who then used the annotations to perform accurate object localization with shape masks (CVPR 2007 oral). All car, bike and person images, annotations and image lists are made available.
INRIA Car Dataset. A set of car and non-car images taken in a parking lot near INRIA. It was collected by Peter Carbonetto and Gyuri Dorkó and used in the submitted IJCV journal paper Learning to recognize objects with little supervision.
Test images collected by Krystian Mikolajczyk for testing scale- and affine-invariant interest point detectors with various types of local image descriptors.