Data Sets & Images

EVVE dataset

The EVent VidEo (EVVE) dataset, used in the paper "Event retrieval in large video collections with circulant temporal encoding" (CVPR 2013).

Face Track Annotations Dataset

Track annotations for the dataset used in the paper "Unsupervised Metric Learning for Face Identification in TV Video" (ICCV 2011).

Actom annotations for action detection

Actom annotations for the datasets used in the paper "Actom Sequence Models for Efficient Action Detection" (CVPR 2011).

Labeled Yahoo! News

This data set extends the Labeled Faces in the Wild data set. It consists of news documents composed of images and captions. We used it for face naming and for learning face recognition systems with weak supervision in our ECCV 2010 paper and a submitted IJCV paper. It is fully annotated with the association between faces in each image and names in the caption.

"Web Queries" dataset

A labeled data set collected using an image search engine. It contains 71478 images and text metadata in XML format, retrieved by 353 text queries, with a relevance label for each image. This data set was used in our CVPR'10 paper "Improving web-image search results using query-relative classifiers".

Web images for multiple query terms

Web images collected from Flickr for 20 object categories and 20 combinations of object categories. This data set was used in "Ranking user-annotated images for multiple query terms", published in BMVC 2009.

INRIA Features for some data sets

INRIA features for the COREL 5K, IAPR TC-12, ESP GAME, PASCAL VOC 2007 and MIR Flickr data sets, as used in the ICCV 2009 paper on image auto-annotation and keyword-based retrieval and the CVPR 2010 paper on multimodal semi-supervised learning.

Image indexing database

Holidays dataset collected by Hervé Jégou et al. to test image search methods.

Hollywood Human Actions2

Hollywood Human Actions2 dataset. An extended version of our Hollywood Human Actions dataset featuring more action classes and samples. The dataset was used in "Actions in context", published in CVPR'09.

Hollywood Human Actions

Hollywood Human Actions. A video dataset focusing on realistic human actions. Short video samples were retrieved from various popular movies and annotated both manually and automatically. The dataset was used in the paper "Learning Realistic Human Actions from Movies", published in CVPR'08 (oral). The covered set of human actions includes answering a phone, getting out of a car, handshaking, hugging, kissing, sitting down, sitting up and standing up.

INRIA Annotations for Graz-02

INRIA Annotations for Graz-02. A follow-up on the popular natural-scene object category dataset prepared at Graz University of Technology. The original dataset images were re-annotated by a team of human annotators led by Marcin Marszalek, who then used the annotations to perform accurate object localization with shape masks (CVPR 2007 oral). All car, bike and people images, annotations and image lists are made available.

Color Names Data Sets

Color Names Data Sets. Two data sets collected for the automatic learning of color names, as proposed in "Learning Color Names from Real-World Images", published in CVPR 2007.

Soccer Team Data Sets

Soccer Team Data Set. A data set consisting of seven soccer teams. It has been used to evaluate various color descriptors in "Coloring Local Feature Extraction", published in ECCV 2006.

Horse Data Set

Horse Dataset (navigate to INRIA horses). A set of horse and non-horse images collected by Frédéric Jurie and Vittorio Ferrari.

INRIA Person Data Set

INRIA Person Dataset. A large set of marked-up images of standing or walking people, used to train Navneet Dalal's CVPR 2005 human detector.

INRIA Car Data Set

INRIA Car Dataset. A set of car and non-car images taken in a parking lot near INRIA. It was collected by Peter Carbonetto and Gyuri Dorkó and used in the submitted IJCV paper "Learning to recognize objects with little supervision".

Interest Point Test Sequences

Test images collected by Krystian Mikolajczyk for testing scale- and affine-invariant interest point detectors with various types of local image descriptors.