Data Sets & Images

AVA dataset

A video dataset of spatio-temporally localized atomic visual actions, introduced in this paper.
Available here.

DALY dataset

The dataset for spatio-temporal action detection, introduced in "Towards Weakly-Supervised Action Localization" (arXiv), is available here.

Rome Patches

The dataset introduced in the Patch-CKN paper is available here.

Action Movie Franchises

A dataset, introduced in the arXiv paper Beat-Event Detection in Action Movie Franchises, is available here.

Video alignment datasets

The datasets with temporally aligned video clips of a Climbing session and a Madonna concert, introduced in the arXiv paper Circulant temporal encoding for video retrieval and temporal alignment, are available here.

YouTube Motion Boundaries dataset

A dataset with motion boundaries annotations, introduced in "Learning to Detect Motion Boundaries" (CVPR'15), is available here.

MED-Summaries dataset

A video summarization dataset, introduced in "Category-specific video summarization" (ECCV'14), is available here.

Poses in the Wild Dataset

The dataset for evaluating human pose estimation in video sequences, introduced in our CVPR'14 paper Mixing Body-Part Sequences for Human Pose Estimation, is available on the project page.

EVVE dataset

Here is the EVent VidEo dataset used in the paper "Event retrieval in large video collections with circulant temporal encoding" (CVPR 2013).

YouTube-Objects dataset

This dataset is composed of videos collected from YouTube by querying for the names of 10 object classes. It contains between 9 and 24 videos for each class, and can be downloaded from here.

Face Track Annotations Dataset

Track annotations for the dataset used in the paper "Unsupervised Metric Learning for Face Identification in TV Video" (ICCV 2011).

Actom annotations for action detection

Actom annotations for the datasets used in the paper "Actom Sequence Models for Efficient Action Detection" (CVPR 2011).

Labeled Yahoo! News

This data set extends the Labeled Faces in the Wild data set. It consists of news documents composed of images and captions. We used it for face naming and for learning face recognition systems with weak supervision in our ECCV 2010 paper and in a submitted IJCV paper. It is fully annotated for the association of faces in the image with names in the caption.

"Web Queries" dataset

A labeled data set collected using an image search engine. It contains 71478 images and text meta-data in XML format, retrieved by 353 text queries and accompanied by a relevance label for each image. This data set was used in our CVPR'10 paper Improving web-image search results using query-relative classifiers.

Web images for multiple query terms

Web images collected from Flickr for 20 object categories and 20 combinations of object categories. This data set was used in Ranking user-annotated images for multiple query terms published in BMVC 2009.

INRIA Features for some data sets

INRIA features for the COREL 5K, IAPR TC-12, ESP GAME, PASCAL VOC 2007 and MIR Flickr data sets, as used in the ICCV 2009 paper on image auto-annotation and keyword-based retrieval and the CVPR 2010 paper on multimodal semi-supervised learning.

Image indexing database

Holidays dataset collected by Hervé Jégou et al. to test image search methods.

Hollywood Human Actions2

Hollywood Human Actions2 dataset. An extended version of our Hollywood Human Action dataset featuring more action classes and samples. The dataset was used in Actions in context published in CVPR'09.

Hollywood Human Actions

Hollywood Human Actions. A video dataset focusing on realistic human actions. Short video samples were retrieved from various popular movies and annotated both manually and automatically. The dataset was used in Learning Realistic Human Actions from Movies, published at CVPR'08 (oral). The covered set of human actions includes answering a phone, getting out of a car, handshaking, hugging, kissing, sitting down, sitting up and standing up.

INRIA Annotations for Graz-02

INRIA Annotations for Graz-02. A follow-up on the popular natural-scene object category dataset prepared at Graz University of Technology. The original dataset images were re-annotated by a team of human annotators led by Marcin Marszalek, who then used the annotations to perform accurate object localization with shape masks (CVPR 2007 oral). All car, bike and person images, annotations and image lists are made available.

Color Names Data Sets

Color Names Data Sets. Two data sets collected for the automatic learning of color names as proposed in Learning Color Names from Real-World Images published in CVPR 2007.

Soccer Team Data Sets

Soccer Team Data Set. A data set covering seven soccer teams. It has been used to evaluate various color descriptors in Coloring Local Feature Extraction, published in ECCV 2006.

Horse Data Set

Horse Dataset (navigate to INRIA horses). A set of horse and non-horse images collected by Frédéric Jurie and Vittorio Ferrari.

INRIA Person Data Set

INRIA Person Dataset. A large set of marked up images of standing or walking people, used to train Navneet Dalal's CVPR 2005 human detector.

INRIA Car Data Set

INRIA Car Dataset. A set of car and non-car images taken in a parking lot near INRIA. It was collected by Peter Carbonetto and Gyuri Dorkó and used in the submitted IJCV journal paper Learning to recognize objects with little supervision.

Interest Point Test Sequences

Test images collected by Krystian Mikolajczyk for testing scaled and affine interest point detectors with various types of local image descriptors.