State of the art on the Holidays dataset

This page summarizes the mAP results we are aware of on the Holidays dataset. We report only mAP, which is not the only relevant performance measure: a method with a lower mAP may still be preferable because of its memory usage, its indexing or search speed, or various other constraints.

Not all methods use the same information from the dataset. We used codes to denote what info was used:

Many papers report several numbers for different settings; we focused on the best results. mAP is reported as a percentage (i.e. 25 % = 0.25). Some figures were read from a plot, in which case their precision is limited. The difference between local and global descriptors is blurry: a plain bag-of-words can be considered either as a list of non-zero local features or as a global histogram.
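
To make the percentage convention concrete, here is a minimal sketch of how a mAP figure such as those in the table could be computed. It assumes the common definition of average precision (mean of the precision at each rank where a relevant image is returned); the official Holidays evaluation script may differ in details, for example in how it interpolates precision. The function names and the image identifiers are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """AP for one query: mean of precision@k over the ranks k where a
    relevant image appears (assumed definition, see the text above)."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("query has no relevant images")
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant images that were never retrieved contribute zero precision.
    return precision_sum / len(relevant)


def mean_average_precision_percent(results, ground_truth):
    """mAP over all queries, expressed as a percentage (25 % = 0.25)."""
    aps = [average_precision(results[q], ground_truth[q]) for q in ground_truth]
    return 100.0 * sum(aps) / len(aps)


# Hypothetical toy example with two queries.
results = {"q1": ["a", "x", "b"], "q2": ["y", "c"]}
ground_truth = {"q1": ["a", "b"], "q2": ["c"]}
map_percent = mean_average_precision_percent(results, ground_truth)
```

In this toy example the first query has AP = (1/1 + 2/3) / 2 and the second has AP = 1/2, so the reported figure would be about 66.7.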

For additions/corrections, contact matthijs dot douze at inria dot fr.

Columns: Reference | Holidays mAP | Holidays + 1M mAP | Notes
Local descriptors
"Hamming embedding and weak geometric consistency for large scale image search", Hervé Jégou, Matthijs Douze, Cordelia Schmid, ECCV 2008 75.07 61.8 Introduced the dataset
"On the burstiness of visual elements" Hervé Jégou, Matthijs Douze, Cordelia Schmid, CVPR 2009 Holidays: QE: 83.7; QE, SP: 84.8. Holidays + 1M: QE: 68.8; QE, SP: 77.32. Reweight matches to take burstiness into account
"Efficient Representation of Local Geometry for Large Scale Object Retrieval", Michal Perdoch, Ondrej Chum, Jiri Matas, CVPR 2009 SP: 82.8 As the method exploits the "gravity vector", the images were manually rotated into the correct orientation.
"Improving bag-of-features for large scale image search", Hervé Jégou, Matthijs Douze, Cordelia Schmid, IJCV 2010 Holidays: 81.3; SP: 84.8. Holidays + 1M: SP: 75.4. Added spatial verification, multiple assignment. Also used query expansion, but this did not improve over the simple spatial verification.
"Learning a Fine Vocabulary", A. Mikulik, M. Perdoch, O. Chum, J. Matas, ECCV 2010 SP,QE: 75.8 The images were rotated to correct their orientation
"Exploiting descriptor distances for precise image search" Hervé Jégou; Matthijs Douze; Cordelia Schmid, INRIA research report, 2011 QE: 86.8 -
"Hello neighbor: accurate object retrieval with k-reciprocal nearest neighbors", Danfeng Qin, Stephan Gammeter, Lukas Bossard, Till Quack, Luc Van Gool, CVPR 2011 QE: 42.3 -
"Contextual Weighting for Vocabulary Tree based Image Retrieval", Xiaoyu Wang, Ming Yang, Timothee Cour, Shenghuo Zhu, Kai Yu, Tony X. Han, ICCV 2011 78.0 57 Large vocabulary, then weighting of the entries. Relevant remarks on comparing results.
"Asymmetric Hamming Embedding", Mihir Jain, Hervé Jégou, Patrick Gros, ACM MM 2011 81.9 - Asymmetric coding of local descriptors
"Object Retrieval and Localization with Spatially-constrained Similarity Measure and k-NN Re-ranking", Xiaohui Shen, Zhe Lin, Jonathan Brandt, Shai Avidan, Ying Wu, CVPR 2012 76.2 Shortlist reranking. As Holidays has few results per query, the reranking is not very useful.
"Embedding Spatial Context Information into Inverted File for Large-Scale Image Retrieval", Zhen Liu, Houqiang Li, Wengang Zhou, Qi Tian, ACM MM 2012 60 38 -
"Visual Place Recognition with Repetitive Structures", Akihiko Torii, Josef Sivic, Tomas Pajdla, Masatoshi Okutomi, CVPR 2013 74.95 - -
"To aggregate or not to aggregate: selective match kernels for image search", Giorgos Tolias, Yannis Avrithis and Hervé Jégou, ICCV 2013 88.0 - mAP=81.0 with a binary embedding and the standard descriptors from the website
"Query Adaptive Similarity for Large Scale Object Retrieval", Danfeng Qin, Christian Wengert, Luc van Gool, CVPR 2013 84.4 -
"Semantic-aware Co-indexing for Image Retrieval", Shiliang Zhang, Ming Yang, Xiaoyu Wang, Yuanqing Lin, Qi Tian, ICCV 2013 80.86 63.34 uses another set of 1.3M distractor images
"Packing and Padding: Coupled Multi-index for Accurate Image Retrieval", Liang Zheng, Shengjin Wang, Ziqiong Liu, and Qi Tian, CVPR 2014 84.0, SP 85.8 69 Combines SIFT with local color descriptor in inverted file
"Bayes Merging of Multiple Vocabularies for Scalable Image Retrieval" Liang Zheng, Shengjin Wang, Wengang Zhou, and Qi Tian, CVPR 2014 81.92 40 The result for 1M images does not include HE
"Locality in Generic Instance Search from One Example", Ran Tao, Efstratios Gavves, Cees G.M. Snoek, Arnold W.M. Smeulders, CVPR 2014 78.7 Similar to Tolias et al ICCV13 except that they use FV instead of VLAD to aggregate descriptors in a centroid and PQ instead of HE for encoding
"Early burst detection for memory-efficient image retrieval" Shi, Avrithis, Jegou, CVPR 2015 88.1
"Query-Adaptive Late Fusion for Image Search and Person Re-identification" Liang Zheng, Shengjin Wang, Lu Tian, Fei He, Ziqiong Liu, and Qi Tian, CVPR 2015 88.0 75.3 Mixture of BoW + GIST + RAND + HS + CNN
"Pairwise Geometric Matching for Large-scale Object Retrieval", Xinchao Li, Martha Larson, Alan Hanjalic, CVPR 2015 SP 89.2 SP 85 With experiments on 10M images
Global descriptors
"Evaluation of GIST descriptors for web-scale image search" Matthijs Douze, Hervé Jégou, Sandhawalia Harsimrat, Laurent Amsaleg, Cordelia Schmid, CIVR 2009 37.6 With GIST descriptors (960 dim, uncompressed version)
"Packing bag-of-features" Hervé Jégou, Matthijs Douze, Cordelia Schmid, ICCV 2009 Holidays: 55.4 (bin. BOF); 45.2 (miniBOF). Holidays + 1M: 38.1 (bin. BOF); 24.4 (miniBOF).
"Aggregating local descriptors into a compact image representation", Hervé Jégou, Matthijs Douze, Cordelia Schmid, Patrick Pérez, CVPR 2010 52.6 32.1 With global VLAD descriptor (8192 dim).
"Combining attributes and Fisher vectors for efficient image retrieval" Matthijs Douze and Arnau Ramisa and Cordelia Schmid, CVPR 2011 69.9 6755 dim global descriptor, from many channels (BOW, GIST, color, etc.)
"Bag-of-colors for improved image search", Christian Wengert, Matthijs Douze, Hervé Jégou, ACM Multimedia, 2011 LD: 63.8 A global color descriptor (256 dim)
"Large-Scale Image Retrieval with Compressed Fisher Vectors", Florent Perronnin, Yan Liu, Jorge Sánchez, Hervé Poirier, CVPR 2011 70 64 Evaluation of Fisher vectors for retrieval. Numbers for Holidays and Holidays + 1M are not for the same method
"Asymmetric Distances for Binary Embeddings", Albert Gordo, Florent Perronnin, CVPR 2011 60.8 37 Not the same method for Holidays and Holidays + 1M. The latter uses 128 bytes / image on the database side.
"Aggregating local image descriptors into compact codes", Hervé Jégou, Florent Perronnin, Matthijs Douze, Jorge Sánchez, Patrick Pérez, Cordelia Schmid, PAMI 2012 68.9 With Fisher descriptors (262k dim)
"Query Specific Fusion for Image Retrieval", Shaoting Zhang, Ming Yang, Timothee Cour, Kai Yu, Dimitris N. Metaxas, ECCV 2012 QE: 84.64 10M vocabulary BOW + GIST + color descriptor
"Negative evidences and co-occurrences in image retrieval: the benefit of PCA and whitening" Hervé Jégou, Ondrej Chum, ECCV 2012 61.4 VLAD in 128 dim, with various improvements
"Leveraging Category-Level Labels For Instance-Level Image Retrieval", Albert Gordo, José A. Rodríguez-Serrano, Florent Perronnin, Ernest Valveny, CVPR 2012 76.8 68 Comparable with Douze&al. CVPR 2011, with cleaner setup and better results.
"Weakly Supervised Sparse Coding with Geometric Consistency Pooling", Liujuan Cao, Rongrong Ji, Yue Gao, Yi Yang, Qi Tian, CVPR 2012 LD: 79 LD: 62 Presume learnt on dataset
"All about VLAD", Relja Arandjelovic, Andrew Zisserman, CVPR 2013 64.6 - More figures in the paper with different tradeoffs
"Visual Reranking through Weakly Supervised Multi-Graph Learning", Cheng Deng, Rongrong Ji, Wei Liu, Dacheng Tao, Xinbo Gao, ICCV 2013 QE: 84.7 QE: 79.4 Obtained with very low-dimensional features (BoF + GIST + HSV) total < 4000 D (?)
"Multi-scale Orderless Pooling of Deep Convolutional Activation Features", Y. Gong, L. Wang, R. Guo, and S. Lazebnik, ECCV 2014 80.18 With low-dimensional CNN-based features (2048 D)
"Triangulation embedding and democratic aggregation for image search" Jegou, Zisserman, CVPR 2014 77.1 in 8064D mAP=72 when reduced to 1024D
"Exemplar SVMs as Visual Feature Encoders" Joaquin Zepeda and Patrick Perez, CVPR 2015 78.3 71
"FAemb: a function approximation-based embedding method for image retrieval" Thanh-Toan Do, Quang D. Tran, Ngai-Man Cheung, CVPR 2015 75.8
"Fisher Vectors Meet Neural Networks: A Hybrid Classification Architecture" Florent Perronnin and Diane Larlus, CVPR 2015 84.7 in 4096D Lower levels: SIFT + FV then fully-connected network
"Sparse Composite Quantization", Ting Zhang, Guo-Jun Qi, Jinhui Tang, Jingdong Wang, CVPR 2015 64.4 Ultra-compact image descriptors (128 bits)
Uses dataset, but does not report results
"Transform Coding for Fast Approximate Nearest Neighbor Search in High Dimensions", Jonathan Brandt, CVPR 2010 Uses SIFT descriptors from Holidays for NN search
"Robust Fusion: Extreme Value Theory for Recognition Score Normalization" W. Scheirer, A. Rocha, R. Micheals, and T. Boult, ECCV 2010 Only reports improvements over a baseline algorithm
"Reconstructing an image from its local descriptors" Philippe Weinzaepfel, Herve Jegou, Patrick Perez, CVPR 2011 Not at all about image retrieval (but still interesting!)
"Collaborative Hashing", Xianglong Liu, Junfeng He, Cheng Deng, Bo Lang, CVPR 2014 Only descriptors used
"Metric imitation by manifold transfer for efficient vision applications" Dengxin Dai, Till Kroeger, Radu Timofte, Luc Van Gool, CVPR 2015 Metric learning. Reports results on half of the dataset (the other half is used for training)

matthijs douze
Last modified: 2014-02-27