a. Stanford Dogs Dataset
The Stanford Dogs dataset [2] consists of 120 classes of dog images with approximately 150 images per class, 100 of which are used as training images. There are 20,580 images in total. All of the classes are part of the ILSVRC2012 dataset.
Click here for the visualization of that hierarchy.
b. Oxford-IIIT Pets Dataset
The Oxford-IIIT Pets dataset [1] contains 37 classes, 12 of which are cat breeds and 25 of which are dog breeds. There are 3,680 images for training+val and 3,669 images for testing, for a total of 7,349 images. The model proposed by the authors of [1] captures the pet's shape and the appearance of its fur, and involves automatically segmenting the pet from the background.
c. Animals With Attributes Dataset
The Animals with Attributes dataset consists of 30,475 images of 50 animal classes, with 85 attributes defined for each class. We used the ImageNet hierarchy to plot the relationships between the classes.
Click here for the visualization of that hierarchy.
1. Experiments with Full Datasets
We compared the weighted one-vs-rest (w-OVR) and label embedding methods using 4096- and 65536-dimensional Fisher vectors (FV) built from concatenated color and SIFT features, without spatial pyramids. The classification results are shown in the tables below.
Stanford Dogs Dataset
[2] (features explained below) | w-OVR, 4096dim | w-OVR, 65536dim | Label Embedding, 4096dim | Label Embedding, 65536dim |
22.22 | 26.23 | 38.51 | 25.40 | 35.45 |
[2]: http://vision.stanford.edu/aditya86/ImageNetDogs/
Features used in [2]: grayscale SIFT, 256dim, 1+2+4+8 SP, histogram intersection kernel
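For reference, the histogram intersection kernel named in the feature description of [2] sums the element-wise minima of two histograms. A minimal NumPy sketch follows; the function name and the toy histograms are illustrative assumptions and are not taken from [2].

```python
import numpy as np

def histogram_intersection_kernel(X, Y):
    """Return K with K[i, j] = sum_k min(X[i, k], Y[j, k])."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Broadcast to (n_x, n_y, n_bins), take element-wise minima, sum over bins.
    return np.minimum(X[:, None, :], Y[None, :, :]).sum(axis=2)

# Toy L1-normalised histograms (illustrative values only).
a = np.array([[0.5, 0.5, 0.0]])
b = np.array([[0.25, 0.25, 0.5]])
K_ab = histogram_intersection_kernel(a, b)   # overlap between a and b
K_aa = histogram_intersection_kernel(a, a)   # a histogram fully overlaps itself
```

For L1-normalised histograms the kernel value lies between 0 (no overlap) and 1 (identical histograms).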
Oxford-IIIT Pets Dataset (baseline results)
[1] only image layout | [1] shape + image | [1] shape + image + head + body + segmentation |
39.64 | 43.30 | 59.21 |
[1]: Cats and Dogs; Omkar M Parkhi, Andrea Vedaldi, Andrew Zisserman, C. V. Jawahar; CVPR2012
Oxford-IIIT Pets Dataset (our results)
| w-OVR (image) | Label Embedding (image) |
4096dim FV | 39.80 | 42.42 |
65536dim FV | 53.51 | 51.27 |
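As a rough illustration of how the two classifier families compared above produce a prediction: a one-vs-rest model scores an image with one linear classifier per class, while a label embedding model is commonly written as a bilinear compatibility between the image feature and a per-class embedding vector. The sketch below shows prediction only, with assumed toy shapes and random weights. The weighting used when training w-OVR and the class embeddings used in these experiments are not described in this section, so this should not be read as the actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, feat_dim, embed_dim = 5, 16, 3   # toy sizes, not the real ones

x = rng.normal(size=feat_dim)               # stand-in for an image descriptor

# One-vs-rest: one weight vector per class, predict the highest-scoring class.
W_ovr = rng.normal(size=(n_classes, feat_dim))
ovr_scores = W_ovr @ x
ovr_pred = int(np.argmax(ovr_scores))

# Label embedding (assumed bilinear form): score(x, y) = x^T W phi(y),
# where phi(y) is a fixed embedding vector for class y.
Phi = rng.normal(size=(n_classes, embed_dim))   # hypothetical class embeddings
W_le = rng.normal(size=(feat_dim, embed_dim))
le_scores = Phi @ (W_le.T @ x)
le_pred = int(np.argmax(le_scores))
```

In the bilinear form the number of learned parameters depends on the embedding size rather than on the number of classes.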
2. Experiments with Subsets of Datasets
We also ran experiments using only 1, 2, 5, 10 and 25 images for training (nimg), evaluated on the same test set as before. The results are shown in the tables below.
Stanford Dogs Dataset
| nimg=1 | nimg=2 | nimg=5 | nimg=10 | nimg=25 |
4096dim FV | 2.50 | 3.41 | 5.87 | 9.32 | 14.92 |
65536dim FV | 3.02 | 3.74 | 7.06 | 11.40 | 19.11 |
Oxford-IIIT Pets Dataset
| nimg=1 | nimg=2 | nimg=5 | nimg=10 | nimg=25 |
4096dim FV | 8.67 | 9.78 | 15.63 | 22.49 | 30.10 |
65536dim FV | 9.18 | 12.19 | 18.66 | 25.49 | 35.90 |
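The subset protocol can be sketched as follows. This is a minimal illustration, assuming the nimg training images are drawn at random per class (the text does not say how they were chosen); the function name `sample_training_subset` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict

def sample_training_subset(labels, nimg, seed=0):
    """Pick up to `nimg` training indices per class (assumed protocol)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    subset = []
    for label in sorted(by_class):
        indices = by_class[label]
        k = min(nimg, len(indices))          # guard against small classes
        subset.extend(rng.sample(indices, k))
    return sorted(subset)

# Toy example: 3 classes with 4 training images each.
toy_labels = [0] * 4 + [1] * 4 + [2] * 4
subsets = {n: sample_training_subset(toy_labels, n) for n in (1, 2)}
```

The test set is left untouched, so results for different values of nimg remain comparable.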