Zeynep Akata

a. Stanford Dogs Dataset

The Stanford Dogs dataset [2] comprises 120 classes of dog images with approximately 150 images per class, 100 of which are used for training (12,000 training images in all). There are 20,580 images in total. All classes belong to the ILSVRC2012 dataset. Click here for a visualization of the class hierarchy.

b. Oxford-IIIT Pets Dataset

The Oxford-IIIT Pets dataset [1] contains 37 classes, 12 of which are cat breeds and 25 of which are dog breeds. There are 3680 images for training+val and 3669 images for testing, for a total of 7349 images. The model proposed by the authors of [1] captures the pet's shape and the appearance of its fur, and involves automatically segmenting the pet from the background.

c. Animals With Attributes Dataset

The Animals with Attributes dataset consists of 30475 images of 50 animal classes, with 85 attributes defined for each class. We used the ImageNet hierarchy to plot the relationships between the classes. Click here for the visualization of that hierarchy.

1. Experiments with Full Datasets

We compared weighted one-vs-rest (w-OVR) and label embedding methods using 4096- and 65536-dimensional Fisher Vectors (FV) built from concatenated color and SIFT features, without spatial pyramids. The classification results are shown in the tables below.
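
As a rough illustration of the one-vs-rest setup compared above, the sketch below trains nothing; it only shows the two ingredients involved: a per-class hinge loss in which positive examples are up-weighted, and prediction by taking the highest-scoring class. The page does not say how the weighting in w-OVR is defined, so the up-weighting of positives, the `pos_weight` parameter, and the function names `weighted_hinge_loss` and `predict_ovr` are all assumptions made for illustration.

```python
def weighted_hinge_loss(score, is_positive, pos_weight):
    """Hinge loss for one binary (class vs. rest) problem.

    Assumption: "weighted" means positive examples are scaled by
    `pos_weight` to offset the many negatives in one-vs-rest training.
    """
    y = 1.0 if is_positive else -1.0
    loss = max(0.0, 1.0 - y * score)
    return pos_weight * loss if is_positive else loss


def predict_ovr(class_weights, x):
    """Return the index of the class whose linear scorer rates x highest."""
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in class_weights]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy usage with two hypothetical classes and 2-dim features.
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
toy_prediction = predict_ovr(toy_weights, [0.2, 0.9])
```

In an actual experiment the feature vector `x` would be an image's Fisher Vector and there would be one weight vector per class; the toy values here only exercise the functions.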

**Stanford Dogs Dataset**
| [2] (features explained below) | w-OVR, 4096dim | w-OVR, 65536dim | Label Embedding, 4096dim | Label Embedding, 65536dim |
|---|---|---|---|---|
| 22.22 | 26.23 | 38.51 | 25.40 | 35.45 |

[2]: http://vision.stanford.edu/aditya86/ImageNetDogs/
Features used in [2]: grayscale SIFT, 256dim, 1+2+4+8 SP (spatial pyramid), histogram intersection kernel

**Oxford-IIIT Pets Dataset (baseline results)**
| [1] image layout only | [1] shape + image | [1] shape + image + head + body + segmentation |
|---|---|---|
| 39.64 | 43.30 | 59.21 |

[1]: Cats and Dogs; Omkar M Parkhi, Andrea Vedaldi, Andrew Zisserman, C. V. Jawahar; CVPR2012

**Oxford-IIIT Pets Dataset (our results)**
| | w-OVR (image) | Label Embedding (image) |
|---|---|---|
| 4096dim FV | 39.80 | 42.42 |
| 65536dim FV | 53.51 | 51.27 |

2. Experiments with Subsets of Datasets

We also ran experiments using 1, 2, 5, 10 and 25 training images (nimg) while keeping the same test set. The results are shown in the tables below.
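
The subset protocol described above can be sketched as follows. The page does not say whether the image counts are per class or how the images were picked, so this sketch assumes uniform random sampling of `nimg` images per class with a fixed seed; the function name `sample_training_subset` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def sample_training_subset(labels, nimg, seed=0):
    """Pick up to `nimg` training indices per class.

    Assumption: counts are per class and sampling is uniform at random.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)  # fixed seed keeps the subsets reproducible
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        chosen.extend(rng.sample(indices, min(nimg, len(indices))))
    return sorted(chosen)


# Toy example: 3 hypothetical classes with 30 training images each.
toy_labels = [c for c in range(3) for _ in range(30)]
subsets = {n: sample_training_subset(toy_labels, n) for n in (1, 2, 5, 10, 25)}
```

Only the training indices change between runs; the test set is left untouched, which matches the description of evaluating every subset on the same test images.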

**Stanford Dogs Dataset**
| | nimg=1 | nimg=2 | nimg=5 | nimg=10 | nimg=25 |
|---|---|---|---|---|---|
| 4096dim FV | 2.50 | 3.41 | 5.87 | 9.32 | 14.92 |
| 65536dim FV | 3.02 | 3.74 | 7.06 | 11.40 | 19.11 |

**Oxford-IIIT Pets Dataset**
| | nimg=1 | nimg=2 | nimg=5 | nimg=10 | nimg=25 |
|---|---|---|---|---|---|
| 4096dim FV | 8.67 | 9.78 | 15.63 | 22.49 | 30.10 |
| 65536dim FV | 9.18 | 12.19 | 18.66 | 25.49 | 35.90 |