Data sets

Most of the data sets used in my recent papers are available from the LEAR data webpage.

Data for non-linear dimensionality reduction

Suppose we observe a system with only a few degrees of freedom through a high-dimensional sensor (e.g. a camera measuring thousands of dimensions given by the pixel-color intensities). If we assume that the mapping from the state of the system (e.g. the direction a face is looking) to our measurement of it (the image we make of the face) is continuous, then our measurements will lie on a low-dimensional subspace of the sensor space. This subspace may be non-linear (i.e. a curved manifold), which renders linear feature extraction methods (e.g. Principal Component Analysis, Independent Component Analysis) unfit to extract it.
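The point about linear methods can be illustrated with a small synthetic sketch. It is not taken from the papers: the curve, the 10-dimensional "sensor" space, and all variable names are assumptions chosen for illustration. A single degree of freedom (an angle) is mapped continuously but non-linearly into the sensor space, and PCA then reports three significant linear directions even though the data are intrinsically one-dimensional.

```python
import numpy as np

rng = np.random.default_rng(0)

# One degree of freedom: an angle t, standing in for e.g. head direction.
t = rng.uniform(0.0, 2.0 * np.pi, size=500)

# Continuous, non-linear map from the 1-D state to a 10-D "sensor" space:
# three coordinates trace a closed curve, then a fixed random linear map
# embeds that curve in 10 dimensions. (Illustrative choice, not real data.)
curve = np.column_stack([np.cos(t), np.sin(t), np.cos(2.0 * t)])
embed = rng.standard_normal((3, 10))
X = curve @ embed

# PCA: eigenvalues of the sample covariance, sorted in decreasing order.
Xc = X - X.mean(axis=0)
evals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]
explained = evals / evals.sum()

# Count directions that carry a non-negligible share of the variance.
n_significant = int(np.sum(explained > 1e-6))
print("intrinsic dimension: 1, significant PCA directions:", n_significant)
```

PCA needs three directions here because the curve is built from three linearly independent functions of the angle; a non-linear method would ideally recover the single underlying coordinate.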

Two of the data sets used in my papers are available for download over HTTP:

Collection of 100 images of a face that looks from left to right (20-dimensional PCA projection).
Collection of 2000 images of a face that also looks up and down (100-dimensional PCA projection).

Both sets are MATLAB files containing the following variables:

IMS	one image per row; each row has 1600 columns holding a 40x40 pixel gray-value image (the mean image has been subtracted!)
MEAN	mean image
PCA	PCA projection of the images
varX	variances in the individual pixels
Evals	leading eigenvalues of the covariance matrix of IMS
Evecs	corresponding leading eigenvectors of the covariance matrix of IMS, used to map from PCA to IMS
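As a rough sketch of how these variables relate, the snippet below maps PCA coordinates back to pixel space. It runs on synthetic stand-in data rather than the real files, and it rests on assumptions that should be checked against the actual array shapes: that Evecs stores one eigenvector per column (1600 x k), that PCA stores one image per row (n x k), and that MEAN holds 1600 values. The helper name `reconstruct_images` is hypothetical.

```python
import numpy as np

def reconstruct_images(pca, evecs, mean):
    """Map PCA coordinates back to pixel space (assumed layout, see text)."""
    centered = pca @ evecs.T              # (n, k) @ (k, 1600) -> (n, 1600)
    return centered + mean.reshape(1, -1)  # add the mean image back

# Synthetic stand-in with the same layout as described in the table.
rng = np.random.default_rng(0)
n, d, k = 50, 1600, 20
raw = (rng.standard_normal((n, k)) @ rng.standard_normal((k, d))
       + rng.uniform(0.0, 255.0, size=d))

MEAN = raw.mean(axis=0)   # mean image
IMS = raw - MEAN          # mean-subtracted images, one per row

# Leading eigenvectors/eigenvalues of the covariance of IMS via an SVD.
_, s, vt = np.linalg.svd(IMS, full_matrices=False)
Evecs = vt[:k].T                  # (1600, k), one eigenvector per column
Evals = s[:k] ** 2 / (n - 1)      # matching eigenvalues
PCA = IMS @ Evecs                 # (n, k) projection of the images

recon = reconstruct_images(PCA, Evecs, MEAN)

# View one reconstructed row as a 40x40 image. MATLAB stores arrays
# column-major, so the real data may need order="F" here (assumption).
first_image = recon[0].reshape(40, 40)
print(recon.shape, first_image.shape)
```

For the real files, `scipy.io.loadmat` can read MATLAB files (format versions before 7.3) into a dictionary keyed by variable name. The synthetic data here have rank 20 by construction, so 20 components reconstruct them essentially exactly; with the real images the reconstruction is only an approximation.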

I used these data sets in several papers, including 2006 papers in IEEE PAMI and Pattern Recognition.