If we observe a system that has only a few degrees of freedom with a high-dimensional sensor (e.g. a camera measuring thousands of dimensions given by the pixel-color intensities), and we assume that the mapping from the state of the system (e.g. the direction a face is looking) to our measurement of it (the image we take of the face) is continuous, then our measurements will lie on some low-dimensional subspace of the sensor space. This subspace may be non-linear, which makes linear feature extraction methods (e.g. Principal Component Analysis, Independent Component Analysis) unfit to extract it.
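To make the argument concrete, here is a minimal sketch using synthetic data, not the data sets described on this page. The hidden state is a single number (the position of a bump), the "sensor" has 50 pixels, and the mapping from state to image is continuous. The names `bump_images` and `components_needed`, and all parameter values, are illustrative assumptions. Counting the principal components needed to retain 95% of the variance shows that linear PCA needs several directions to describe data that has one degree of freedom.

```python
import numpy as np


def bump_images(n_samples=200, n_pixels=50, width=3.0):
    """Images of a Gaussian bump whose position is the only degree of freedom."""
    positions = np.linspace(0.0, n_pixels, n_samples, endpoint=False)
    pixels = np.arange(n_pixels)
    # Circular distance between every pixel and the bump centre.
    dist = np.abs(pixels[None, :] - positions[:, None])
    dist = np.minimum(dist, n_pixels - dist)
    return np.exp(-0.5 * (dist / width) ** 2)


def components_needed(X, fraction=0.95):
    """Number of principal components needed to retain `fraction` of the variance."""
    Xc = X - X.mean(axis=0)
    singular_values = np.linalg.svd(Xc, compute_uv=False)
    variances = singular_values ** 2
    cumulative = np.cumsum(variances) / variances.sum()
    # First index at which the cumulative share reaches the requested fraction.
    return int(np.searchsorted(cumulative, fraction) + 1)


X = bump_images()          # shape (200, 50): one image per row
k = components_needed(X)   # well above 1, although the state is 1-dimensional
print("degrees of freedom: 1, PCA components for 95% variance:", k)
```

The bump traces out a closed curve in the 50-dimensional sensor space, so no single linear direction can capture it.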
Two of the data sets used in my papers are available via http.
Both sets are MATLAB files containing the following variables:
IMS | one image per row; each row has 1600 columns holding a 40x40 pixel gray-value image (the mean image has been subtracted!) |
MEAN | mean image |
PCA | PCA projection of the images |
varX | variances in the individual pixels |
Evals | leading eigenvalues of the covariance matrix of IMS |
Evecs | corresponding leading eigenvectors of the covariance matrix of IMS, used to map from PCA to IMS |
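The variables above can be read in Python and used to map PCA coordinates back to images. What follows is a sketch under stated assumptions, not a description of how the files were produced. I assume `PCA` has one row per image, `Evecs` stores one eigenvector per column (1600 rows), and `MEAN` holds 1600 values; the files themselves may use different orientations, so check the array shapes first. The helper names `load_dataset` and `reconstruct_images` are hypothetical. The demonstration at the end uses random stand-in data with the same variable names, so it runs without the real files.

```python
import numpy as np


def load_dataset(path):
    """Read a .mat file into a dict mapping variable names to NumPy arrays."""
    from scipy.io import loadmat  # imported lazily; only needed for real files
    return loadmat(path)


def reconstruct_images(data, add_mean=True):
    """Map PCA coordinates back to 40x40 images.

    Assumed shapes (not confirmed by the source):
    PCA (n_images, k), Evecs (1600, k), MEAN with 1600 entries.
    """
    pca = np.asarray(data["PCA"], dtype=float)
    evecs = np.asarray(data["Evecs"], dtype=float)
    ims = pca @ evecs.T  # approximation of the mean-subtracted IMS
    if add_mean:
        ims = ims + np.asarray(data["MEAN"], dtype=float).reshape(1, -1)
    # Pixel ordering is an assumption; MATLAB stores arrays column-major,
    # so use reshape(..., order="F") if the images come out transposed.
    return ims.reshape(-1, 40, 40)


# Stand-in data with the same variable names (random, not the real images).
rng = np.random.default_rng(0)
raw = rng.random((30, 1600))
mean = raw.mean(axis=0)
ims = raw - mean
_, _, vt = np.linalg.svd(ims, full_matrices=False)
evecs = vt.T  # (1600, 30): one eigenvector of the covariance matrix per column
demo = {"IMS": ims, "MEAN": mean, "PCA": ims @ evecs, "Evecs": evecs}

images = reconstruct_images(demo)
print(images.shape)
```

With a real file, `reconstruct_images(load_dataset(path))` would be the analogous call. Because only the leading eigenvectors are stored, the result approximates the original images rather than reproducing them exactly.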
I used these data sets in several papers, including papers published in 2006 in IEEE PAMI and Pattern Recognition.