Here's a collection of Matlab scripts available for non-commercial use. Please email me with questions and suggestions.
A compact and efficient k-means implementation. 12/7/2011
[Matlab implementation]
A MATLAB implementation of the Coordinated Factor Analysis (CFA) model described in my 2006 PAMI paper can be found in the Matlab Toolbox for Dimensionality Reduction by Laurens van der Maaten (thanks!).
Implementation of (smoothed) LDA and PLSA.
Includes the option to fix the word-topic distributions to evaluate the topic distributions for new documents.
Contains a number of options to estimate/fix the hyper-parameters alpha and eta, to use point-estimates / Dirichlet estimates of theta and beta (if both are point estimates PLSA is recovered), to have (non-)symmetric
priors on theta and beta, etc.
Code requires Tom Minka's Lightspeed and FastFit toolboxes.
August 23, 2006.
[Matlab implementation]
[Blei's paper on LDA]
This MATLAB code implements Binary PCA, as well as mixtures and HMMs with Binary PCA components.
Like normal PCA, Binary PCA finds a low-rank approximation for a given data matrix.
In the case of normal PCA, the approximation error is given by the squared Frobenius norm of the residual matrix.
In the case of Binary PCA, the approximation error is given by the negative summed log-likelihood of the entries of the data matrix, where the likelihood of each entry is given by a Bernoulli distribution whose log-odds parameter is the corresponding entry in the low-rank matrix.
Rather than just binary, the data matrix may also contain scalars in [0,1] in which case a weighted log-likelihood is calculated.
May 14, 2007.
[Matlab implementation]
[Binary PCA paper by Schein et al.]
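The Binary PCA objective described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the MATLAB code linked above: the function names `log_sigmoid`, `low_rank` and `bernoulli_loglik` are made up for this example, the low-rank matrix is assumed to be a plain product of two factor matrices, and any offset term is left out.

```python
import math


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def low_rank(U, V):
    """Log-odds matrix Theta = U V^T for U (N x L) and V (D x L).

    The factor names U and V are assumptions made for this sketch.
    """
    return [[sum(u * v for u, v in zip(u_row, v_row)) for v_row in V]
            for u_row in U]


def bernoulli_loglik(X, Theta):
    """Summed Bernoulli log-likelihood of X given log-odds Theta.

    Entries of X may be 0/1 or any scalar in [0, 1]; in the latter case
    the two log-probabilities are weighted by x and 1 - x.
    """
    total = 0.0
    for x_row, t_row in zip(X, Theta):
        for x, theta in zip(x_row, t_row):
            # log p(1) = log sigmoid(theta), log p(0) = log sigmoid(-theta)
            total += x * log_sigmoid(theta) + (1.0 - x) * log_sigmoid(-theta)
    return total
```

The approximation error is the negative of the returned value, so a better low-rank fit gives a larger log-likelihood. With all log-odds equal to zero, every entry contributes log(0.5).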
A compact Matlab script performing the EM iterations for PLSA.
Includes the option to fix the word-topic distributions to evaluate the topic distributions for new documents.
May 14, 2007.
[Matlab implementation]
[Hofmann's paper on PLSA]
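The option of fixing the word-topic distributions can be illustrated with a short sketch of the EM updates for a single new document. This is plain Python written for this page, not the Matlab script linked above; the name `fold_in` and its arguments are hypothetical. Only the topic proportions of the document are updated, while the word-topic distributions are kept fixed.

```python
def fold_in(counts, p_w_given_z, n_iter=50):
    """Estimate p(z | d) for one new document with fixed p(w | z).

    counts      : list of W word counts for the new document
    p_w_given_z : K lists of length W, each a fixed word distribution
    Returns a list of K topic proportions that sums to one.
    """
    K = len(p_w_given_z)
    p_z = [1.0 / K] * K  # uniform initialisation
    for _ in range(n_iter):
        new = [0.0] * K
        for w, n in enumerate(counts):
            if n == 0:
                continue
            # E-step: posterior over topics for word w in this document
            joint = [p_z[k] * p_w_given_z[k][w] for k in range(K)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # word has zero probability under every topic
            for k in range(K):
                new[k] += n * joint[k] / norm
        total = sum(new)
        if total == 0.0:
            return p_z  # nothing to learn from this document
        # M-step: re-estimate the topic proportions only
        p_z = [v / total for v in new]
    return p_z
```

For example, with two topics that put their mass on different words, a document dominated by the second topic's words gets most of its proportion assigned to the second topic.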
Implementation of the Mixture of
Factor Analyzers model. Allows setting noise models to be equal for all
components and/or to set the noise model to be isotropic. In the
latter case the Mixture of Probabilistic Principal Component Analyzers
is obtained. October 3, 2005.
[Matlab implementation]
[Paper on mixtures of probabilistic PCA]
[Paper on mixtures of Factor Analyzers]
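The relation between the noise-model options can be written out as follows. The notation is the standard one for this model and is an assumption here, since the page itself does not fix any symbols: \pi_k are mixing weights, \mu_k component means, \Lambda_k factor loading matrices and \Psi_k diagonal noise covariances.

```latex
% Mixture of Factor Analyzers density
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\left(x;\ \mu_k,\ \Lambda_k \Lambda_k^{\top} + \Psi_k\right),
\qquad \Psi_k \ \text{diagonal}

% Noise model shared by all components
\Psi_k = \Psi \quad \text{for all } k

% Isotropic noise: Mixture of Probabilistic Principal Component Analyzers
\Psi_k = \sigma_k^2 I
```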
Standard mixture learning algorithms like EM and k-means are slow for large datasets.
For k-means there exists an accelerated version that uses a kd-tree and is exact
(Pelleg and Moore, 1999).
A similar approximate technique exists for EM
(Moore, 1999)
but with no convergence guarantees.
In our 2006 DMKD paper we present a variational approximation to the EM algorithm for Gaussian mixtures, which results in a provably convergent scheme with speedups that are at least linear in the sample size.
This code also implements our greedy Gaussian mixture learning algorithm from the 2003 Neural Computation paper.
[Matlab implementation]
[Data Mining and Knowledge Discovery 2006 paper]
[Neural Computation 2003 paper]
By optimizing a free energy with a constrained
EM algorithm, we obtain an algorithm very
similar to Kohonen's SOM, but which provably converges and optimizes an objective function.
[Matlab implementation]
[Neurocomputing paper]
A Matlab script that performs EM to find principal components. Missing data is handled
by a variational EM algorithm, which gives the algorithm a
runtime linear in the number of data points, the number of data dimensions, and the number
of principal components. The objective function being optimized
is a lower bound on the data log-likelihood. Based on "sensible principal
components analysis" by Sam Roweis.
If you use this code, please cite this paper, in the context of which the code was written.
[Matlab implementation + PDF Note]
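As a rough illustration of the EM iteration for principal components, the sketch below covers only the simplest case: complete data, zero-mean inputs and a single component. It is plain Python written for this page, not the Matlab script linked above, and it does not attempt the variational treatment of missing data. The function name `em_pca_first_component` is hypothetical.

```python
import math


def em_pca_first_component(Y, n_iter=100):
    """EM iterations for the first principal direction.

    Y : list of N data points, each a list of D floats, assumed zero mean.
    Returns a unit-norm direction vector of length D.
    Each iteration costs O(N * D) operations.
    """
    N, D = len(Y), len(Y[0])
    w = [1.0 / (d + 1) for d in range(D)]  # deterministic non-zero start
    for _ in range(n_iter):
        # E-step: project every data point onto the current direction
        ww = sum(v * v for v in w)
        x = [sum(wd * yd for wd, yd in zip(w, y)) / ww for y in Y]
        # M-step: least-squares fit of the direction given the projections
        xx = sum(v * v for v in x)
        w = [sum(x[n] * Y[n][d] for n in range(N)) / xx for d in range(D)]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]
```

Neither step forms the D x D covariance matrix, which is what keeps the cost of an iteration linear in the number of data points and dimensions for a fixed number of components.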
An algorithm for vector quantization that
builds the solution by iteratively inserting quantizers.
[Matlab implementation]
[Pattern Recognition paper]
An algorithm that finds principal curves
by fitting a set of local linear models which are combined to form curves.
[Matlab implementation]
[Pattern Recognition Letters paper]