Sampling for Learning and Matching with Spatial Statistical Models
D. Huttenlocher
Spatial statistical models have recently become widely used for part-based
two-dimensional object recognition. These models are generally applied by
computing the maximum a posteriori (MAP) estimate or, equivalently, by
solving an energy minimization problem.
For reasons of computational tractability, such models generally do not capture
constraints regarding the overlap of parts, although a more refined model
that correctly accounts for overlap (such as the POP criterion introduced by
Amit and Trouvé) is often applied after finding the MAP solution. We have
found that
when using such an approach, sampling high posterior probability configurations
rather than using the MAP estimate produces significantly higher object detection
performance on standard datasets. It also produces substantially higher log
likelihoods (lower energy), suggesting that the matches found by sampling
are much better fits to the models than those found by optimization. A number
of researchers have questioned the utility of statistical models when all
they are used for is to pose energy minimization problems that could be derived
without use of statistical formalisms. We argue that the improved performance
obtained by sampling illustrates the power of taking a statistical approach.
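
The contrast between the two pipelines described above can be illustrated with
a minimal sketch. Everything in it is an assumption made for illustration: a
toy star-shaped model with one root and two parts on a six-cell 1-D grid,
hand-picked costs, and a made-up overlap penalty standing in for a refined
criterion. It is not the models, datasets, or the POP criterion referred to in
the abstract. The tractable energy has no overlap term, so the MAP
configuration can be found and the posterior (proportional to exp(-energy))
can be sampled exactly; the refined score is then applied to the candidates.

```python
import math
import random
from itertools import combinations, product

# Hypothetical toy star model on a 1-D grid: one root and two parts.
N = 6                     # number of grid cells
OFFSET = (-1, 1)          # ideal part positions relative to the root
DEFORM = 0.2              # weight of the quadratic deformation cost
UNARY_ROOT = [0.0] * N    # flat appearance cost for the root
# Both parts see their best appearance evidence at cell 3.
UNARY = [[0.0 if loc == 3 else 1.0 for loc in range(N)] for _ in OFFSET]
OVERLAP_PENALTY = 5.0     # used only by the refined score (an assumption)


def part_cost(i, root, loc):
    """Appearance plus deformation cost of part i at loc, given the root."""
    return UNARY[i][loc] + DEFORM * (loc - root - OFFSET[i]) ** 2


def energy(root, parts):
    """Tractable energy (negative log posterior up to a constant), no overlap term."""
    return UNARY_ROOT[root] + sum(part_cost(i, root, loc)
                                  for i, loc in enumerate(parts))


def map_config():
    """MAP estimate by brute force; feasible only because the toy is tiny."""
    configs = ((r, p) for r in range(N)
               for p in product(range(N), repeat=len(OFFSET)))
    return min(configs, key=lambda c: energy(*c))


def sample_config(rng):
    """Exact posterior sample: draw the root from its marginal, then each part."""
    def weights(i, root):
        return [math.exp(-part_cost(i, root, loc)) for loc in range(N)]

    # Parts are conditionally independent given the root, so summing them out
    # gives a product of per-part sums.
    root_w = [math.exp(-UNARY_ROOT[r])
              * math.prod(sum(weights(i, r)) for i in range(len(OFFSET)))
              for r in range(N)]
    root = rng.choices(range(N), weights=root_w)[0]
    parts = tuple(rng.choices(range(N), weights=weights(i, root))[0]
                  for i in range(len(OFFSET)))
    return root, parts


def refined_score(root, parts):
    """Stand-in refined criterion: tractable energy plus a penalty per overlapping pair."""
    overlaps = sum(a == b for a, b in combinations(parts, 2))
    return energy(root, parts) + OVERLAP_PENALTY * overlaps


def best_after_refinement(candidates):
    """Pick the candidate that the refined criterion scores best (lowest)."""
    return min(candidates, key=lambda c: refined_score(*c))


rng = random.Random(0)
samples = [sample_config(rng) for _ in range(200)]
map_estimate = map_config()
best = best_after_refinement(samples + [map_estimate])
```

In this toy the MAP estimate places both parts on the same cell, because
nothing in the tractable energy discourages overlap. Handing the refined
criterion a set of high-probability samples instead of that single
configuration gives it alternatives to choose from, which is the mechanism the
abstract credits for the improved results. Brute-force search is used here
only because the toy is small.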