Category level object segmentation by combining bag-of-words models with Dirichlet processes and random fields
International Journal of Computer Vision, Volume 88, Number 2, June 2010
This paper addresses the problem of accurately
segmenting instances of object classes in images without
any human interaction. Our model combines a bag-of-words
recognition component with spatial regularization based on
a random field and a Dirichlet process mixture. Bag-of-words
models successfully predict the presence of an object
within an image; however, they cannot accurately locate
object boundaries. Random fields take into account the
spatial layout of images and provide local spatial regularization.
Yet, as they use local coupling between image labels,
they fail to capture larger scale structures needed for object
recognition. These components are combined with a Dirichlet
process mixture, which models images as a composition of
regions, each representing a single object instance. Gibbs
sampling is used for parameter estimation and object segmentation.
Our model successfully segments object category instances,
despite cluttered backgrounds and large variations
in appearance and viewpoints. The strengths and limitations
of our model are shown through extensive experimental
evaluations. First, we compare two methods for
building visual vocabularies. Second, we show how to combine
strong labeling (segmented images) with weak labeling (images
annotated with bounding boxes), in order to limit the labeling
effort needed to learn the model. Third, we study the
effect of different initializations. We present results on four
image databases, including the challenging PASCAL VOC
2007 data set, on which we obtain state-of-the-art results.
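
As a rough illustration of why a Dirichlet process mixture can describe an image as a composition of an unknown number of regions, the sketch below draws a partition from a Chinese restaurant process, the clustering prior implied by a Dirichlet process. It is not the paper's model: it ignores appearance, the random field coupling and the Gibbs sampler, and the function name `crp_assignments`, the concentration parameter `alpha` and the `seed` argument are assumptions made only for this example.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Draw a partition of n_items from a Chinese restaurant process prior.

    Hypothetical helper for illustration only; alpha must be positive.
    """
    rng = random.Random(seed)
    assignments = []  # assignments[i] = cluster index of item i
    counts = []       # counts[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments, counts


if __name__ == "__main__":
    labels, sizes = crp_assignments(n_items=20, alpha=1.0, seed=0)
    print("cluster labels:", labels)
    print("cluster sizes:", sizes)
```

In this sketch the number of clusters is not fixed in advance; larger values of `alpha` tend to produce more clusters. Treating each item as an image patch and each cluster as a region gives an intuition for the role the abstract assigns to the Dirichlet process mixture.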
BibTeX reference
@Article{LVJ10,
author = "Diane Larlus and Jakob Verbeek and Fr\'ed\'eric Jurie",
title = "Category level object segmentation by combining bag-of-words models with Dirichlet processes and random fields",
journal = "International Journal of Computer Vision",
number = "2",
volume = "88",
pages = "238--253",
month = "jun",
year = "2010",
keywords = "LEAR, LJK, CLASS",
url = "http://lear.inrialpes.fr/pubs/2010/LVJ10"
}