ICCV 2003 Short Course

Dense Multiview Stereo

Instructors: Steve Seitz (University of Washington), Richard Szeliski (Microsoft Research) and Ramin Zabih (Cornell University).

Duration: 3.5 hours

Course Content

While the classical vision problem of two-camera stereo has been studied for decades, the last few years have seen a surge of interest in dense shape reconstruction from multiple views. The course will begin with an introduction to stereo matching, setting up the problem as one of finding correspondences among multiple images and constructing a 3D model. After a short discussion of rectification, we will give a taxonomy that divides dense stereo algorithms into local and global methods. We will then introduce issues pertaining to baseline (the tradeoff between accuracy and search) to motivate multi-baseline stereo. This naturally leads to the issue of visibility, which we analyze within a volume-based framework. We will cover the concepts of photo-consistency and visual hulls, and several multiview algorithms, including voxel coloring, space carving, and level sets. Finally, we will discuss in some detail a recent group of algorithms, based on graph cuts, that allow the incorporation of spatial smoothness into dense multiview stereo.

Biographies

Steve Seitz is an Associate Professor in the Department of Computer Science and Engineering at the University of Washington. He received his B.A. in computer science and mathematics at the University of California, Berkeley in 1991 and his Ph.D. in computer sciences at the University of Wisconsin, Madison in 1997. Following his doctoral work, he spent one year visiting the Vision Technology Group at Microsoft Research, and subsequently two years as an Assistant Professor in the Robotics Institute at Carnegie Mellon University. He joined the faculty at the University of Washington in July 2000. He was twice awarded the David Marr Prize for the best paper at the International Conference on Computer Vision, and has received an NSF CAREER Award, an ONR Young Investigator Award, and an Alfred P. Sloan Fellowship.

Richard Szeliski is a Senior Researcher in the Interactive Visual Media Group at Microsoft Research, where he is pursuing research in 3-D computer vision, video scene analysis, and image-based rendering. He received a Ph.D. in Computer Science from Carnegie Mellon University, Pittsburgh, in 1988. Dr. Szeliski has published over 100 research papers in computer vision, computer graphics, medical imaging, neural nets, and parallel numerical algorithms, as well as the book Bayesian Modeling of Uncertainty in Low-Level Vision. He is on the editorial board of the International Journal of Computer Vision, and has served as Program Chair for ICCV'2001, organizer of the ICCV'99 Workshop on Vision Algorithms, and Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence.

Ramin Zabih is an Associate Professor in the Computer Science Department at Cornell University. He received undergraduate degrees in computer science and in mathematics, and a master's in computer science, at MIT, followed by a Ph.D. in computer science from Stanford in 1994. His research has focused on the use of graph algorithms to solve problems in low-level vision. He organized the ICCV'99 Workshop on Graph Algorithms and Computer Vision, and co-edited a special issue of the IEEE Transactions on Pattern Analysis and Machine Intelligence on this topic. He shared the best paper award at the European Conference on Computer Vision in 2002 for a pair of papers describing the use of graph algorithms to compute dense multiview stereo.