BlitzNet: A Real-Time Deep Network for Scene Understanding

People

Nikita Dvornik, Konstantin Shmelkov, Julien Mairal, Cordelia Schmid

Abstract

Real-time scene understanding has become crucial in many applications such as autonomous driving. In this paper, we propose a deep architecture, called BlitzNet, that jointly performs object detection and semantic segmentation in one forward pass, allowing real-time computation. Besides the computational gain of having a single network perform several tasks, we show that object detection and semantic segmentation benefit from each other in terms of accuracy. Experimental results on the VOC and COCO datasets show state-of-the-art performance for object detection and segmentation among real-time systems.
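
To illustrate the idea of one forward pass serving both tasks, here is a minimal sketch in plain Python. It is not the BlitzNet implementation: the per-pixel linear map stands in for a deep backbone, and every name and size in it (`forward`, `W_SHARED`, `NUM_ANCHORS`, `FEAT_DIM`, the random weights) is a hypothetical placeholder chosen for illustration. The only point it makes is that the shared features are computed once and then reused by a detection head and a segmentation head.

```python
import random

random.seed(0)

NUM_CLASSES = 21  # assumption: 20 VOC object classes plus background
NUM_ANCHORS = 2   # hypothetical number of anchor boxes per location
FEAT_DIM = 8      # hypothetical width of the shared feature vector


def random_matrix(rows, cols):
    """Return a rows x cols matrix of random weights (stand-in for learned ones)."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def linear(vec, mat):
    """Multiply a vector of length rows by a rows x cols matrix."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


# Hypothetical weights: one shared layer and three small task-specific layers.
W_SHARED = random_matrix(3, FEAT_DIM)
W_CLS = random_matrix(FEAT_DIM, NUM_ANCHORS * NUM_CLASSES)
W_BOX = random_matrix(FEAT_DIM, NUM_ANCHORS * 4)
W_SEG = random_matrix(FEAT_DIM, NUM_CLASSES)


def forward(image):
    """Run one pass over an H x W image of [r, g, b] pixels.

    Returns per-location class scores and box offsets (detection head)
    and a per-pixel label map (segmentation head).
    """
    class_scores, box_offsets, seg_labels = [], [], []
    for row in image:
        cls_row, box_row, seg_row = [], [], []
        for pixel in row:
            # Shared features: computed once, then reused by both heads.
            feat = [max(x, 0.0) for x in linear(pixel, W_SHARED)]
            # Detection head: class scores and box offsets for each anchor.
            cls_row.append(linear(feat, W_CLS))
            box_row.append(linear(feat, W_BOX))
            # Segmentation head: index of the highest-scoring class.
            seg_scores = linear(feat, W_SEG)
            seg_row.append(seg_scores.index(max(seg_scores)))
        class_scores.append(cls_row)
        box_offsets.append(box_row)
        seg_labels.append(seg_row)
    return class_scores, box_offsets, seg_labels


if __name__ == "__main__":
    image = [[[random.random() for _ in range(3)] for _ in range(4)] for _ in range(3)]
    cls, box, seg = forward(image)
    print(len(cls), len(cls[0]), len(cls[0][0]), len(box[0][0]))
```

In this toy version the cost of the shared layer is paid once per pixel regardless of how many heads consume it, which is the computational gain the abstract refers to; the accuracy benefit the abstract reports comes from joint training and is not modeled here.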

Architecture

Paper

ICCV 2017 Paper

BibTeX

@inproceedings{dvornik17blitznet,
    AUTHOR = {Dvornik, Nikita and Shmelkov, Konstantin and Mairal, Julien and Schmid, Cordelia},
    TITLE = {{BlitzNet}: A Real-Time Deep Network for Scene Understanding},
    BOOKTITLE = {{IEEE International Conference on Computer Vision (ICCV)}},
    YEAR = {2017},
}

Code

Code is available on GitHub.

Acknowledgements

This work was supported by a grant from ANR (MACARON, ANR-14-CE23-0003-01) and by the ERC projects SOLARIS and ALLEGRO. We gratefully acknowledge a gift from Intel and the support of NVIDIA Corporation, which donated the GPUs used for this research.

Copyright Notice

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. This page style is taken from Guillaume Seguin.