LCR-Net: Real-time multi-person 2D and 3D human pose estimation
Grégory Rogez Philippe Weinzaepfel Cordelia Schmid
CVPR 2017 -- IEEE Trans. on PAMI 2019
Abstract
We propose an end-to-end architecture for real-time 2D and 3D human pose estimation in natural images. Key to our approach is the generation and scoring of a number of pose proposals per image, which allows us to predict the 2D and 3D poses of multiple people simultaneously. Hence, our approach does not require an approximate localization of the humans for initialization. Our architecture, named LCR-Net, contains 3 main components: 1) the pose proposal generator that suggests potential poses at different locations in the image; 2) a classifier that scores the different pose proposals; and 3) a regressor that refines pose proposals both in 2D and 3D. All three stages share the convolutional feature layers and are trained jointly. The final pose estimation is obtained by integrating over neighboring pose hypotheses, which is shown to improve over a standard non-maximum suppression algorithm. Our approach significantly outperforms the state of the art in 3D pose estimation on Human3.6M, a controlled environment. Moreover, it shows promising results on real images for both single and multi-person subsets of the MPII 2D pose benchmark.
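The "integration over neighboring pose hypotheses" mentioned in the abstract can be pictured as a score-weighted average of overlapping proposals rather than a hard winner-take-all selection. The function below is an illustrative simplification of that idea, not the released pose proposal integration code; names and the `ratio` threshold are our own.

```python
import numpy as np

def integrate_proposals(poses, scores, ratio=0.5):
    """Score-weighted average of pose hypotheses.

    poses:  (N, J, D) array of N candidate poses, J joints, D dims (2 or 3).
    scores: (N,) classification scores of the proposals.
    Hypotheses scoring at least ratio * max score are averaged, weighted
    by their scores, instead of keeping only the top one as NMS would.
    """
    keep = scores >= ratio * scores.max()
    w = scores[keep] / scores[keep].sum()
    return (poses[keep] * w[:, None, None]).sum(axis=0)

# Two equally-scored hypotheses for a single joint: the integrated
# pose lies halfway between them.
fused = integrate_proposals(
    np.array([[[0.0, 0.0, 0.0]], [[2.0, 2.0, 2.0]]]),
    np.array([1.0, 1.0]))
```

Averaging neighboring hypotheses smooths out the quantization introduced by classifying into a fixed set of anchor poses, which is why it can beat plain non-maximum suppression.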
Real-time Multi-person Human Pose Estimation (2D+3D) Demo @ CVPR'18 (v1.0) and ECCV'18 (v2.0)
History
- September 2018: release v2.0
- pytorch code with ResNet backbone
- August 2018: release v1.1
- bug fix in ppi
- show the 3d scene instead of individual 3d poses
- March 2018: new models available (trained with additional synth. data, details in arXiv'18 paper)
- July 2017: release v1.0
Pytorch code (v2.x) with ResNet backbone
This release is for scientific or personal use only. It includes code for testing existing models.
For commercial use and licensing of the training pipeline, contact us at:
stip-gra@inria.fr
Installation
Usage
To test a given model:
python demo.py <modelname> <imagename> <gpuid>
with:
- modelname: name of the model (see list of available models below)
- imagename: name of the image to test
- gpuid: id of the gpu to use (-1 for CPU, default value)
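The argument handling described above could be sketched as follows; this is a hypothetical helper for illustration, and the released demo.py may parse its arguments differently:

```python
import sys

def parse_args(argv):
    """Parse the demo arguments: <modelname> <imagename> [gpuid].
    gpuid defaults to -1 (CPU) when omitted."""
    if len(argv) < 3:
        raise SystemExit("usage: python demo.py <modelname> <imagename> [gpuid]")
    modelname, imagename = argv[1], argv[2]
    gpuid = int(argv[3]) if len(argv) > 3 else -1
    return modelname, imagename, gpuid
```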
The available models are:
- DEMO_ECCV18: model with fast inference time (downscale image, reduce number of classes) that we use for our ECCV'18 human pose estimation demo
- Human3.6-17J-ResNet50: model trained (and evaluated) on Human3.6M dataset to estimate 17 joints
- InTheWild-ResNet50: model trained on real-world (and synthetic) images evaluated on MPII dataset
Example to test our model trained on Human 3.6M:
python demo.py Human3.6-17J-ResNet50 Directions1_S11_C1_1.jpg 0
Example to test our model for in-the-wild pose detection:
python demo.py InTheWild-ResNet50 058017637.jpg 0
Problem loading the pickle model
You may get errors when loading the pickle files containing the CNN weights, anchor poses and parameters. If this is the case, please replace L120-L131 of demo.py with the following code (it assumes os, pickle and torch are imported at the top of demo.py):
    model = {}
    for suffix in ['_model.pth.tgz', '_ppi_params.pkl', '_anchor_poses.pkl', '_cfg.pkl']:
        fname = os.path.join(os.path.dirname(__file__), 'models', modelname + suffix)
        if not os.path.isfile(fname):
            # Download the missing file into the models directory
            dirname = os.path.dirname(fname)
            if not os.path.isdir(dirname):
                os.system('mkdir -p "{:s}"'.format(dirname))
            os.system('wget http://pascal.inrialpes.fr/data2/grogez/LCR-Net/pthmodels/{:s} -P {:s}'.format(modelname + suffix, dirname))
            if not os.path.isfile(fname):
                raise Exception("ERROR: download incomplete")
        if fname.endswith('pkl'):
            # e.g. '_ppi_params.pkl' -> key 'ppi_params'
            with open(fname, 'rb') as fid:
                model[suffix[1:-4]] = pickle.load(fid)
        else:
            model['model'] = torch.load(fname)
Citation
If you use our new models, please cite our PAMI paper:
@article{RogezWS18,
TITLE = {{LCR-Net++: Multi-person 2D and 3D Pose Detection in Natural Images}},
AUTHOR = {Rogez, Gr\'egory and Weinzaepfel, Philippe and Schmid, Cordelia},
JOURNAL = {{IEEE Transactions on Pattern Analysis and Machine Intelligence}},
YEAR = {2019},
PUBLISHER = {IEEE},
}
Caffe code (v1.x) with VGG backbone
Please note that our code is released only for scientific or personal use.
We only provide code for testing our models, not for training.
Installation
- Download and install py-faster-rcnn (we do not provide support for its installation).
- Download and unzip LCR-Net code (v1.1)
- Create a symbolic link in the LCR-Net folder to py-faster-rcnn
Usage
To test our model trained on Human 3.6M:
python demo.py h36m_100_p2 Directions1_S11_C1_1.jpg 0
To test our model for in-the-wild pose detection:
python demo.py mix_200x2 058017637.jpg 0
The arguments are:
- modelname among the available ones:
- h36m_100_p2: model trained on the Human3.6M dataset using the second protocol, as described in our paper
- mix_200x2: model that we use on MPII, trained on MPII, LSP, LSP-extended, Coco and Human3.6M
- image filename
- GPU id to use (-1 for CPU, default value)
Citation
If you use our code, please cite our CVPR'17 paper:
@inproceedings{rogez:hal-01505085,
TITLE = {{LCR-Net: Localization-Classification-Regression for Human Pose}},
AUTHOR = {Rogez, Gregory and Weinzaepfel, Philippe and Schmid, Cordelia},
BOOKTITLE = {{CVPR}},
ADDRESS = {Honolulu, United States},
YEAR = {2017},
MONTH = July,
}