I. Akhter, Y. A. Sheikh, S. Khan, and T. Kanade, Nonrigid structure from motion in trajectory space, NIPS, 2008.

M. Arie-Nachimson and R. Basri, Constructing implicit 3D shape models for pose estimation, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459310

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.186.4297

H. Azizpour and I. Laptev, Object detection using strongly-supervised deformable part models, ECCV, 2012.
DOI : 10.1007/978-3-642-33718-5_60

URL : https://hal.archives-ouvertes.fr/hal-01063338

L. Bourdev, S. Maji, T. Brox, and J. Malik, Detecting People Using Mutually Consistent Poselet Activations, ECCV, 2010.
DOI : 10.1007/978-3-642-15567-3_13

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.178.1823

M. D. Breitenstein, F. Reichlin, B. Leibe, E. Koller-Meier, and L. Van Gool, Robust tracking-by-detection using a detector confidence particle filter, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459278

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.229.2429

X. Chen, R. Mottaghi, X. Liu, N. Cho, S. Lee et al., Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.254

URL : http://arxiv.org/abs/1406.2031

X. Chen and A. Yuille, Articulated pose estimation by a graphical model with image dependent pairwise relations, NIPS, 2014.

Y. Chen, L. Zhu, and A. Yuille, Active Mask Hierarchies for Object Detection, ECCV, 2010.
DOI : 10.1007/978-3-642-15555-0_4

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.400.9818

Y. Dai, H. Li, and M. He, A Simple Prior-Free Method for Non-rigid Structure-from-Motion Factorization, CVPR, 2012.
DOI : 10.1109/ICCV.2011.6126529

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.259.2690

N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005.
DOI : 10.1109/CVPR.2005.177

URL : https://hal.archives-ouvertes.fr/inria-00548512

M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, The Pascal Visual Object Classes (VOC) Challenge, International Journal of Computer Vision, vol.88, issue.2, pp.303-338, 2010.
DOI : 10.1007/s11263-009-0275-4

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.167.6629

P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, Object Detection with Discriminatively Trained Part-Based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.9, pp.1627-1645, 2010.
DOI : 10.1109/TPAMI.2009.167

P. F. Felzenszwalb and D. P. Huttenlocher, Pictorial Structures for Object Recognition, International Journal of Computer Vision, vol.61, issue.1, pp.55-79, 2005.
DOI : 10.1023/B:VISI.0000042934.15159.49

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.12.6365

S. Fidler, S. Dickinson, and R. Urtasun, 3D object detection and viewpoint estimation with a deformable 3D cuboid model, NIPS, 2012.

S. Fidler, R. Mottaghi, A. Yuille, and R. Urtasun, Bottom-Up Segmentation for Top-Down Detection, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.423

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.296.7948

M. Fischler and R. Elschlager, The Representation and Matching of Pictorial Structures, IEEE Transactions on Computers, vol.22, issue.1, pp.67-92, 1973.
DOI : 10.1109/T-C.1973.223602

R. Girshick, P. Felzenszwalb, and D. McAllester, Object detection with grammar models, NIPS, 2011.

R. B. Girshick, J. Donahue, T. Darrell, and J. Malik, Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.81

URL : http://arxiv.org/abs/1311.2524

M. Hejrati and D. Ramanan, Analyzing 3D objects in cluttered images, NIPS, 2012.

D. Hoiem, A. A. Efros, and M. Hebert, Closing the loop in scene interpretation, CVPR, 2008.

C. Ionescu, D. Papava et al., Human3.6M: Large scale datasets and predictive methods for 3D human sensing in natural environments, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.36, issue.7, pp.1325-1339, 2014.

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long et al., Caffe, Proceedings of the ACM International Conference on Multimedia, MM '14, 2014.
DOI : 10.1145/2647868.2654889

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, NIPS, 2012.

Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard et al., Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, vol.1, issue.4, 1989.
DOI : 10.1162/neco.1989.1.4.541

J. J. Lim, A. Khosla, and A. Torralba, FPM: Fine Pose Parts-Based Model with 3D CAD Models, ECCV, 2014.
DOI : 10.1007/978-3-319-10599-4_31

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.456.5236

Y. Ma, S. Soatto, J. Kosecka, and S. Sastry, An invitation to 3-d vision: from images to geometric models, 2004.
DOI : 10.1007/978-0-387-21779-6

O. M. Parkhi, A. Vedaldi, C. V. Jawahar, and A. Zisserman, The truth about cats and dogs, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126398

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.371.1670

Z. Ren, J. Yuan, J. Meng, and Z. Zhang, Robust Part-Based Hand Gesture Recognition Using Kinect Sensor, IEEE Transactions on Multimedia, vol.15, issue.5, pp.1110-1120, 2013.
DOI : 10.1109/TMM.2013.2246148

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.695.7017

Z. Ren, J. Yuan, and Z. Zhang, Robust hand gesture recognition based on finger-earth mover's distance with a commodity depth camera, Proceedings of the 19th ACM international conference on Multimedia, MM '11, 2011.
DOI : 10.1145/2072298.2071946

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.471.7242

P. Savalle, S. Tsogkas, G. Papandreou, and I. Kokkinos, Deformable part models with CNN features, 3rd Parts and Attributes Workshop, ECCV, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01109290

A. Shrivastava and A. Gupta, Building Part-Based Object Detectors via 3D Geometry, 2013 IEEE International Conference on Computer Vision, 2013.
DOI : 10.1109/ICCV.2013.219

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.644.5026

M. Sun and S. Savarese, Articulated part-based model for joint object detection and pose estimation, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126309

E. Trulls, S. Tsogkas, I. Kokkinos, A. Sanfeliu, and F. Moreno-Noguer, Segmentation-Aware Deformable Part Models, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.29

URL : https://hal.archives-ouvertes.fr/hal-01109286

V. Ferrari, M. Marin-Jimenez, and A. Zisserman, Progressive search space reduction for human pose estimation, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587468

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.321.2867

L. Wan, D. Eigen, and R. Fergus, End-to-end integration of a convolutional network, deformable parts model and non-maximum suppression, arXiv:1411.5309, 2014.

X. Wang, M. Yang, S. Zhu, and Y. Lin, Regionlets for generic object detection, ICCV, 2013.
DOI : 10.1109/tpami.2015.2389830

W. Yang, Y. Wang, and G. Mori, Recognizing human actions from still images with latent poses, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5539879

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.167.3890

Y. Yang and D. Ramanan, Articulated human detection with flexible mixtures of parts, TPAMI, pp.2878-2890, 2012.

Y. Zhu, R. Urtasun, R. Salakhutdinov, and S. Fidler, segDeepM: Exploiting segmentation and context in deep neural networks for object detection, CVPR, 2015.