
A monocular system for person tracking: Implementation and testing

Journal on Multimodal User Interfaces

Abstract

This paper presents a complete functional system capable of detecting people and tracking their motion in either a live camera feed or pre-recorded video sequences. The system consists of two main modules: detection and tracking. Automatic detection aims at locating human faces and is based on the fusion of color and feature-based information, which allows it to handle faces in different orientations and poses (frontal, profile, intermediate). To avoid false detections, a number of decision criteria are employed. Tracking is performed using a variant of the well-known Kanade-Lucas-Tomasi tracker, while occlusion is handled through a re-detection stage. Manual intervention is allowed to assist both modules if required. In manual mode, the system can track any object of interest, so long as it offers enough features to track. The system supports calibrated cameras and can provide 3-D coordinates of any tracked object of interest. It has been tested with very good results on a variety of video sequences, including a database of studio video sequences for which 3-D ground truth data, originating from a 4-camera infrared tracking system, exist.
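
The abstract notes that tracking requires "enough features to track" but does not spell out how a feature is judged trackable. As a minimal sketch, assuming the minimum-eigenvalue criterion commonly associated with Kanade-Lucas-Tomasi tracking (not necessarily the exact variant the authors use), a candidate point can be scored as follows. The function names, window size, and threshold are hypothetical and chosen only for illustration.

```python
import math


def min_eigenvalue(image, x, y, half_window=1):
    """Smaller eigenvalue of the 2x2 gradient matrix summed over a
    square window centred at pixel (x, y).

    `image` is a list of rows of grey levels. The window plus a
    one-pixel border for the gradients must lie inside the image.
    """
    gxx = gxy = gyy = 0.0
    for v in range(y - half_window, y + half_window + 1):
        for u in range(x - half_window, x + half_window + 1):
            # Central-difference image gradients.
            ix = (image[v][u + 1] - image[v][u - 1]) / 2.0
            iy = (image[v + 1][u] - image[v - 1][u]) / 2.0
            gxx += ix * ix
            gxy += ix * iy
            gyy += iy * iy
    # Closed-form smaller eigenvalue of [[gxx, gxy], [gxy, gyy]].
    half_trace = (gxx + gyy) / 2.0
    root = math.sqrt(((gxx - gyy) / 2.0) ** 2 + gxy * gxy)
    return half_trace - root


def is_trackable(image, x, y, threshold=1.0, half_window=1):
    """Hypothetical acceptance test: keep the point only if the
    smaller eigenvalue exceeds an illustrative threshold."""
    return min_eigenvalue(image, x, y, half_window) > threshold


def make_image(size, pixel):
    """Build a size x size test image from a pixel function (u, v)."""
    return [[pixel(u, v) for u in range(size)] for v in range(size)]


# Synthetic 7x7 examples: a corner has gradients in two directions,
# a straight edge in only one, and a flat patch in none.
corner = make_image(7, lambda u, v: 10 if u >= 3 and v >= 3 else 0)
edge = make_image(7, lambda u, v: 10 if u >= 3 else 0)
flat = make_image(7, lambda u, v: 5)
```

Under this criterion a corner scores high because the window contains intensity variation in two directions, whereas a straight edge or a uniform patch scores zero and would be rejected. That is one way to read the requirement that an object offer enough features before it can be tracked.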



Author information

Correspondence to Georgios N. Stamou.

About this article

Cite this article

Stamou, G.N., Krinidis, M., Nikolaidis, N. et al. A monocular system for person tracking: Implementation and testing. J Multimodal User Interfaces 1, 31–47 (2007). https://doi.org/10.1007/BF02910057

  • DOI: https://doi.org/10.1007/BF02910057
