Vision Interface Seminar, Fall 2001

 (6.892) (H) Computer Vision for Interface and Surveillance: Algorithms and Implications

News (as of 11/2/01)

The internet will soon have eyes -- computer vision systems that can detect, track and recognize people and other objects. These systems will enable new perceptual interfaces between man and machine, including smart videoconferencing, expressive avatars, and rooms that recognize users and their gestures. They will allow the widespread tracking of people in outdoor spaces, with clear implications for notions of community, public safety, and privacy.

This class will survey the algorithms and techniques involved in vision-based perception of people, and discuss the privacy, freedom and safety implications of this new technology. We will ask whether these goals must be mutually exclusive, and under what conditions this technology empowers or constrains the individual user.

Topics: Face Detection, Face Recognition, Appearance and Morphable models, Head Pose Estimation, Eye Gaze Tracking, Expression Recognition and FACS, Hand Tracking, Condensation (particle filter) Trackers, Gesture Recognition, Kinematic Pose Estimation, Dynamic Body Tracking, Outdoor Visual Surveillance, Indoor Tracking for Smart Environments, Activity Description and Detection, Biometric Security issues, Surveillance Privacy issues.

Qualifies as a subject in the Artificial Intelligence Engineering Concentration.

Prof. Trevor Darrell

Some previous coursework or research in Vision, Image Processing, or Graphics; Familiarity with Linear Algebra and basic Pattern Recognition methods; Permission of Instructor.

Class will meet Mondays and Wednesdays from 1-2:30pm, in 36-839.

The course web site (this page) is
There will be three problem sets in this course.

This course will be taught in seminar style.   Students will be expected to actively participate in discussions, and to have prepared a question or comment about the assigned readings for the week.  Students will also be expected to present one technical paper during the term.

A project, study, or survey paper related to one of the major course topics is required. A project will be expected to implement or extend a particular technique, a study will be expected to evaluate the effectiveness of a particular method for a user interface or recognition task, and a paper will be expected to provide a comprehensive review of a related area of literature. A proposal document should outline the scope of the project and is due in early November. A status presentation to the class is expected in late November, and a final report is due on the last day of classes. Collaborative projects are encouraged, but the division of work between group members must be explicitly documented in the proposal and final reports.

Grading Summary:

Problem sets:  40%
Paper Presentation and Class Participation: 30%
Project Proposal, Presentation, and Final Report:  30%

Text:  Gong, McKenna and Psarrou, Dynamic Vision, Imperial College Press. ( amazon )


Wednesday 9/5/01 -- Lecture: Introduction and Overview   [ class notes ]

Background Material: read Gong et al. Chapter 1 & 2 for introduction to face analysis; for reference/review of machine learning techniques see Chapter 3 & Appendix C.

Monday 9/10/01:  Face Detection:  Color and contour cues.  Active illumination methods.  [ class notes ]

Gong et al., Chapter 4.

Birchfield, Elliptical Head Tracking Using Intensity Gradients and Color Histograms, CVPR98(232-237) [ web site ]

Morimoto, Koons, Amir and Flickner, Pupil Detection and Tracking Using Multiple Light Sources. 1998 PUI Workshop.  [ web ]

Wednesday 9/12/01: Face Detection:  Neural Network and Statistical methods.  [ class notes ]
Gong et al., Chapter 5.

Rowley, Baluja, and Kanade, Neural Network-Based Face Detection, PAMI(20), No. 1, January 1998, pp. 23-38 [ abstract ] [ pdf ] [ local copy ]

Viola and Jones, Rapid Object Detection using a Boosted Cascade of Simple Features, To appear CVPR01. [  CRL Tech Report  local copy ]

Monday 9/17/01:  No class, Holiday.

Wednesday 9/19/01:  Face Recognition:  Eigenfaces.  Fisherfaces.  Elastic graph matching.  [ class notes ]

Gong et al. Chapter 8

Moghaddam B., Wahid W. and Pentland A., Beyond Eigenfaces: Probabilistic Matching for Face Recognition    International Conference on Automatic Face & Gesture Recognition, Nara, Japan, April 1998.    (TR #443)

Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J., Eigenfaces vs. Fisherfaces: Recognition Using Class-Specific Linear Projection, PAMI(19), No. 7, July 1997, pp. 711-720. [ web ] [ local ]

Laurenz Wiskott, Jean-Marc Fellous, Norbert Krüger, and Christoph von der Malsburg, Face Recognition by Elastic Bunch Graph Matching, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, July 1997 [ web ] [ local ]

Monday 9/24/01: Face Analysis/Synthesis: Morphable and active appearance models.   [ class notes ]
David Beymer and Tomaso Poggio, Image Representations for Visual Learning, Science, 1996 June 28; 272 (5250)

Jones and Poggio, Multidimensional Morphable Models, ICCV98(683-688) [ web ]

Cootes, Edwards, Taylor, Active Appearance Models, Trans. PAMI June 2001 (Vol. 23, No. 6),  [ web ] [ local ]

Wednesday 9/26/01:   Privacy Issues in Face Biometrics
Daniel G. Dupont, "SEEN BEFORE: To guard against terrorism, the Pentagon looks to image-recognition technology", Scientific American, December 1999;

Sharath Pankanti, Ruud M. Bolle, and Anil Jain, Biometrics: The Future of Identification,  IEEE Computer Special Issue on Biometrics. Vol. 33 No. 2; February 1999

P. Jonathon Phillips, Alvin Martin, C.L. Wilson, and Mark Przybocki, An Introduction to Evaluating Biometric Systems,  IEEE Computer Special Issue on Biometrics. Vol. 33 No. 2; February 1999
(See also Facial Recognition Vendor Test 2000, at

Washington Post:  2/1/01 Police Video Cameras Taped Football Fans

Your Face Is Not a Bar Code:   Arguments Against Automatic Face Recognition in Public Places,  Phil Agre  7 September 2001.

Also see Phil Agre, RRE notes and recommendation 14; excerpt which begins with "Privacy Chernobyl";  from

Bruce Schneier, Biometrics: Uses and abuses, Inside Risks 110 CACM 42, 8, August 1999;   from

Reuters: 9/18/01  PluggedIn: Interest in face scanning grows after attacks

Any link from

Problem Set 1 released online Friday 9/28/01.  Due by 5pm, Friday 10/5/01.   (Graded and returned on 10/10/01).

Monday 10/1/01:   Face Pose

Gong, Ch. 6, [ local ]

La Cascia, M., Sclaroff, S., and Athitsos, V., Fast, Reliable Head Tracking under Varying Illumination: An Approach Based on Robust Registration of Texture-Mapped 3D Models, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 22(4), April,  2000. [ web ] [ local ]

 Ström, J., Jebara, T., Basu, S., Pentland, A.P., Real Time Tracking and Modeling of Faces: An EKF-based Analysis by  Synthesis Approach, Proceedings of the Modelling People Workshop at ICCV'99  [ web ] [ local ]

Heinzmann, J., Zelinsky, A., Robust Real-Time Face Tracking and Gesture Recognition, IJCAI97(1525-1530)  [ web ] [ local ]

Wednesday 10/3/01: Facial Expressions
Essa, I., and A. Pentland. "Coding, Analysis, Interpretation and Recognition of Facial Expressions", IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 19 (7), IEEE Computer Society Press, July, 1997 [ web ] [ local ]

Donato, G.L., Bartlett, M.S., Hager, J.C., Ekman, P., and Sejnowski, T.J. (1999). Classifying Facial Actions. IEEE Transactions on Pattern Analysis and Machine Intelligence   21(10) p. 974-989. [ web ] [ local ] (presentation by Jennifer)

Yingli Tian, T. Kanade and J. F. Cohn, "Recognizing Action Units for Facial Expression Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 2, February, 2001. [ web ] [ local ] (presentation by Ashish)

Monday 10/8/01:  No class, Holiday.

Wednesday 10/10/01:  No class, ID/entity event.

Monday 10/15/01:   Hand Tracking.

Gong, Ch. 7-7.5 (pp. 125-136) [ local ]

Michael Isard and Andrew Blake, CONDENSATION -- conditional density propagation for visual tracking, Int. J. Computer Vision, 29, 1, 5--28, (1998) [ web ] [ local ]

Ying Wu, John Lin, and Thomas S. Huang, "Capturing Natural Hand Articulation", ICCV 01, 2001. [ web ] [ local ]

Romer Rosales, Vassilis Athitsos, L. Sigal, Stan Sclaroff, "3D Hand Pose Reconstruction Using Specialized Mappings", ICCV 01, 2001 [ web ] [ local ]

Wednesday 10/17/01:   Gesture Recognition
Christian Vogler and Dimitris Metaxas. Parallel Hidden Markov Models for American Sign Language Recognition. International Conference on Computer Vision, Kerkyra, Greece, September 22-25, 1999. [ web ] [ local ]  (presentation by Jay)

 A. Wilson and A. Bobick , Parametric Hidden Markov Models for Gesture Recognition,  IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 9, 1999 [ web ] [ local ]  (presentation by Amai)

 Vladimir I. Pavlovic, Rajeev Sharma, and Thomas S. Huang, "Visual interpretation of hand gestures for human-computer interaction: A review," IEEE Trans. on PAMI, July 1997. [ web ] [ local ]

Problem Set 2 released 10/19/01. Due by 5pm, Friday 10/26/01.   (Graded and returned on 10/31/01).
Monday 10/22/01: Articulated Bodies I
Plankers and Fua, Articulated Soft Objects for Video-based Body Modeling, ICCV 2001 [ web ] [ local ]  (presentation by Chuohao)

Delamarre and Faugeras, 3D Articulated Models and Multi-View Tracking with Silhouettes, ICCV 1999 [ web ] [ local ]  (presentation by Theresa)

Jojic, N., Turk, M., Huang, T., Tracking Self-Occluding Articulated Objects in Dense Disparity Maps, ICCV99(123-130) [ web ] [ local ]

Cham, T.J. and Rehg, J.M. A Multiple Hypothesis Approach to Figure Tracking, CVPR99(II:239-245) [ web ] [ local ]

Wednesday 10/24/01:  Articulated Bodies II

Bregler, C., Malik, J., Tracking People with Twists and Exponential Maps, CVPR98(8-15). [ web ]   [ web-ucb ]  [ local ] (presentation by Christian)
Yamamoto, M., Yagishita, K., Scene Constraints-Aided Tracking of Human Body, CVPR00 [ web ]  [ local ] (presentation by Christian)

Sidenbladh, H., Black, M., Fleet, D., Stochastic Tracking of 3D Human Figures using 2D Image Motion, ECCV00 [ web ] [ local ]

Monday 10/29/01:   Pedestrian detection and tracking
Wren, C., Azarbayejani, A., Darrell, T., and Pentland, A., "Pfinder: Real-Time Tracking of the Human Body", IEEE Transactions on Pattern Analysis and Machine Intelligence, July 1997.  [ web ]

I. Haritaoglu, D. Harwood, and L. Davis. W4: real-time surveillance of people and their activities, IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume: 22 Issue: 8, Aug. 2000  [ web ] [ local ]  [ web site ]  (presentation by Audrey)

L. Zhao and C. Thorpe, Stereo and Neural Network-based Pedestrian Detection, Proc. 1999 Int'l Conf. on Intelligent Transportation Systems, October, 1999, pp. 298-303.  [ web ]  (presentation by Sumita)


M. Oren, C. Papageorgiou, P. Sinha, E. Osuna, and T. Poggio. Pedestrian detection using wavelet templates. In Proc. Computer Vision and Pattern Recognition, pages 193--199, Puerto Rico, June 16--20 1997. [ web ]

Wednesday 10/31/01:   Vision-based vehicle tracking
Project proposal due by 5pm Friday 11/2/01
D. Beymer, P.F. McLauchlan, B. Coifman, and J. Malik. A real-time computer vision system for measuring traffic parameters. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, 1997. [ web ]   (presentation by Josh)

F. Dellaert, D. Pomerleau, and C. Thorpe. Model-based car tracking integrated with a road-follower. In Proceedings of IEEE Conference on Robotics and Automation (ICRA), Leuven, Belgium, 1998. [ web ]

J. van Leuven, M.B. van Leeuwen, F.C.A. Groen, Real-time Vehicle Tracking in Image Sequences, IEEE Instrumentation and Measurement Technology Conference, Budapest, May 21-23, 2001.  [ local ] (presentation by Tim)

Monday 11/5/01:  Activity I
James W. Davis and Aaron F. Bobick, The Recognition of Human Movement Using Temporal Templates, IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume: 23 Issue: 3, March 2001  [ web ]   [ local ]   [ web site ]

Stauffer, C., Grimson, W.E.L., Learning patterns of activity using real-time tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume: 22 Issue: 8, Aug. 2000 [ web ]

N. Johnson and D. C. Hogg. Learning the distribution of object trajectories for event recognition. Image and Vision Computing, 14:609--615, 1996.  [ web ]


N Johnson, A Galata, and D C Hogg. The acquisition and use of interaction behaviour models. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition - CVPR'98, pages 866--871. IEEE Computer Society Press., 1998.  [ web ]

Wednesday 11/7/01:  Activity II
Brand, M. and Kettnaker, V., Discovery and Segmentation of Activities in Video. IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume: 22 Issue: 8, Aug. 2000 [ web ]  (presentation by Jocelyn)

Y. A. Ivanov and A. F. Bobick. Recognition of visual activities and interactions by stochastic parsing. In IEEE Trans. on Pattern Analysis and Machine Intelligence, volume 22(8), pages 852--872, 2000. [ local ]  [ web ]

Nuria Oliver, Barbara Rosario and Alex Pentland.  A Bayesian Computer Vision System for Modeling Human Interactions,  In IEEE Trans. on Pattern Analysis and Machine Intelligence, volume 22(8) 2000.  [ local ] [ web ]

Problem Set 3 released. Due by 5pm, Friday 11/16/01.

Monday 11/12/01  No Class, Holiday

Wednesday 11/14/01: No Class, UIST/PUI Conference.

Monday 11/19/01:   Perceptive Interfaces and Context

Gong Ch. 12

A. Pentland, Looking at People: Sensing for Ubiquitous and Wearable Computing, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 1, January 2000 [ local ] [ web ]

Salber, D., Dey, A. K., Abowd, G. D.: The Context Toolkit: Aiding the Development of Context-Enabled Applications. In: the Proceedings of the 1999 Conference on Human  Factors in Computing Systems (CHI '99), Pittsburgh, PA, May 15--20, (1999), 434-441  [ web ]

 Thad Starner, Bernt Schiele, and Alex Pentland,  Visual contextual awareness in wearable computing. In Second International Symposium on Wearable Computers, Oct 1998. [ web ]

Wednesday 11/21/01:  "Smart spaces"

"The Aware Home: A Living Laboratory for Ubiquitous Computing Research", Kidd et al., in the Proceedings of the Second International Workshop on Cooperative Buildings  [ web ]; also see the GaTech Aware Home web site (presentation by Kristina)

John Krumm et al., "Multi-Camera Multi-Person Tracking for EasyLiving", IEEE Workshop on Visual Surveillance, July 2000. (presentation by Kristin)  [web]

Bobick, A., Intille, S., Davis, J., Baird, F., Pinhanez, C., Campbell, L., Ivanov, Y., Schutte, A., & Wilson, A. (1996). The KidsRoom: A Perceptually-Based Interactive and Immersive Story Environment. Technical Report 398, M.I.T. Media Laboratory Perceptual Computing Section. [web]

Sumit Basu, Tanzeem Choudhury, Brian Clarkson, and Alex Pentland, Analyzing Human Interactions in the Facilitator Room. IEEE International Workshop on Cues in Communication. In conjunction with Computer Vision and Pattern Recognition 2001 [ web ]

Monday 11/26/01:  Project proposal presentations

Wednesday 11/28/01 & Monday 12/3/01:  Privacy Issues in Visual Surveillance

Tampa Scans the Faces in Its Crowds for Criminals, New York Times, July 4, 2001 [ local ]

'Digitize this' protesters tell surveillance cameras, USA Today, July 16, 2001  [ local text-only ]

Living Under an Electronic Eye, New York Times,  September 27, 2001 [ local ]
Jeffrey Rosen,  A Watchful State, New York Times Sunday Magazine, October 7, 2001 [ local ]

 ACLU Challenges Face Scanning at California Airport, Reuters, Tuesday November 20 2001 [ local ]

Nick Taylor, Closed Circuit Television: The British Experience, 1999 STAN. TECH. L. REV. VS 11 [ local ] (from [ web ])

Privacy International -- Video Surveillance 

Froomkin, The Death of Privacy,  (pp 1469-1479, 1501-1510, 1536-1543).  [ local ]

David Brin, The Transparent Society (excerpts provided); see also online reviews in Metro 2/6/97 [ web ], and hotwired 97/22 [ web ]

New York Surveillance Camera Players: Time in the Shadows of Anonymity: Fighting Against Surveillance Cameras, Transparency, and Global Capitalism

Wednesday 12/5/01 - Wednesday 12/12/01:  Project period.  No Class or readings. (NIPS/CVPR Conference.)
Project Final Report Due 5pm 12/12/01.