(Professors S. Shankar Sastry and Michael I. Jordan)

We are considering the problem of evaluating the performance of Kalman filtering techniques on a sensor network. Communication delay and data loss cannot be neglected in this scenario. Starting from the discrete Kalman filtering formulation and making use of graphical models, we have made a first attempt to generalize the Kalman filter to account for a model of the communication across the sensor network.
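A minimal sketch of the idea, for a scalar system with i.i.d. Bernoulli packet arrivals (the model, numbers, and arrival process here are illustrative, not the project's formulation): the filter always performs its time update, but performs the measurement update only when a packet arrives.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scalar linear system x' = a*x + w, measurement y = x + v.
# Each measurement reaches the filter only with probability p
# (an i.i.d. Bernoulli model of packet loss on the network).
a, q, r, p = 0.95, 0.1, 0.5, 0.7

x, xhat, P = 0.0, 0.0, 1.0
errs = []
for _ in range(500):
    x = a * x + rng.normal(0.0, np.sqrt(q))   # true state evolution
    y = x + rng.normal(0.0, np.sqrt(r))       # sensor measurement
    arrived = rng.random() < p                # did the packet get through?

    xhat = a * xhat                           # time update (always runs)
    P = a * P * a + q

    if arrived:                               # measurement update only on arrival
        K = P / (P + r)
        xhat = xhat + K * (y - xhat)
        P = (1.0 - K) * P
    errs.append((x - xhat) ** 2)

print(np.mean(errs))  # empirical mean squared estimation error
```

The error covariance P now depends on the random arrival sequence, which is exactly what makes the analysis of such filters nontrivial.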

(Professor S. Shankar Sastry)

Our research is part of an effort within the Mixed Initiative Control of Automa-teams (MICA) program. Our team develops agents that can deal with what we term "intelligent" adversaries. The focus of my research is designing missions for unmanned aerial vehicles (UAVs). In particular, we are given that some enemy activity may take place in a specific region at a specific time in the future. My task is to provide algorithms that guide the helicopters to fly over that region in such a manner as to suppress the enemy activity. These flight patterns vary greatly depending on the nature of the expected enemy activity. For example, some missions require that the helicopters observe every part of a specific region as often as possible. Other missions require the UAVs to move in such a way as to effectively "corner" a moving target. To design such algorithms successfully, we draw upon a wide range of resources, including game theory, network flow, and statistical learning theory.

(Professor S. Shankar Sastry)

(ONR) N00014-97-1-0946

The aim of this project is to build a navigation framework for an unmanned aerial vehicle and to apply the result to the unmanned helicopter within the BEAR project. Navigation schemes that rely on perfect knowledge of the environment are inherently unreliable, because that assumption is unrealistic in practice. We divide the navigation problem into two tightly coupled subproblems: sensing and planning.

Sensing the environment is an essential prerequisite for any navigation problem. We will combine computer vision with GPS/INS sensors in order to acquire confident knowledge of the environment and of our motion within it. Sensor redundancy will also provide robustness with respect to fault detection and handling.

Motion planning, i.e., the capability of deciding what motion to execute in order to reach a goal, is fundamental to the design and realization of autonomous robots. A good motion planning algorithm must ensure both safety, by avoiding collisions and planning feasible paths, and efficiency (minimum travel time, etc.).
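As a toy illustration of the planning subproblem (not the project's planner), a breadth-first search on an occupancy grid returns a shortest collision-free path when one exists:

```python
from collections import deque

def shortest_path(grid, start, goal):
    """BFS shortest path on a 4-connected grid; 1 = obstacle, 0 = free."""
    rows, cols = len(grid), len(grid[0])
    prev = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:       # walk the predecessor chain back
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
               and grid[nr][nc] == 0 and (nr, nc) not in prev:
                prev[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # no collision-free path exists

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = shortest_path(grid, (0, 0), (2, 0))
print(path)  # 7 cells, detouring around the obstacle row
```

Safety here means the path never enters an obstacle cell; efficiency means the path is shortest in the number of moves.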

- [1] B. Sinopoli, M. Micheli, G. Donato, and T. K. Koo, "Vision-based Navigation for an Unmanned Aerial Vehicle," *Int. Conf. Robotics and Automation,* Seoul, Korea, May 2001.

(Professor S. Shankar Sastry)

(NSF) ITR EIA-0204641

We study non-local two-qubit operations from a geometric perspective. By applying a Cartan decomposition to su(4), we find that the geometric structure of non-local gates is a 3-torus. We derive the invariants for local transformations, and connect these local invariants to the coordinates of the 3-torus. Since different points on the 3-torus may correspond to the same local equivalence class, we use Weyl group theory to reduce the symmetry. We show that the local equivalence classes of two-qubit gates are in one-to-one correspondence with the points in a tetrahedron, except on the base. We then study the properties of perfect entanglers, that is, the two-qubit operations that can generate maximally entangled states from some initially separable states. We provide criteria to determine whether a given two-qubit gate is a perfect entangler and establish a geometric description of perfect entanglers by making use of the tetrahedral representation of non-local gates. We find that exactly half the non-local gates are perfect entanglers. We also investigate the non-local operations generated by a given Hamiltonian. We first study the gates that can be directly generated by a Hamiltonian. Then we explicitly construct a quantum circuit that contains at most three non-local gates generated by a two-body interaction Hamiltonian, together with at most four local gates generated by single-qubit terms. We prove that such a quantum circuit can simulate an arbitrary two-qubit gate exactly, and hence it provides an efficient implementation of universal quantum computation and simulation.
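For example, the local invariants of a two-qubit gate can be computed numerically in the magic basis; this sketch follows the invariant definitions of Makhlin [1] (conventions in parts of the literature differ by phases):

```python
import numpy as np

# Magic (Bell) basis transformation
Q = np.array([[1, 0, 0, 1j],
              [0, 1j, 1, 0],
              [0, 1j, -1, 0],
              [1, 0, 0, -1j]]) / np.sqrt(2)

def local_invariants(U):
    """Makhlin invariants (G1, G2) of a two-qubit gate U in U(4)."""
    M = Q.conj().T @ U @ Q          # gate expressed in the magic basis
    m = M.T @ M
    detU = np.linalg.det(U)
    G1 = np.trace(m) ** 2 / (16 * detU)
    G2 = (np.trace(m) ** 2 - np.trace(m @ m)) / (4 * detU)
    return G1, G2

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
print(local_invariants(CNOT))       # ≈ (0, 1) for CNOT
print(local_invariants(np.eye(4)))  # ≈ (1, 3) for the identity
```

The pair (G1, G2) is constant on each local equivalence class, so two gates with different invariants cannot be related by single-qubit operations.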

- [1] Y. Makhlin, "Nonlocal Properties of Two-Qubit Gates and Mixed States and Optimization of Quantum Computations," *e-print quant-ph/0002045,* 2000.
- [2] N. Khaneja, R. Brockett, and S. J. Glaser, "Time Optimal Control in Spin Systems," *Physical Review A,* Vol. 63, 2001.
- [3] B. Kraus and J. I. Cirac, "Optimal Creation of Entanglement Using a Two-Qubit Gate," *Physical Review A,* Vol. 63, 2001.

(Professor S. Shankar Sastry)

DARPA, (NSF) KDI ECS 9873474, and (ONR) MURI N00014-98-1-0671

Unmanned air vehicles (UAVs) have been a very active area of research. Despite recent remarkable achievements with fixed- and rotary-wing aircraft [1], their use in many tasks is still limited by their maneuverability and size. The extraordinary flight capabilities of insects have inspired the design of small UAVs with flapping wings that mimic real flying insects. Their unmatched maneuverability, low fabrication cost, and small size make them very attractive for cost-critical missions in environments that are impenetrable for larger UAVs. In order to fabricate a micromechanical flying insect (MFI) eventually capable of stable flight, it is necessary to design and develop sensors capable of estimating the orientation of the insect. We are currently investigating the use of ocelli, a set of biologically inspired sensors capable of estimating orientation. The ocelli are four wide-angle photoreceptors placed on the head of the insect. They are oriented in such a way that they collect light from different regions of the sky (see Figure 1). The light intensity of the sky is a function of latitude only; therefore, in principle, it is possible to estimate the orientation of the insect by suitably combining the outputs of the four photoreceptors. Our goal is to use this information to design stabilizing controllers for hovering and maneuvering flight modes of an MFI.

Figure 1: Ocelli modeling: every photoreceptor collects light intensity from a specific area of the sky.
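A toy version of the ocelli idea, under a simplified sky model in which intensity is linear in the elevation of the viewing direction; the recovery is exact only for single-axis rotations, and all angles are illustrative:

```python
import numpy as np

alpha = np.radians(30)   # tilt of each photoreceptor away from the body z-axis

# Viewing directions of the four ocelli in the body frame: +x, -x, +y, -y
D = np.array([[ np.sin(alpha), 0.0, np.cos(alpha)],
              [-np.sin(alpha), 0.0, np.cos(alpha)],
              [0.0,  np.sin(alpha), np.cos(alpha)],
              [0.0, -np.sin(alpha), np.cos(alpha)]])

def readings(pitch):
    """Photoreceptor outputs under a pure pitch rotation, with a toy sky
    model in which intensity equals the elevation (z-component) of the
    viewing direction in the world frame."""
    Ry = np.array([[ np.cos(pitch), 0.0, np.sin(pitch)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(pitch), 0.0, np.cos(pitch)]])
    return np.array([(Ry @ d)[2] for d in D])

def estimate_pitch(y):
    # The fore/aft receptor pair isolates pitch:
    # y[1] - y[0] = cos(alpha - pitch) - cos(alpha + pitch)
    #             = 2 sin(alpha) sin(pitch)
    return np.arcsin((y[1] - y[0]) / (2.0 * np.sin(alpha)))

print(estimate_pitch(readings(0.2)))  # → 0.2 (exact for a pure pitch rotation)
```

The ±y pair gives roll by the same construction; for combined roll and pitch the pairwise differences are only approximate, which is one reason the estimation problem is interesting.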

- [1] D. H. Shim, H. J. Kim, H. Chung, and S. Sastry, "Multi-Functional Autopilot Design and Experiments for Rotorcraft-based Unmanned Aerial Vehicles," *Digital Avionics Systems Conf.,* Daytona Beach, FL, October 2001.

(Professor S. Shankar Sastry)

(DARPA) BAA 01-06

In the past few years we have been witnessing the explosion of the Internet and of wireless networking. Computation and communication are becoming ubiquitous and affordable at very large scale, and an enormous amount of information is readily available to assist any decision process. Centralized algorithms for processing this information are doomed to fail because of their intrinsic lack of scalability. In this research we propose to develop a framework for the design of distributed algorithms for large wireless sensor networks and to evaluate their performance with respect to predefined metrics. In particular, we propose to integrate a wireless sensor network into the pursuit-evasion games (PEGs) framework [1]. In this framework, a group of unmanned pursuers (ground robots and helicopters) monitors an unknown terrain searching for evaders. The pursuers cooperate via radio communication to catch the moving evaders. We propose to extend this framework by deploying, in the environment, a large-scale wireless sensor network consisting of about one thousand nodes. Every node in this network is equipped with sensors capable of detecting an object moving nearby, and is subject to tight power, bandwidth, and communication constraints. Any centralized algorithm to estimate the position and track the motion of the evaders would scale poorly in this scenario. We therefore propose alternative algorithms in which computation is distributed among the nodes and the results are then transmitted to the pursuers.
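One standard distributed primitive in this setting is average consensus: each node repeatedly averages its value with its immediate neighbors, so a global quantity (say, a coordinate of a detected evader averaged over the detecting nodes) emerges without any central processor. A minimal sketch on a line network (topology, weights, and iteration count are illustrative):

```python
import numpy as np

n = 10
# Local measurements, e.g., each node's noisy estimate of an evader coordinate
x = np.arange(n, dtype=float)
target = x.mean()

# Symmetric, doubly stochastic weight matrix for a line graph: each node
# exchanges values with its immediate neighbors only (purely local traffic).
W = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i + 1):
        if 0 <= j < n:
            W[i, j] = 1.0 / 3.0
    W[i, i] = 1.0 - W[i].sum()

for _ in range(500):
    x = W @ x  # one round of neighbor-to-neighbor averaging

print(x)  # every entry ≈ target, the global average
```

Because W is symmetric and stochastic with a connected graph, the iteration converges to the average at every node; only one-hop messages are ever exchanged, which suits the tight power and bandwidth budgets of the nodes.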

- [1] H. J. Kim, R. Vidal, D. H. Shim, O. Shakernia, and S. Sastry, "A Hierarchical Approach to Probabilistic Pursuit-Evasion Games with Unmanned Ground and Aerial Vehicles," *IEEE Conf. Decision and Control,* Orlando, FL, December 2001.

(Professor S. Shankar Sastry)

(ARO MURI) DAAH04-96-1-0341, (DARPA) F33615-98-C-3614, and (ONR MURI) N00014-98-1-0671

The goal of this work is the development of a software simulator (VIFS) for the flight of a micromechanical flying insect (MFI) inside a virtual environment. The final version of the simulator should eventually include several units to make it realistic: a model for the aerodynamics, a model for the dynamics, a model for the electromechanical architecture, a model for the sensors, and a set of flight control and path planning algorithms. We would like to add a graphical interface that would provide a visualization of the behavior of the MFI in the environment.

There are several reasons why it is useful to have a software simulator: it can (1) simulate insect body dynamics; (2) give a faster simulation of aerodynamics than scaled insect models; (3) be easily integrated with any electromechanical and sensory system models; (4) provide insights to direct mechanical design, thus speeding up the fabrication of the real MFI; and (5) pre-train learning flight controllers, thus reducing the risk of damaging the real MFI.

The VIFS is decomposed into several units, each responsible for an independent task, as shown in Figure 1.

Figure 1: Virtual insect flight simulator (VIFS) architecture

- [1] L. Schenato, X. Deng, W. Chung Wu, and S. Sastry, "Virtual Insect Flight Simulator (VIFS): A Software Testbed for Insect Flight," *IEEE Int. Conf. Robotics and Automation,* Seoul, Korea, May 2001.

(Professor S. Shankar Sastry)

(ONR) N00014-00-1-0621 and (ONR) N00014-97-1-0946

Computer vision is gaining importance as a cheap, passive, and information-rich source complementing the sensor suite for control of unmanned aerial vehicles (UAVs). A vision system on board a UAV typically augments a suite that might include a Global Positioning System (GPS), inertial navigation sensors (INS), laser range finders, a digital compass, and sonar. Because of its structured nature, the task of autonomous landing is well suited to vision-based state estimation and control, and has recently been an active topic of research.

In our research [1], we have designed and implemented a real-time vision system for a rotorcraft unmanned aerial vehicle to estimate its state relative to a known landing target at 30 Hz. Our vision system consists of off-the-shelf hardware to perform image processing, segmentation, feature point extraction, camera control, and both linear and nonlinear optimization for model-based state estimation. Flight test results show our vision-based state estimates are accurate to within 5 cm in each axis of translation and 5 degrees in each axis of rotation, making it a viable sensor to be placed in the control loop of a hierarchical flight vehicle management system.
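As a simplified illustration of model-based state estimation from a known planar landing target, the textbook homography route (not the exact pipeline of [1]) recovers camera pose from four or more coplanar point correspondences:

```python
import numpy as np

def homography(pts_world, pts_img):
    """DLT estimate of the homography mapping planar world points (z = 0) to pixels."""
    A = []
    for (X, Y), (u, v) in zip(pts_world, pts_img):
        A.append([X, Y, 1, 0, 0, 0, -u * X, -u * Y, -u])
        A.append([0, 0, 0, X, Y, 1, -v * X, -v * Y, -v])
    _, _, Vt = np.linalg.svd(np.array(A))
    return Vt[-1].reshape(3, 3)   # null vector of the stacked constraints

def pose_from_homography(H, K):
    """Rotation and translation of a calibrated camera viewing the plane z = 0."""
    B = np.linalg.inv(K) @ H
    B = B / np.linalg.norm(B[:, 0])   # fix the unknown homography scale
    if B[2, 2] < 0:                   # target must lie in front of the camera
        B = -B
    r1, r2, t = B[:, 0], B[:, 1], B[:, 2]
    R = np.column_stack([r1, r2, np.cross(r1, r2)])
    return R, t

# Synthetic check: a square landing target on the plane z = 0, known camera pose
K = np.diag([500.0, 500.0, 1.0])          # illustrative intrinsics
R_true, t_true = np.eye(3), np.array([0.2, -0.1, 2.0])
target = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5), (0.0, 0.2)]
pixels = []
for X, Y in target:
    px = K @ (R_true @ np.array([X, Y, 0.0]) + t_true)
    pixels.append((px[0] / px[2], px[1] / px[2]))

R, t = pose_from_homography(homography(target, pixels), K)
print(t)  # ≈ [0.2, -0.1, 2.0]
```

With noisy feature points this linear estimate is typically refined by nonlinear optimization, mirroring the linear-plus-nonlinear structure described above.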

Recently [2], we developed a multiple view algorithm for vision-based landing of an unmanned aerial vehicle. Our algorithm is based on our recent results in multiple view geometry that exploit the rank deficiency of the so-called multiple view matrix. We show how the use of multiple views significantly improves motion and structure estimation. We compare our algorithm to our previous linear and nonlinear two-view algorithms using an actual flight test.

Furthermore, we have conducted vision-in-the-loop flight experiments and used the above vision system to successfully track a pitching landing deck that simulates the motion of a ship in 2-meter waves and 5-knot winds.

Figure 1: Berkeley UAV test-bed with on-board vision system: Yamaha R-50 helicopter with pan/tilt camera and computer box hovering above a pitching landing platform

Figure 2: Berkeley UAV test-bed with on-board vision system: Yamaha R-50 helicopter tracking a pitching landing platform, foreground shows the camera view of landing target

Figure 3: Vision monitoring station

- [1] C. S. Sharp, O. Shakernia, and S. S. Sastry, "A Vision System for Landing an Unmanned Aerial Vehicle," *IEEE Int. Conf. Robotics and Automation,* Seoul, Korea, May 2001.
- [2] O. Shakernia, C. S. Sharp, R. Vidal, Y. Ma, and S. S. Sastry, "A Vision System for Landing an Unmanned Aerial Vehicle," *IEEE Int. Conf. Robotics and Automation,* Washington, DC, May 2002.

(Professor S. Shankar Sastry)

(ONR) N00014-00-1-0621

In [1], we present an algorithm for infinitesimal motion estimation from multiple central panoramic views. We first derive the optical flow equations for central panoramic cameras as a function of both pixel coordinates and back-projection rays. We then derive a rank constraint on the optical flows across many frames, which must lie in a six-dimensional subspace of a higher-dimensional space. Finally, we propose a factorization approach for recovering camera motion and scene structure.

In [2], we generalize this approach to the case in which there are multiple independently moving objects. We show that one can apply a factorization technique to estimate the number of independently moving objects and segment the omnidirectional optical flow, based on the fact that the flows generated by independently moving objects lie in mutually orthogonal subspaces.

We present experimental results of the proposed motion segmentation and estimation algorithm on a real image sequence with two independently moving mobile robots, and evaluate the performance of the algorithm by comparing the vision estimates with ground-truth GPS measurements gathered by the mobile robots.
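The underlying subspace idea can be illustrated with synthetic data standing in for the stacked flow measurements: the number of motions follows from the rank of the data matrix, and the mutual orthogonality of the subspaces yields the segmentation. (Dimensions, sample counts, and thresholds below are illustrative.)

```python
import numpy as np

rng = np.random.default_rng(1)

def random_orthonormal(dim, k, rng):
    Q, _ = np.linalg.qr(rng.normal(size=(dim, k)))
    return Q[:, :k]

# Synthetic stand-in for stacked optical-flow measurements: vectors drawn
# from two mutually orthogonal six-dimensional subspaces of R^20, one
# subspace per independently moving object.
basis = random_orthonormal(20, 12, rng)
B1, B2 = basis[:, :6], basis[:, 6:]
X = np.hstack([B1 @ rng.normal(size=(6, 30)),    # 30 measurements, motion 1
               B2 @ rng.normal(size=(6, 40))])   # 40 measurements, motion 2

# Number of motions from the rank of the data matrix (6D subspace per motion)
sv = np.linalg.svd(X, compute_uv=False)
rank = int(np.sum(sv > 1e-8 * sv[0]))
print(rank // 6)  # → 2 independent motions

# Segmentation: measurements from different motions have zero inner product
U = X / np.linalg.norm(X, axis=0)
same_group = np.abs(U.T @ U) > 1e-6
print(same_group[0, :30].all(), same_group[0, 30:].any())
```

In real data the subspaces are only approximately orthogonal, so the thresholding above is replaced by a proper clustering of the singular structure.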

Figure 1: The optical flow of an omnidirectional camera viewing two independently moving mobile robots

Figure 2: The multi-body motion segmentation of the mobile robots

- [1] O. Shakernia, R. Vidal, and S. Sastry, "Infinitesimal Motion Estimation from Multiple Central Panoramic Views," *IEEE Workshop on Vision and Motion Computing,* Orlando, FL, December 2002.
- [2] O. Shakernia, R. Vidal, and S. Sastry, "Multi-Body Motion Estimation and Segmentation from Multiple Central Panoramic Views," *IEEE Conf. Robotics and Automation,* 2003 (submitted).

(Professor S. Shankar Sastry)

(ONR) N00014-00-1-0621

We propose a simple analytic solution to the eigenvector segmentation problem. In the absence of noise, we derive a rank constraint, from which one can determine the number of groups *n* present in the scene. Once *n* has been determined, the segmentation of the image measurements can be obtained from the roots of a polynomial of degree *n* in one variable. Since the coefficients of the polynomial are computed by solving a linear system, we show that there is a unique global solution to the eigenvector segmentation problem, which can be obtained in closed form when *n* < 5.

This purely algebraic solution is shown to be robust in the presence of noise and can be used to initialize an optimal algorithm. We derive such an optimal objective function for the case of zero-mean Gaussian noise on the entries of the eigenvector. We then study the case of multiple eigenvectors and show that it can be reduced to a collection of polynomial segmentation problems.

We present experimental results on intensity-based and motion-based image segmentation. Our experiments show that our algorithm performs as well as or better than K-means and EM, and is at least eight times faster. This is because it only needs to solve one linear system in *n* variables plus one polynomial of degree *n* in one variable.
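A toy version of the one-eigenvector case with n = 2 illustrates the construction (cluster values and noise level are made up): every entry approximately satisfies one quadratic, whose coefficients come from a linear system and whose roots are the group centers.

```python
import numpy as np

rng = np.random.default_rng(2)

# Eigenvector entries clustered around two unknown values (n = 2 groups)
x = np.concatenate([1.0 + 0.01 * rng.normal(size=50),
                    5.0 + 0.01 * rng.normal(size=70)])

# Each entry (approximately) satisfies p(x) = (x - c1)(x - c2) = 0, so the
# coefficient vector of p is the null vector of a Vandermonde-type matrix.
A = np.column_stack([x**2, x, np.ones_like(x)])
_, _, Vt = np.linalg.svd(A, full_matrices=False)
coeffs = Vt[-1]                          # least-squares null vector of A
centers = np.sort(np.roots(coeffs).real)
print(centers)  # ≈ [1, 5]

# Segmentation: assign each entry to the nearest root
labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
print(labels[:50].max(), labels[50:].min())  # 0 1: perfect segmentation
```

The only numerical work is one SVD of a tall thin matrix and one polynomial root-finding step, which is where the speed advantage over iterative schemes like K-means and EM comes from.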

Figure 1: A comparison of K-means, EM, and Polysegment on intensity-based segmentation of the penguin image

Figure 2: Motion-based segmentation of a sequence with two moving robots

- [1] R. Vidal, Y. Ma, and S. Sastry, *Generalized Principal Component Analysis,* UC Berkeley Electronics Research Laboratory, Memorandum No. UCB/ERL M02/15, May 2002.

(Professor S. Shankar Sastry)

(ONR) N00014-00-1-0621

Modeling time series is a fundamental problem in many scientific and engineering disciplines. In computer vision, the problem arises when one tries to characterize the dynamics of visual processes, for instance the appearance of smoke, fire, foliage, or walking gaits. All of these examples can be modeled as linear hybrid systems, that is, systems whose evolution is determined by a collection of linear models connected by switches among a number of discrete states.

In [1] and [2] we analyze the observability of the continuous and discrete states of a class of linear hybrid systems. We derive rank conditions that the structural parameters of the model must satisfy in order for filtering and smoothing algorithms to operate correctly. Our conditions are simple rank tests that exploit the geometry of the observability subspaces generated by the output of a linear hybrid system. We further derive weaker rank conditions that guarantee the uniqueness of the reconstruction of the state trajectory from a specific output, even when the hybrid system is unobservable. We also study the identifiability of the model parameters by characterizing the set of models that produce the same output measurements. Finally, when the data is generated by a model in the class, we give conditions under which the true model can be identified.
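A building block for such tests is the standard observability matrix of a single linear mode; a minimal per-mode rank check is sketched below (the joint conditions derived in [1, 2] for the hybrid system, which couple the modes and the switching, are more involved):

```python
import numpy as np

def observability_rank(A, C):
    """Rank of the observability matrix [C; CA; ...; CA^(n-1)]."""
    n = A.shape[0]
    blocks = [C @ np.linalg.matrix_power(A, k) for k in range(n)]
    return np.linalg.matrix_rank(np.vstack(blocks))

# Two discrete modes of a toy jump linear system with a scalar output
A1 = np.array([[1.0, 1.0], [0.0, 1.0]]); C1 = np.array([[1.0, 0.0]])
A2 = np.array([[1.0, 0.0], [0.0, 2.0]]); C2 = np.array([[1.0, 0.0]])

print(observability_rank(A1, C1))  # 2: mode 1 is observable
print(observability_rank(A2, C2))  # 1: mode 2 is not
```

Full rank means the continuous state of that mode can be reconstructed from its output; for the hybrid system one must additionally be able to distinguish which mode generated the output.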

- [1] R. Vidal, A. Chiuso, and S. Soatto, "Observability and Identifiability of Jump Linear Systems," *Int. Conf. Decision and Control,* Las Vegas, NV, December 2002.
- [2] R. Vidal, A. Chiuso, and S. Soatto, "Observability of Linear Hybrid Systems," *Workshop on Hybrid Systems: Computation and Control,* 2003 (submitted).

(Professor S. Shankar Sastry)

We study the multi-frame structure from motion problem when the camera translates on a plane with small baselines and arbitrary rotations. This case shows up in many practical applications, for example in ground robot navigation. We consider the framework for small baselines presented in [1], in which a factorization method is used to compute the structure and motion parameters accurately, efficiently, and with guaranteed convergence. When the camera translates on a plane, the algorithm in [1] cannot be applied because the estimation matrix drops rank, so the equations are no longer linear. In this project, we show how to solve those equations linearly while preserving the accuracy, speed, and convergence properties of the non-planar algorithm. We evaluate the proposed algorithms on synthetic and real image sequences and compare our results with those of the optimal algorithm. The proposed algorithms are fast and accurate, produce fewer than 0.3% outliers, and work well for small-to-medium baselines and for both planar and non-planar motions.

- [1] J. Oliensis, "A Multi-Frame Structure-from-Motion Algorithm under Perspective Projection," *Int. J. Computer Vision,* Vol. 34, No. 2-3, 1999.
- [2] R. Vidal and J. Oliensis, "Structure from Planar Motions with Small Baselines," *European Conf. Computer Vision,* Copenhagen, Denmark, May 2002.

(Professor S. Shankar Sastry)

(ONR) N00014-00-1-0621

In [1-3], we consider the problem of distributed formation control for nonholonomic mobile robots equipped with central-panoramic cameras. Our approach is to translate the formation control problem from the configuration space into separate visual servoing tasks in the image plane. We present an algorithm for infinitesimal multi-body motion estimation and segmentation from multiple central panoramic views. We derive a rank constraint on the optical flows across many frames, from which one can estimate the position and velocity of each leader in the paracatadioptric plane of the follower. We design a tracking controller for each mobile robot and show that there exist degenerate formation configurations due to the nonholonomy of the robot dynamics and the nonlinearity of the central-panoramic projection model. The control law naturally incorporates collision avoidance by using a navigation function that exploits the geometry of central-panoramic cameras. We present experimental results on multi-body motion estimation and segmentation of a real image sequence, and simulation results on vision-based control for many formation configurations.

Figure 1: A sample motion segmentation of mobile robots based on omni-directional vision.

Figure 2: Three mobile robots switching from a triangle to a line formation while avoiding collision.
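A minimal kinematic sketch of the follower's task: a unicycle steering toward a desired offset from a moving leader (the gains and the polar-coordinate control law here are illustrative, not the paper's image-plane controller):

```python
import numpy as np

dt, k_v, k_w = 0.01, 2.0, 4.0
offset = np.array([-1.0, 0.5])          # desired slot relative to the leader

def leader(t):
    return np.array([0.5 * t, 0.0])     # leader drives along the x-axis

x, y, theta = -3.0, 2.0, 0.0            # follower's unicycle state
for i in range(3000):
    goal = leader(i * dt) + offset
    dx, dy = goal[0] - x, goal[1] - y
    rho = np.hypot(dx, dy)                            # distance to the slot
    alpha = np.arctan2(dy, dx) - theta                # bearing error
    alpha = np.arctan2(np.sin(alpha), np.cos(alpha))  # wrap to (-pi, pi]
    v = k_v * rho * np.cos(alpha)                     # forward-speed command
    w = k_w * alpha                                   # turn-rate command
    x += v * np.cos(theta) * dt
    y += v * np.sin(theta) * dt
    theta += w * dt

err = np.hypot(*(leader(3000 * dt) + offset - np.array([x, y])))
print(err)  # bounded tracking error (≈ leader speed / k_v at steady state)
```

The nonholonomic constraint appears in the state update: the follower can only move along its heading, which is precisely what creates the degenerate configurations mentioned above.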

- [1] R. Vidal, O. Shakernia, and S. Sastry, "Omni-Directional Vision-based Distributed Formation Control of Nonholonomic Mobile Robots," *IEEE Int. Conf. Robotics and Automation,* 2003 (submitted).
- [2] R. Vidal, O. Shakernia, and S. Sastry, "Distributed Formation Control with Omnidirectional Vision Based Motion Segmentation and Visual Servoing," *IEEE Magazine on Panoramic Robotics,* 2002 (submitted).
- [3] R. Vidal, O. Shakernia, and S. Sastry, "Omnidirectional Vision-based Formation Control," *Allerton Conf. Communication, Control, and Computing,* Monticello, IL, October 2002.

(Professor S. Shankar Sastry)

(ONR) N00014-00-1-0621

We present an algebraic geometric approach for segmenting both static and dynamic scenes from image intensities. We introduce the multibody affine constraint as a geometric relationship between the motion of multiple objects and the image intensities generated by them. This constraint is satisfied by all the pixels, regardless of the body to which they belong and regardless of depth discontinuities or perspective effects. We propose a polynomial factorization technique that estimates the number of affine motion models as well as their motion parameters in polynomial time. The factorization technique is used to initialize a nonlinear algorithm that minimizes the algebraic error defined by the multibody affine constraint. Our approach is based solely on image intensities, hence it does not require feature tracking or correspondences. It is therefore a natural generalization of the so-called direct methods in single-body structure from motion to multiple moving objects. We present simulation and experimental results that validate our approach.

- [1] R. Vidal and S. Sastry, "Segmentation of Dynamic Scenes from Image Intensities," *IEEE Workshop on Motion and Video Computing,* Orlando, FL, December 2002.

(Professor S. Shankar Sastry)

(ONR) N00014-00-1-0621

We study the problem of estimating the motion of independently moving objects observed by a moving perspective camera. We show that infinitesimal image measurements corresponding to independent motions lie on orthogonal six-dimensional subspaces of a higher-dimensional linear space. We propose a factorization algorithm that estimates the number of independent motions, the segmentation of the image points, and the motion of each object relative to the camera from a set of image points and their optical flows in multiple frames. We evaluate the proposed algorithm on synthetic and real image sequences.

Figure 1: Segmentation of two independent motions: the camera and the car.

- [1] R. Vidal, S. Soatto, and S. Sastry, "A Factorization Method for Multibody Motion Estimation and Segmentation," *Allerton Conf.,* Monticello, IL, October 2002.

(Professor S. Shankar Sastry)

(ONR) N00014-00-1-0621

We present a vision-based algorithm for the detection of multiple autonomous vehicles for a pursuit-evasion game scenario. Our algorithm computes estimates of the poses of multiple moving evaders from visual information collected by multiple moving pursuers, without previous knowledge of the segmentation of the image measurements or the number of moving evaders. We evaluate our algorithm in pursuit-evasion games with unmanned ground and aerial vehicles.

Figure 1: A pursuit-evasion game with two ground pursuers, one aerial pursuer, and one ground evader.

- [1] R. Vidal and S. Sastry, "Vision-based Detection of Autonomous Vehicles for Pursuit-Evasion Games," *IFAC World Congress on Automatic Control,* Barcelona, Spain, July 2002.

(Professor S. Shankar Sastry)

(ONR) N00014-00-1-0621

We consider the problem of identifying *n* linear subspaces in a K-dimensional linear space from a collection of sample points drawn from these subspaces.

In the absence of noise, we cast GPCA in an algebraic geometric framework in which the number of classes *n* becomes the degree of a certain polynomial and the classes themselves become the factors (roots) of such a polynomial. For subspaces of dimension k=K-1, we show that GPCA is equivalent to factoring homogeneous polynomials of degree *n* in K variables, and provide a polynomial time solution to this problem using linear algebraic techniques. For subspaces of arbitrary dimension k < K-1, we show that GPCA can be reduced to the case k = K'-1 by first projecting the data into a K'-dimensional subspace of R^{K}.

In the presence of noise, we cast GPCA as a constrained nonlinear least squares problem which minimizes the reprojection error subject to all mixture constraints. By converting this constrained problem into an unconstrained one, we obtain an optimal function from which the subspaces can be directly recovered using standard non-linear optimization techniques. We use the algebraic solution to initialize maximum likelihood algorithms based on nonlinear optimization or iterative algorithms such as EM.

GPCA can be applied to a variety of estimation problems in which the data comes simultaneously from multiple (approximately) linear models. We apply GPCA to the multibody 3D and affine structure from motion problem in computer vision, i.e., the problem of estimating the 3D motion of multiple moving objects from 2D imagery. We also apply GPCA to image segmentation and eigenvector segmentation.

Figure 1: Generalized principal component analysis: the case of 2 lines in the XY plane
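For the two-lines case of Figure 1, the construction reduces to fitting a single homogeneous quadratic and factoring it into two linear forms (a toy sketch with noiseless data):

```python
import numpy as np

rng = np.random.default_rng(3)

# Sample points from two lines through the origin: y = x and y = -2x
s = rng.normal(size=60)
x = np.concatenate([s[:30], s[30:]])
y = np.concatenate([s[:30], -2.0 * s[30:]])

# Every point satisfies the homogeneous quadratic
# p(x, y) = (y - m1*x)(y - m2*x) = 0, whose coefficient vector spans the
# null space of the data embedded as (x^2, xy, y^2).
A = np.column_stack([x**2, x * y, y**2])
_, _, Vt = np.linalg.svd(A, full_matrices=False)
c = Vt[-1]                              # c0*x^2 + c1*xy + c2*y^2

# Substituting y = m*x and dividing by x^2 gives c2*m^2 + c1*m + c0 = 0
slopes = np.sort(np.roots([c[2], c[1], c[0]]).real)
print(slopes)  # ≈ [-2, 1]
```

The number of lines is the degree of the fitted polynomial, and the lines themselves are its factors, which is exactly the algebraic picture described above.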

- [1] R. Vidal, Y. Ma, and S. Sastry, *Generalized Principal Component Analysis,* UC Berkeley Electronics Research Laboratory, Memorandum No. UCB/ERL M02/15, May 2002.

(Professor S. Shankar Sastry)

(ONR) N00014-00-1-0621

In [1,2] we present a geometric approach for the analysis of dynamic scenes containing multiple rigidly moving objects seen in two perspective views. Our approach exploits the algebraic and geometric properties of the so-called multibody epipolar constraint and its associated multibody fundamental matrix, which are natural generalizations of the epipolar constraint and of the fundamental matrix to multiple moving objects. We derive a rank constraint on the image points from which one can estimate the number of independent motions and linearly solve for the multibody fundamental matrix. We prove that the epipoles of each independent motion lie exactly in the intersection of the left null space of the multibody fundamental matrix with the so-called Veronese surface. We then show that individual epipoles and epipolar lines can be uniformly and efficiently computed by using a novel polynomial factorization technique. Given the epipoles and epipolar lines, the estimation of individual fundamental matrices becomes a linear problem. Then, motion and feature point segmentation are automatically obtained from either the epipoles and epipolar lines or the individual fundamental matrices.

In [3] we present an algebraic geometric approach for segmenting both static and dynamic scenes from image intensities. We introduce the multibody affine constraint as a geometric relationship between the motion of multiple objects and the image intensities generated by them. This constraint is satisfied by all the pixels, regardless of the body to which they belong and regardless of depth discontinuities or perspective effects. We propose a polynomial factorization technique that estimates the number of affine motion models as well as their motion parameters in polynomial time. The factorization technique is used to initialize a nonlinear algorithm that minimizes the algebraic error defined by the multibody affine constraint. Our approach is based solely on image intensities, hence it does not require feature tracking or correspondences. It is therefore a natural generalization of the so-called direct methods in single-body structure from motion to multiple moving objects.

Figure 1: Segmentation of three independently moving objects

Figure 2: Segmentation of two moving robots

- [1] R. Vidal, Y. Ma, S. Soatto, and S. Sastry, "Two-View Multibody Structure from Motion," *Int. J. Computer Vision* (submitted).
- [2] R. Vidal, S. Soatto, Y. Ma, and S. Sastry, "Segmentation of Dynamic Scenes from the Multibody Fundamental Matrix," *ECCV Workshop on Vision and Modeling of Dynamic Scenes,* Copenhagen, Denmark, May 2002.
- [3] R. Vidal and S. Sastry, "Segmentation of Dynamic Scenes from Image Intensities," *IEEE Workshop on Vision and Motion Computing,* Orlando, FL, December 2002.
