
Motivation

Separating dynamic objects, such as people, from a relatively static background scene is an important preprocessing step in many computer vision applications. Accurate and efficient background removal is critical for interactive games[7], person detection and tracking[1,4], and graphical special effects. One of the most common approaches to this problem is color or greyscale background subtraction. This technique has two typical failure modes: foreground objects that share colors with the background produce holes in the computed foreground, and shadows or other variable lighting conditions cause background elements to be included in the computed foreground.

In this paper we present a passive method for background estimation and removal based on the joint use of range and color, which produces better results than can be achieved with either data source alone. This approach is now practical for general applications because inexpensive real-time passive range data is becoming more accessible through novel hardware[10] and increased CPU processing speeds. The joint use of color and range produces a cleaner segmentation of the foreground than either the commonly used color-based background subtraction or range-based segmentation.

**Figure 1:** Color background subtraction has difficulty when portions of the foreground include the same colors as the background. The top left image shows the color background model, and the top right image shows the color image from the scene. The bottom image shows the segmentation that results from comparing these images. The range background model and image are also shown for reference, although they are not used in this segmentation.

Background subtraction based on color or intensity is a commonly used technique for quickly identifying foreground elements. In current systems[3,4,11], performance is improved by using statistical models to represent the background (e.g., single or multiple Gaussians at each pixel) and by updating these models over time to account for slow changes. There are two classic problems with this approach. First, if regions of the foreground contain colors similar to the background, they can be erroneously removed. Second, shadows cast on the background can be erroneously selected as foreground. The shadow problem can be reduced by computing differences in a color space (hue, log color opponent, intensity-normalized RGB[11]) that is less sensitive to intensity change. However, it is difficult to tune a single match criterion so that it allows most shadowed pixels to match their normal background color yet does not allow regions of the true foreground to match background pixels with similar hue. Figure 1 shows an example of color-based segmentation failure.
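The trade-off between shadow tolerance and foreground sensitivity can be made concrete with a toy example. The following is a minimal sketch, not the method of any of the cited systems: it assumes a single Gaussian per pixel with hypothetical means, standard deviations, and a 3-sigma threshold, and the names `chromaticity` and `is_foreground` are introduced here only for illustration. It compares one background pixel against a shadowed view of the same surface and against a foreground pixel of the same hue, first in raw RGB and then in intensity-normalized color.

```python
def chromaticity(rgb):
    """Intensity-normalized color: (r, g) = (R, G) / (R + G + B)."""
    total = sum(rgb)
    if total == 0:
        # Chromaticity is undefined for a black pixel; return a placeholder.
        return (0.0, 0.0)
    return (rgb[0] / total, rgb[1] / total)


def is_foreground(value, mean, std, k=3.0):
    """True if any channel lies more than k standard deviations from the mean."""
    return any(abs(v - m) > k * s for v, m, s in zip(value, mean, std))


# Hypothetical single-Gaussian background model for one pixel (assumed values).
bg_rgb_mean = (120.0, 90.0, 60.0)
bg_rgb_std = (5.0, 5.0, 5.0)
bg_chroma_mean = chromaticity(bg_rgb_mean)
bg_chroma_std = (0.02, 0.02)

# Two observations of that pixel in a new frame (assumed values).
shadowed = (84.0, 63.0, 42.0)       # same surface at 70% brightness
similar_hue = (160.0, 120.0, 80.0)  # true foreground, same hue but brighter

for name, pixel in (("shadowed", shadowed), ("similar_hue", similar_hue)):
    rgb_fg = is_foreground(pixel, bg_rgb_mean, bg_rgb_std)
    chroma_fg = is_foreground(chromaticity(pixel), bg_chroma_mean, bg_chroma_std)
    print(name, "RGB foreground:", rgb_fg, "normalized foreground:", chroma_fg)
```

Under these assumed values, the raw RGB test flags the shadowed background pixel as foreground, while the normalized test accepts the shadow but also accepts the same-hue foreground pixel as background. No single threshold on one of these measures resolves both cases.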

Range has also been used for background removal[2,5,6]. The main issue with this approach is that depth computation via stereo, which relies on finding correspondences between two images, does not produce valid results in low-contrast regions or in regions that cannot be seen in both views. In our stereo implementation (described in section 2.1), these low-confidence cases are detected and marked with a special value that we will refer to as invalid. It is rare for all pixels in the scene to have valid range on which to base a segmentation decision. It is also difficult to use range data to segment foreground objects that are at approximately the same distance as the background. Figure 2 shows an example of range-based segmentation failure.
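The two limitations of range-only segmentation can be illustrated on a single row of pixels. This is a minimal sketch under stated assumptions, not the stereo implementation referred to above: `INVALID` is a hypothetical stand-in for the invalid marker (its real encoding is not given here), `range_label` and the 0.2 m tolerance are invented for illustration, and returning "unknown" whenever either depth is invalid is one possible policy among several.

```python
INVALID = None  # hypothetical stand-in for the "invalid" range marker


def range_label(depth, bg_depth, tol=0.2):
    """Label one pixel from range alone.

    Returns "unknown" when either measurement is invalid, "foreground" when
    the depth differs from the background by more than tol (meters, assumed),
    and "background" otherwise.
    """
    if depth is INVALID or bg_depth is INVALID:
        return "unknown"
    if abs(depth - bg_depth) > tol:
        return "foreground"
    return "background"


# One row of a hypothetical range background model and a new range image.
bg_row = [3.0, 3.0, INVALID, 3.0, 3.0]
scene_row = [3.0, 1.5, 1.5, INVALID, 2.9]

labels = [range_label(d, b) for d, b in zip(scene_row, bg_row)]
print(labels)
```

In this toy row, the third and fourth pixels receive no decision because one of the two range values is invalid, and the last pixel, which could belong to a person standing close to a wall, is labeled background because its depth is within the tolerance of the background depth.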

**Figure 2:** The middle images show the range background model and the new scene image. Stereo computation cannot produce valid range estimates in areas that have very low texture (e.g., saturated regions) or that are occluded in one view. Invalid range values are shown in white. Depth-based segmentation, shown in the bottom image, fails in regions of the foreground that are undefined in depth. The top row shows the color background model and scene image for reference, although they are not used in the segmentation. The color of the foreground is overlaid on the segmentation results for easier interpretation.

We present a scheme that takes advantage of the strengths of each data source for background modeling and segmentation. Background estimation is based on a multidimensional (range and color) mixture of Gaussians, and it can be performed even on sequences containing substantial foreground elements. Segmentation of the foreground is performed via background comparison in range and normalized color. For best performance, we find that we must explicitly take into account low-confidence values in range and color, as well as shadow conditions. The background estimation is described in section 2, followed by the segmentation method in section 3.


G. Gordon, T. Darrell, M. Harville, J. Woodfill. "Background estimation and removal based on range and color," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Fort Collins, CO), June 1999.