We compared our method to correspondence search using the classic L2 norm, using normalized correlation, using a robust redescending norm (from , a Lorentzian with ), and using our RCS transform with . The L2 norm and normalized correlation yielded substantially similar results, so for brevity we show only L2 results here.
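To make the contrast between the comparison measures concrete, the following is a minimal sketch of a window-based L2 distance and a Lorentzian redescending distance; the function names, window arrays, and `sigma` default are our own illustrative choices, not the paper's implementation:

```python
import numpy as np

def l2_distance(a, b):
    """Plain L2 matching cost: sum of squared differences
    between two image windows."""
    d = a.astype(float) - b.astype(float)
    return float(np.sum(d * d))

def lorentzian_distance(a, b, sigma=1.0):
    """Robust redescending (Lorentzian) cost: each residual r
    contributes log(1 + r^2 / (2 sigma^2)), which grows slowly
    for large r, so outlier pixels (e.g., occluded regions) are
    downweighted instead of dominating the sum as they do in L2."""
    d = a.astype(float) - b.astype(float)
    return float(np.sum(np.log1p((d / sigma) ** 2 / 2.0)))
```

Note how a single large residual adds quadratically to the L2 cost but only logarithmically to the Lorentzian cost, which is why the robust norm tolerates partial occlusion better.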
First we note that at the majority of image locations all three methods yield accurate results. It is only at points near discontinuities, and especially at points where the discontinuity changes contrast sign between images, that there is a dramatic difference between RCS and the comparison methods. We therefore demonstrate performance at a disproportionate number of these points; they are often critical locations for image analysis/synthesis tasks.
Figure 5 compares correspondence values for a fingertip at various background locations (A, B, C) and for a distractor region (D) of the hand. The table in Figure 5(h) shows that only the RCS method performs correctly, yielding low distance measures for all cases of correspondence between actual fingertips (A:B, A:C, B:C) and high distances for the cases involving the distractor (A:D, B:D, C:D).
Figures 6 and 7 show results from tracking 16 features simultaneously on image pairs of an eye, mouth, and fingers, compared against hand-labeled ground truth. The mean coordinate error across the three images was 5.6 pixels for the L2 norm, 5.2 pixels for the redescending robust norm, and 0.97 pixels for the RCS method. The images were processed at 320×240 resolution. As expected, the L2 norm had difficulty at regions with substantial occlusion, and the redescending robust norm had problems where the designated correspondence lay at a region of occlusion contrast sign reversal. At points where no occlusion was present the L2 and redescending norms had no coordinate error, but RCS returned erroneous correspondences in approximately of points.
This lower performance of RCS away from occlusion boundaries is not surprising: when analyzing an image window on a single surface where brightness constancy holds (i.e., there is no occlusion), downweighting portions of the window that are actually foreground yields suboptimal performance. Informally, regions of high contrast that are prone to aliasing in the RCS representation can be detected by computing the sum of the radial cumulative similarity function N: if that sum falls below a certain threshold, the RCS transform should be considered degenerate. Fortunately, occlusion-free regions of high contrast are exactly the cases where the traditional methods perform exceedingly well. We are currently implementing a hybrid algorithm that reverts to the L2 norm when that method yields good results. Alternatively, a smoothing or regularization stage would also greatly alleviate this problem.
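The degeneracy test and hybrid fallback described above could be sketched as follows; the representation of N as an array of per-pixel similarity values and the threshold parameter are hypothetical illustrations, not the authors' implementation:

```python
import numpy as np

def rcs_is_degenerate(N, threshold):
    """Flag a window whose RCS transform is unreliable.

    N is assumed to hold the radial cumulative similarity values
    over the window.  If their sum falls below the threshold, too
    much of the window has been downweighted and the RCS transform
    should be considered degenerate."""
    return float(np.sum(N)) < threshold

def hybrid_distance(N, rcs_dist, l2_dist, threshold):
    """Hybrid rule: revert to the L2 distance when the RCS
    representation is degenerate, otherwise use the RCS distance."""
    return l2_dist if rcs_is_degenerate(N, threshold) else rcs_dist
```

A natural choice of threshold would scale with the window size (e.g., some fraction of the number of pixels), so the test measures what fraction of the window retains support.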