Model-driven Compression of 3D Tele-Immersion Data
Jyh-Ming Lien and Ruzena Bajcsy
Data captured by a Tele-Immersion (TI) system can be very large. Therefore, compression of the TI data, i.e., a stream of 3D point clouds with color information, is critical for real-time transmission. The main idea of this work is to compress point clouds representing human-like shapes captured by the TI system using skeleton extraction and skeleton fitting techniques. Instead of transmitting the point clouds, we can simply transmit the skeleton; the receiver at a remote site can then reconstruct the 3D points from the skeleton and the first frame.
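The receiver-side reconstruction can be sketched as follows: each point of the first frame is assigned to a skeleton link, and each later frame is rebuilt by applying that frame's per-link rigid transformation to the corresponding points. This is a minimal illustration, not the authors' implementation; the function name and data layout are assumptions.

```python
import numpy as np

def reconstruct_frame(first_frame_points, labels, transforms):
    """Rebuild one frame from the first-frame cloud (hypothetical sketch).

    first_frame_points: (N, 3) array of points from the first frame.
    labels: (N,) array assigning each point to a skeleton link.
    transforms: dict mapping link id -> (R, t), the 3x3 rotation and
                3-vector translation of that link for this frame.
    """
    out = np.empty_like(first_frame_points)
    for link, (R, t) in transforms.items():
        mask = labels == link
        # Rigid-body motion: rotate, then translate, the link's points.
        out[mask] = first_frame_points[mask] @ R.T + t
    return out
```

Only the small `(R, t)` pairs per link need to be transmitted per frame, instead of the full point cloud.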
This approach is based on the assumption that the points move under rigid-body transformations along with their associated links. When the data deviate from this assumption, e.g., due to muscle, hair, or cloth movements, we can encode and compress the prediction residuals, i.e., the deviations from the rigid link motion, to obtain better or even lossless reconstruction. In most cases such deviations, e.g., muscle movements and small cloth deformations, are small.
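The residual scheme can be illustrated with a simple quantized encoder: the residual between the captured points and the rigid-link prediction is small in magnitude, so it compresses well. The quantization step and function names below are assumptions for illustration, not the authors' codec.

```python
import numpy as np

def encode_frame(actual, predicted, step=0.001):
    # Residuals between captured points and the rigid-link prediction.
    # Small residuals quantize to small integers, which entropy-code well.
    residual = actual - predicted
    return np.round(residual / step).astype(np.int32)

def decode_frame(predicted, quantized, step=0.001):
    # Add the dequantized residual back onto the prediction; the
    # reconstruction error is bounded by half the quantization step.
    return predicted + quantized.astype(np.float64) * step
```

Transmitting the raw (unquantized) residuals instead would make the reconstruction lossless at higher bit cost.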
Note that, beyond compression, applications of this real-time skeletonization technique include human activity recognition and analysis, vision-based animation control, and task collaboration in virtual worlds.
Related work: Skeletonization and compression of static or dynamic point-cloud data are not new problems. However, several difficulties have not yet been addressed in the literature, making this problem (compressing TI data via skeletonization) very challenging. The key difficulties include:
- real-time computation;
- no data accumulation allowed; and
- no correspondence information, which may not even be computable.
Our approach: To address these challenges, we allow a computationally more expensive (not necessarily real-time) initialization to generate an accurate skeleton of the subject; a real-time tracking method then fits this extracted skeleton to the point clouds captured from the remaining movements. Several existing methods can provide an initial skeleton to start the process. As mentioned previously, no known method tracks moving point clouds in real time without given correspondences. To track whole-body movement in real time, we use an extended version of the Iterative Closest Point (ICP) algorithm, called ArtICP, and have obtained high-quality tracking results within the given time constraint (10~20 frames per second) for point clouds with a few thousand points. Tracking results for Tai Chi motion are shown in the figures below.
ArtICP takes an initial skeleton as input and computes a segmentation of the point cloud from the first frame using the skeleton's distance field. For each part in the segmentation, we compute the rigid transformation to each of the remaining frames. Like most ICP variants, ArtICP consists of three key steps: finding correspondences, computing the optimal transformations, and applying the transformations. Note that ArtICP applies each transformation not only to the link under consideration but also to its child links, exploiting the skeleton hierarchy, i.e., child links generally move with their parents. When a child link does not follow its parent's movement, its motion is generally constrained and easy to track.
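The two core computations of each ICP iteration can be sketched as follows: nearest-neighbor correspondences, and the least-squares rigid transformation between corresponding point sets (the standard Kabsch/SVD solution). This is a generic sketch of these ICP building blocks, not the authors' ArtICP code; in ArtICP the resulting (R, t) of a link would also be propagated to its child links.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t with dst ~= R @ src + t (Kabsch/SVD solution),
    the 'optimal transformation' step of ICP. src, dst: (N, 3) arrays
    of corresponding points."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)        # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t

def closest_points(src, dst):
    """Brute-force nearest neighbors for the 'finding correspondences'
    step; a k-d tree would replace this in a real-time implementation."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
    return dst[d2.argmin(axis=1)]
```

Iterating these two steps per link, then reapplying each link's transformation to its children, reflects the hierarchy exploitation described above.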
Figure 1: Top: Animated point clouds; Bottom: Skeletons fitted to the point clouds