Automated, Scalable, Airborne Only, 3D Modeling of Urban Environments
Avideh Zakhor and Min Ding
Textured 3D city models are needed in many applications such as city planning, 3D mapping, and photorealistic fly-throughs and drive-throughs of urban environments. While combined ground/aerial modeling yields photorealistic models suitable for both virtual walk-throughs and drive-throughs, generating such models is extremely time-consuming because every street needs to be driven through and mapped properly. In contrast, airborne-only modeling can potentially be done extremely efficiently, since flying over a region is much faster than driving within it. As such, airborne-only modeling could result in fast, scalable 3D mapping of major urban areas around the globe.
3D model geometries are typically generated from stereo aerial photographs or range sensors such as lidar. Fast, automated mapping of detailed aerial textures onto the 3D geometry models is a challenging problem and the subject of this project. Traditionally, this has been done by a human operator manually establishing correspondences between landmark features in the 3D model and the 2D imagery. This approach is extremely time-consuming and does not scale to large regions. Frueh et al. developed an approach in 2004 that starts with a coarse GPS/INS readout and refines it via exhaustive search. This technique would take approximately 20 hours per image on today's computers.
In this project, we have developed a scheme that can automatically register aerial imagery onto 3D geometry models in a matter of minutes instead of hours. We start with a coarse GPS/INS readout and then refine the pose by applying a series of algorithms: (a) vanishing point detection; (b) 2D corner detection and correspondence between the 3D model and the 2D imagery; (c) a Hough transform to prune possible matches; (d) a generalized RANSAC algorithm to find inlier matches; and (e) Lowe's algorithm to compute the 3D camera pose. There are a number of innovations in this project. The first is a new vanishing point detection algorithm for aerial images of complex urban scenes. The detected vanishing points provide a coarse camera pose estimate and extract inherently 3D information from the 2D imagery. The second is a new feature point, the 2D corner, which is detected in the images based on the 3D information captured by the vanishing points. 2D corners are used as point correspondences for camera pose refinement. The last innovation is the overall system design and the flow of the algorithmic steps required to solve this problem. By taking advantage of the parallelism and orthogonality inherently present in man-made structures, our system is able to apply well-justified algorithms to provide a fast and truly automated camera registration solution for texture mapping. Our proposed system achieves 91% accuracy in recovering camera pose for 90 oblique aerial images of the downtown Berkeley area. Accuracy is around 50% for residential areas, which is the topic of our current investigation.
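To make step (a) more concrete, the following is a minimal sketch of one textbook way to estimate a vanishing point: a least-squares intersection of the image lines through a set of segments that are assumed to come from parallel 3D edges. It is not the new detection algorithm described above, whose details are not given here. The function names `line_from_segment` and `estimate_vanishing_point` are hypothetical, and the sketch assumes the segments have already been grouped by direction and that the vanishing point is finite in the image plane.

```python
import math


def line_from_segment(p, q):
    """Return (a, b, c) such that a*x + b*y = c, with (a, b) a unit normal.

    p and q are (x, y) endpoints of a 2D line segment in image coordinates.
    """
    (x1, y1), (x2, y2) = p, q
    # The normal is perpendicular to the segment direction (x2 - x1, y2 - y1).
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("degenerate segment: endpoints coincide")
    a, b = a / norm, b / norm
    return a, b, a * x1 + b * y1


def estimate_vanishing_point(segments):
    """Least-squares intersection point of the lines through `segments`.

    Minimizes the sum of squared perpendicular distances from the point to
    every line by solving the 2x2 normal equations in closed form.
    Assumption: all segments belong to one family of parallel 3D edges.
    """
    sxx = sxy = syy = bx = by = 0.0
    for p, q in segments:
        a, b, c = line_from_segment(p, q)
        sxx += a * a
        sxy += a * b
        syy += b * b
        bx += a * c
        by += b * c
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # All lines are parallel in the image: the vanishing point is at infinity.
        raise ValueError("no finite vanishing point for these segments")
    x = (syy * bx - sxy * by) / det
    y = (sxx * by - sxy * bx) / det
    return x, y


# Three synthetic segments whose supporting lines all pass through (100, 50).
segments = [((0, 0), (10, 5)), ((0, 100), (20, 90)), ((50, -50), (60, -30))]
vx, vy = estimate_vanishing_point(segments)
```

On real imagery, segments would be noisy and mixed across several directions, so a robust grouping step (for example a RANSAC-style vote, as in step (d) of the pipeline) would normally precede a fit like this one.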
Figure 1: An example of a residential area in downtown Berkeley that has been texture-mapped with 8 airborne pictures on top of 3D geometry obtained from 1/2 meter resolution airborne lidar data.
- C. Früh, R. Sammon, and A. Zakhor, "Automated Texture Mapping of 3D City Models With Oblique Aerial Imagery," 2nd International Symposium on 3D Data Processing, Visualization, and Transmission, Thessaloniki, Greece, September 2004, pp. 396-403.