1
00:00:00,150 --> 00:00:04,600
High resolution surface reconstruction from overlapping multiple-views

2
00:00:04,650 --> 00:00:08,500
by Nader Salman and Mariette Yvinec from the Geometrica project-team at INRIA.

3
00:00:09,250 --> 00:00:19,680
Three-dimensional reconstruction of real world scenes from sequences of overlapping images is a topic of much interest in Computer Vision, Computer Graphics and Computational Geometry.

4
00:00:19,750 --> 00:00:23,450
Its application areas range from preservation of historical heritage

5
00:00:23,550 --> 00:00:30,000
to virtual reality walkthroughs, through computer animation, movie production and e-commerce.

6
00:00:31,150 --> 00:00:38,550
The challenge is to recover simplicial models of complex scenes from a set of images taken from various viewpoints around the scene.

7
00:00:38,600 --> 00:00:44,000
The models sought after are either surface meshes or 3D meshes of objects in the scene.

8
00:00:44,850 --> 00:00:49,200
The first step of the process consists of using a stereo algorithm

9
00:00:49,300 --> 00:00:51,400
to extract a set of 3D points from the images.

10
00:00:51,450 --> 00:00:53,100
To speed up the extraction phase,

11
00:00:53,200 --> 00:00:56,300
the GYROVIZ project produces sequences of located images,

12
00:00:56,400 --> 00:01:01,000
describing for each image the position of the camera as well as the shooting parameters.

13
00:01:01,150 --> 00:01:16,500
Most matchmoving software packages require highly coherent image sequences and generate sparse point clouds.

14
00:01:16,700 --> 00:01:32,000
The software we use, from [ENPC/Certis-CSTB/Mod-EVE], requires located frames and generates quasi-dense point clouds.

15
00:01:32,150 --> 00:01:36,350
In both cases the output of the image processing step is a set of tracks,

16
00:01:36,450 --> 00:01:41,500
where each track is a 3D point associated with a list of images in which the point is visible.

17
00:01:42,150 --> 00:01:46,150
The set of images and tracks constitutes the input of our reconstruction algorithm.

18
00:01:46,900 --> 00:01:48,350
The extracted tracks are not perfect:

19
00:01:48,400 --> 00:01:54,500
they include redundancies, outliers and noise.

20
00:01:54,900 --> 00:02:02,000
First, multiple tracks may correspond to the same 3D feature point and must be merged.

21
00:02:02,850 --> 00:02:06,850
As tracks come from characteristic points detected in the images,

22
00:02:06,950 --> 00:02:10,500
their 3D locations should be densely spread over the surface of the scene objects.

23
00:02:10,600 --> 00:02:16,500
Thus, we detect outliers by computing, for each track, the average squared distance to its k nearest neighbors.

24
00:02:17,000 --> 00:02:20,750
Another criterion is based upon the camera positions.

25
00:02:20,800 --> 00:02:24,150
For each track P we compute the smallest cone with apex at P

26
00:02:24,200 --> 00:02:26,800
that contains all cameras C that observed point P.

27
00:02:26,900 --> 00:02:31,150
Tracks with small cone aperture angles are discarded.

28
00:02:32,000 --> 00:02:35,600
The final preprocessing step is straightforward:

29
00:02:35,650 --> 00:02:47,150
it smooths the remaining tracks using a jet-fitting and reprojection algorithm.

30
00:02:47,700 --> 00:02:52,800
At this stage the goal is to extract, from the preprocessed tracks and the input images,

31
00:02:52,900 --> 00:02:57,100
a set of higher-level primitives conforming to the surface of the objects in the scene.
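As an illustration of the k-nearest-neighbor outlier test above, here is a minimal Python sketch; it is not the authors' implementation. It assumes the tracks are given as an N x 3 NumPy array, and the neighbor count k and the keep fraction are illustrative parameters only.

    import numpy as np
    from scipy.spatial import cKDTree

    def filter_outliers(points, k=12, keep_fraction=0.95):
        """Discard tracks whose average squared distance to their
        k nearest neighbors is unusually large (illustrative sketch)."""
        tree = cKDTree(points)
        # query k+1 neighbors because the closest "neighbor" is the point itself
        dists, _ = tree.query(points, k=k + 1)
        avg_sq = (dists[:, 1:] ** 2).mean(axis=1)
        # keep the keep_fraction of tracks with the smallest score
        threshold = np.quantile(avg_sq, keep_fraction)
        return points[avg_sq <= threshold]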
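The cone criterion can be sketched in the same spirit. Computing the exact smallest enclosing cone takes more care; the approximation below, which is this sketch's own simplification rather than the method in the video, uses the maximum pairwise angle between viewing directions as the cone aperture.

    import numpy as np

    def aperture_angle(track, cameras):
        """Approximate the aperture of the smallest cone with apex at
        `track` containing all observing `cameras` by the maximum
        pairwise angle between the viewing directions (radians)."""
        dirs = cameras - track                       # vectors from apex to cameras
        dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
        cosines = np.clip(dirs @ dirs.T, -1.0, 1.0)  # pairwise dot products
        return np.arccos(cosines).max()

    # A track seen from nearly the same spot has a tiny aperture: suspect.
    track = np.array([0.0, 0.0, 0.0])
    cams = np.array([[1.0, 0.0, 2.0], [1.1, 0.1, 2.0], [0.9, -0.1, 2.1]])
    print(aperture_angle(track, cams))  # small angle -> discard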
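Jet fitting means fitting a low-degree polynomial surface (a "jet") to each point's neighborhood and reprojecting the point onto it; CGAL offers such a smoother in its point set processing package (jet_smooth_point_set). The following is a simplified degree-2 variant with illustrative parameters: it builds a local frame by PCA and fits a quadric height field by least squares.

    import numpy as np
    from scipy.spatial import cKDTree

    def jet_smooth(points, k=16):
        """Smooth each point by fitting a degree-2 height field
        z = f(x, y) over its k-neighborhood in a local PCA frame,
        then reprojecting the point onto the fitted surface."""
        tree = cKDTree(points)
        _, idx = tree.query(points, k=k)
        out = np.empty_like(points)
        for i, nbrs in enumerate(idx):
            nb = points[nbrs]
            c = nb.mean(axis=0)
            # local frame: the last right singular vector ~ surface normal
            _, _, vt = np.linalg.svd(nb - c)
            local = (nb - c) @ vt.T
            x, y, z = local[:, 0], local[:, 1], local[:, 2]
            A = np.column_stack([np.ones_like(x), x, y, x*x, x*y, y*y])
            coef, *_ = np.linalg.lstsq(A, z, rcond=None)
            px, py, _ = (points[i] - c) @ vt.T
            pz = coef @ np.array([1.0, px, py, px*px, px*py, py*py])
            out[i] = c + np.array([px, py, pz]) @ vt
        return out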
32
00:02:57,200 --> 00:03:03,150
The final reconstruction sought after must respect the sharp creases of the scene.

33
00:03:03,950 --> 00:03:10,100
To detect those edges we apply to the input images a simple contour extraction algorithm

34
00:03:10,150 --> 00:03:12,800
based on Frei-Chen edge convolution masks.

35
00:03:12,950 --> 00:03:15,850
We will use the detected contours to create in each image

36
00:03:15,950 --> 00:03:17,150
a constrained Delaunay triangulation.

37
00:03:18,000 --> 00:03:20,600
Let us first define the constraints.

38
00:03:20,700 --> 00:03:24,500
In each image, we consider the projections of the tracks observed in that image,

39
00:03:24,550 --> 00:03:26,850
and classify these tracks

40
00:03:26,950 --> 00:03:29,700
according to their proximity to the extracted contours.

41
00:03:29,900 --> 00:03:33,550
We now consider the complete graph over all the contour tracks

42
00:03:33,650 --> 00:03:47,150
and select the edges of this graph which lie over the detected contours.

43
00:03:49,150 --> 00:03:54,900
At this stage the selected edges may not form a valid input for a constrained 2D triangulation.

44
00:03:55,050 --> 00:03:56,500
To avoid collinear segments,

45
00:03:56,600 --> 00:04:00,400
when two selected edges form a small angle we eliminate the longer one.

46
00:04:00,600 --> 00:04:01,950
Then, to avoid intersections,

47
00:04:02,000 --> 00:04:05,300
when two selected edges intersect we eliminate the longer one.

48
00:04:05,400 --> 00:04:09,000
The remaining edges are used as input segments to

49
00:04:09,050 --> 00:04:11,350
a 2D constrained Delaunay triangulation

50
00:04:11,400 --> 00:04:15,150
with all projected tracks as vertices.

51
00:04:16,500 --> 00:04:17,900
The triangles of this triangulation are then

52
00:04:17,950 --> 00:04:21,050
lifted into 3D space using the 3D coordinates

53
00:04:21,100 --> 00:04:25,500
computed by the stereo algorithm.

54
00:04:26,000 --> 00:04:29,000
This process is repeated for each image.

55
00:04:29,650 --> 00:04:33,550
The output is a 3D triangle soup in which the vast majority of the triangles correspond to

56
00:04:33,600 --> 00:04:35,900
actual surface regions in the scene.

57
00:04:36,000 --> 00:04:38,550
However, at occlusion boundaries the present scheme may produce

58
00:04:38,600 --> 00:04:41,350
triangles which connect foreground to background surfaces.

59
00:04:41,650 --> 00:04:42,950
Two filtering steps are applied to the created triangle soup to

60
00:04:43,050 --> 00:04:44,700
further enforce visibility consistency

61
00:04:44,750 --> 00:04:47,750
and to remove erroneous triangles.

62
00:04:47,800 --> 00:04:50,800
If a triangle belongs to the final surface then

63
00:04:50,850 --> 00:04:53,300
it should not be intersected by any sight segment

64
00:04:53,350 --> 00:05:05,550
joining a track's 3D position to one of the cameras observing this track.

65
00:05:05,900 --> 00:05:08,000
We now filter the big triangles,

66
00:05:08,100 --> 00:05:11,200
those whose longest edge is more than a given fraction of the bounding box.

67
00:05:11,300 --> 00:05:14,550
Among these triangles we distinguish anamorphic and well-shaped triangles.

68
00:05:14,650 --> 00:05:15,900
Anamorphic triangles

69
00:05:15,950 --> 00:05:18,200
are likely to lie in free space and should be discarded.
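Frei-Chen edge detection projects each 3x3 neighborhood onto an orthonormal basis of nine masks, four of which span the edge subspace; a pixel is classified as an edge when most of its neighborhood energy falls into that subspace. A minimal sketch, assuming a grayscale image as a 2D float array and an illustrative threshold:

    import numpy as np
    from scipy.ndimage import convolve

    s2 = np.sqrt(2.0)
    # The four masks spanning the Frei-Chen edge subspace (of nine total).
    EDGE_MASKS = [np.array(m) / (2.0 * s2) for m in (
        [[ 1,  s2,  1], [  0, 0,   0], [ -1, -s2, -1]],   # horizontal edge
        [[ 1,   0, -1], [ s2, 0, -s2], [  1,   0, -1]],   # vertical edge
        [[ 0,  -1, s2], [  1, 0,  -1], [-s2,   1,  0]],   # ripple
        [[s2,  -1,  0], [ -1, 0,   1], [  0,   1, -s2]],  # ripple
    )]

    def frei_chen_edges(img, threshold=0.3):
        """Boolean edge map: pixels whose 3x3 neighborhood energy
        lies mostly in the Frei-Chen edge subspace."""
        img = img.astype(float)
        edge_energy = sum(convolve(img, m) ** 2 for m in EDGE_MASKS)
        total_energy = convolve(img ** 2, np.ones((3, 3)))  # window energy
        ratio = np.where(total_energy > 0, edge_energy / total_energy, 0.0)
        return ratio > threshold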
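For the constrained Delaunay triangulation and the lifting step, a hedged sketch follows. It uses Shewchuk's Triangle through the Python `triangle` package as a stand-in for whatever CDT code the authors use (CGAL's Constrained_Delaunay_triangulation_2 would be the natural C++ route), and it assumes the surviving contour edges no longer intersect, so that no extra vertices are inserted.

    import numpy as np
    import triangle  # wrapper around Shewchuk's Triangle

    def lifted_triangles(pts2d, constraints, pts3d):
        """Constrained Delaunay triangulation of the projected tracks,
        lifted to 3D through the corresponding track positions.

        pts2d       : (N, 2) projections of the tracks in one image
        constraints : (M, 2) vertex indices of the selected contour edges
        pts3d       : (N, 3) 3D positions of the same tracks
        """
        # 'p' triangulates the planar straight-line graph, keeping the
        # input segments as constrained edges; with non-crossing input
        # segments no Steiner points are added, so indices stay valid.
        cdt = triangle.triangulate(
            {"vertices": pts2d, "segments": constraints}, "p")
        return pts3d[cdt["triangles"]]   # (T, 3, 3) triangle soup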
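The visibility filter reduces to a segment/triangle intersection test. A standard choice, not necessarily the authors', is the Moller-Trumbore test, sketched below for one sight segment; a real implementation would accelerate the queries with a spatial index such as an AABB tree.

    import numpy as np

    def segment_hits_triangle(p, q, tri, eps=1e-9):
        """Moller-Trumbore test: does the open segment p->q cross
        triangle tri (3x3 array of vertices)?"""
        d = q - p
        e1, e2 = tri[1] - tri[0], tri[2] - tri[0]
        h = np.cross(d, e2)
        det = e1 @ h
        if abs(det) < eps:                 # segment parallel to triangle plane
            return False
        s = p - tri[0]
        u = (s @ h) / det
        if u < 0.0 or u > 1.0:
            return False
        qv = np.cross(s, e1)
        v = (d @ qv) / det
        if v < 0.0 or u + v > 1.0:
            return False
        t = (e2 @ qv) / det
        return eps < t < 1.0 - eps         # hit strictly inside the segment

    # A triangle is rejected as soon as one sight segment
    # (track -> observing camera) pierces it.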
70
00:05:18,300 --> 00:05:19,600
Well-shaped big triangles, in contrast,

71
00:05:19,650 --> 00:05:21,600
may or may not lie on the surface of the scene,

72
00:05:21,700 --> 00:05:25,150
and are classified by measuring the cross-correlation between the images in which they have been seen.

73
00:05:25,900 --> 00:05:29,350
We generate the final scene surface by

74
00:05:29,400 --> 00:05:42,150
computing the Delaunay triangulation of the tracks restricted to the triangle soup.

75
00:05:43,100 --> 00:05:50,500
Outdoor scene example 1.

76
00:05:51,200 --> 00:06:00,500
Outdoor scene example 2.
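Checking a well-shaped big triangle against the images that see it amounts to a photo-consistency measure, typically normalized cross-correlation of the corresponding patches. A minimal sketch, assuming the two patches have already been sampled at corresponding pixels:

    import numpy as np

    def ncc(patch_a, patch_b, eps=1e-12):
        """Normalized cross-correlation of two corresponding patches,
        in [-1, 1]; values near 1 indicate photo-consistency."""
        a = patch_a.ravel() - patch_a.mean()
        b = patch_b.ravel() - patch_b.mean()
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

    # A well-shaped big triangle whose patches correlate strongly across
    # the images that observe it is kept; otherwise it is discarded.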
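The restricted Delaunay triangulation keeps the facets of the 3D Delaunay triangulation of the tracks whose dual Voronoi edges intersect the triangle soup. The sketch below is a deliberately crude stand-in for that construction: instead of intersecting dual edges with the soup, it keeps facets whose centroid lies within a tolerance eps of points sampled on the soup; eps and the sampling density are this sketch's own assumptions.

    import numpy as np
    from scipy.spatial import Delaunay, cKDTree

    def restricted_facets(tracks, soup, eps):
        """Approximate restricted Delaunay facets: facets of the 3D
        Delaunay of `tracks` whose centroid is within eps of the
        triangle soup (a (T, 3, 3) array), sampled at vertices,
        edge midpoints and centroids."""
        samples = np.vstack([
            soup.reshape(-1, 3),                                     # vertices
            soup.mean(axis=1),                                       # centroids
            ((soup + np.roll(soup, 1, axis=1)) / 2).reshape(-1, 3),  # midpoints
        ])
        tree = cKDTree(samples)
        facets = set()
        for tet in Delaunay(tracks).simplices:
            for f in ((0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)):
                facets.add(tuple(sorted(tet[list(f)])))
        kept = [f for f in facets
                if tree.query(tracks[list(f)].mean(axis=0))[0] < eps]
        return np.array(kept)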