The Photo Survey node can be used to automatically identify and match features between a set of still images, and automatically estimate the position of each camera and construct a sparse point cloud representing the 3D feature positions.
The set of images that the Photo Survey node will process must be provided by the Photo Input node.
Once the camera positions and sparse point cloud have been constructed, the overall scale and orientation of the scene can be set using an Orient Camera node.
The process of reconstructing a set of camera positions and a sparse point cloud is split into several stages, each of which is described in further detail below:
1. Point detection in each photo
2. Point matching between pairs of photos
3. Camera pose estimation and 3D point cloud construction
Generally, a large number of photos should be captured, covering the entire scene to be reconstructed from multiple positions and angles.
Photos should be taken from different camera positions. There must also be a significant amount of overlap between each photo to help with point matching. For example, when capturing photos around the outside of a building it would be advisable to take a step or two between each shot whilst pointing the camera at the building.
As the number of photos increases, so does the time required to reconstruct the scene. Point detection is performed once per photo, but point matching is performed once for every pair of photos. Unless the photos have been captured in a spatially ordered fashion (which will be described below), this means that as the number of photos increases, the time it takes to match points for the entire set can climb very quickly.
For example, when 10 photos are used, 45 pairs must be compared (9+8+7+...+3+2+1 = 45). When 100 images are used, 4950 image pairs must be examined, which will take about 110 times as long as when 10 images are used. As a general rule, doubling the number of images means it will take roughly four times as long to match.
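The pair counts quoted above follow from the formula n(n-1)/2 for the number of unordered pairs among n photos. The short Python sketch below is illustrative only (the function name is hypothetical and is not part of PFTrack). It reproduces the counts and shows why doubling the number of photos roughly quadruples the matching work.

```python
def exhaustive_pair_count(num_photos: int) -> int:
    """Number of unordered photo pairs when every photo is compared with every other."""
    return num_photos * (num_photos - 1) // 2


# Print the pair count for a few dataset sizes.
for n in (10, 20, 100, 200):
    print(f"{n} photos -> {exhaustive_pair_count(n)} pairs")
```

The ratio between the counts for 2n and n photos approaches four as n grows, which is where the "four times as long" rule of thumb comes from.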
PFTrack separates the process of detecting and matching points from that of solving for the camera positions and 3D point cloud, so the time consuming matching process generally only needs to be performed once for each scene. Point matches can then be edited as required, and the 3D camera positions and point cloud re-solved after each edit to ensure an accurate scene is reconstructed.
The overall quality of the solution is strongly affected by the number of points that are picked in each photo. There must be enough points distributed over each photo area to ensure an accurate estimate of the camera position can be made, but including too many poorly located points (for example, by reducing the Pick threshold to a very low value) may reduce the overall accuracy.
The number of points picked in each photo also affects the overall time required to match the points and solve for the camera positions.
The Matching Mode menu can be used to select an appropriate mode for matching points between images. The default option here is Exhaustive which will match points from one image to all other images.
Changing this to Ordered will match images in an ordered fashion from the first frame to the last. This can be used to speed up the matching process in situations where the images have been captured in a controlled order whereby the spatial location of one image is similar to both its neighbouring images, and each sequential image pair views a similar part of the scene. When this occurs, instead of matching one image to every other image, it will only be matched to the previous and next images in the sequence. This can greatly accelerate the matching process when large numbers of images are being used.
Another matching mode option is Ordered loop, which is similar to the Ordered mode described above, but the first image of the sequence will additionally be matched against the last image, and vice-versa. This can be used in situations where, for example, images of a building have been captured by walking in one direction around the building exterior and the last image of the sequence is viewing the same part of the scene as the first image. In this case, enabling this option will often increase the accuracy of the final 3D point cloud as it will attempt to ensure that points visible in the first and last images are matched together.
Finally, if GPS position and orientation metadata is available (for example, captured by an airborne drone), this can be used to accelerate the matching process by enabling the From Metadata matching mode.
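To make the difference between the Exhaustive, Ordered and Ordered loop modes concrete, the following Python sketch lists which photo pairs each mode would compare. It is a simplified illustration of the descriptions above, not PFTrack's implementation: the function name and mode strings are hypothetical, and the From Metadata mode is left out because its pair selection is not described here.

```python
from itertools import combinations


def photo_pairs(num_photos: int, mode: str) -> list[tuple[int, int]]:
    """Return the (i, j) photo index pairs compared in a given matching mode."""
    if mode == "exhaustive":
        # Every photo is matched against every other photo.
        return list(combinations(range(num_photos), 2))
    if mode in ("ordered", "ordered_loop"):
        # Each photo is matched only against its neighbour in the sequence.
        pairs = [(i, i + 1) for i in range(num_photos - 1)]
        if mode == "ordered_loop" and num_photos > 2:
            # Close the loop: also match the first photo against the last.
            pairs.append((0, num_photos - 1))
        return pairs
    raise ValueError(f"unknown matching mode: {mode}")


for mode in ("exhaustive", "ordered", "ordered_loop"):
    print(mode, len(photo_pairs(100, mode)))
```

For 100 photos the ordered modes compare about a hundred pairs rather than several thousand, which is why they can be so much faster when the capture order allows it.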
Regions can be specified in multiple images to assist with point matching. A region constrains the locations to which a point can be matched in other images. When a region is placed in an image, every point that falls inside it can only be matched to another point inside the region in other images. This can be helpful when using images of a scene that contains large amounts of similar image structure.
For example, the front and left sides of a building may contain many identical windows, and without any other constraints, points picked on the front of the building may be incorrectly matched to points on the side of the building in other images. Whilst the solver is robust to a certain amount of incorrect matches, large numbers of them can affect the solution and cause the solver to fail.
By creating a region for the front of the building and drawing that region in every image where it is visible, any point picked on the front will be prevented from matching a point on the side of the building.
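The effect of a region on matching can be sketched in a few lines of Python. This is only an illustration of the rule described above, under the assumption that a region is a closed polygon of pixel coordinates drawn in both images and that the constraint is applied symmetrically. The function names are hypothetical and do not correspond to any PFTrack API.

```python
Point = tuple[float, float]


def point_in_polygon(point: Point, polygon: list[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the closed polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def match_allowed(point_a: Point, region_a: list[Point],
                  point_b: Point, region_b: list[Point]) -> bool:
    """A match is allowed only if both points are inside the region, or both outside."""
    return point_in_polygon(point_a, region_a) == point_in_polygon(point_b, region_b)


# The same region (the building front) drawn in two images.
front_in_a = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
front_in_b = [(20.0, 0.0), (30.0, 0.0), (30.0, 10.0), (20.0, 10.0)]

print(match_allowed((5.0, 5.0), front_in_a, (25.0, 5.0), front_in_b))  # both on the front
print(match_allowed((5.0, 5.0), front_in_a, (40.0, 5.0), front_in_b))  # front vs. side
```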
The following image shows a region covering one part of a building that has been placed in three different images.
Once points have been detected and matched between images, the 3D camera positions and point cloud can be constructed.
Whilst EXIF metadata is used for the camera focal lengths, it is important to recognize that this data may not always be as accurate as required. For example, many cameras only report the EXIF focal length metadata to the nearest whole number (e.g. 35mm). The actual focal length may be slightly different from the reported value.
Although a difference of 0.1mm may not sound like much, this actually can correspond to an error of between 20 and 30 pixels when using high resolution images. In these situations, the Focal Adjust parameter can be used to increase the range over which the focal length is allowed to vary from the value reported by the EXIF data.
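One way to get a feel for these numbers is to express the focal length in pixel units. The Python sketch below assumes that the error is measured as the change in focal length in pixels, and uses an assumed sensor width of 23.5mm and an image width of 6000 pixels. These values are illustrative assumptions, not figures taken from PFTrack.

```python
def focal_length_pixels(focal_mm: float, sensor_width_mm: float,
                        image_width_px: int) -> float:
    """Convert a focal length in millimetres to pixel units."""
    return focal_mm * image_width_px / sensor_width_mm


# Assumed example camera: 23.5mm wide sensor, 6000 pixel wide images.
SENSOR_WIDTH_MM = 23.5
IMAGE_WIDTH_PX = 6000

reported = focal_length_pixels(35.0, SENSOR_WIDTH_MM, IMAGE_WIDTH_PX)
actual = focal_length_pixels(35.1, SENSOR_WIDTH_MM, IMAGE_WIDTH_PX)
print(f"difference: {actual - reported:.1f} pixels")
```

With these assumed values, a 0.1mm difference corresponds to a change of roughly 25 pixels in the focal length. Higher resolutions or smaller sensors make the same 0.1mm difference larger in pixel terms.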
In order to get the most accurate estimate of camera positions and 3D point cloud, it is important to account for any lens distortion present in the images. This can be done in several ways:
- Shooting calibration grids and building a distortion preset using the Photo Camera Preset editor
- Using Automatic distortion correction and specifying approximate bounds on the distortion coefficient
It is important to provide approximate bounds on each distortion coefficient when using automatic estimation by adjusting the minimum and maximum allowable values. Generally, most camera lenses exhibit barrel distortion, which corresponds to a positive distortion coefficient in the range 0.0 (no distortion) to 0.3 (fairly high distortion), with a typical value between 0.05 and 0.15.
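To illustrate what a distortion coefficient of this size does, the Python sketch below applies a simple one-parameter radial model to normalised image coordinates. The model is an assumption made for illustration: PFTrack's actual distortion model is not specified here, and the function name is hypothetical. In this model a positive coefficient moves points away from the image centre when undistorting, which is the correction needed for barrel distortion.

```python
def undistort_point(x: float, y: float, k: float) -> tuple[float, float]:
    """Undistort a point with an assumed one-parameter radial model.

    Coordinates are normalised so that (0, 0) is the image centre.
    """
    r_squared = x * x + y * y
    scale = 1.0 + k * r_squared  # grows with distance from the centre
    return (x * scale, y * scale)


# Compare the typical range of coefficients at a point half-way to the edge.
for k in (0.0, 0.05, 0.15, 0.3):
    print(k, undistort_point(0.5, 0.0, k))
```

Points near the centre barely move, whereas points towards the image edges are shifted the most, which is why wall structures far from the centre are affected when distortion is ignored.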
Note that enabling automatic lens distortion correction can slow down the camera solver, especially when many different focal lengths are used by the cameras and all three parameters are being estimated.
After lens distortion has been removed, the camera's sensor size will be adjusted to account for the new image resolution.
The image below shows camera positions and point cloud reconstructed from a set of images with and without lens distortion correction. When lens distortion is ignored, errors in the position of wall structures can clearly be seen when viewed from above.
Estimates of camera pose are initialised using a pair of photos, Initial Photo A and Initial Photo B. These frames can be estimated automatically when the Set initial photos automatically option is enabled.
Note that the choice of initial photos can have a significant impact on the quality of the final 3D point cloud. It is important that there is a significant distance between the camera positions for photo A and photo B, and that there are enough points matched between the images.
PFTrack will attempt to find the best pair of photos to use, but in some situations, initial photos can be identified that are not able to provide an accurate solution (this is especially true when there are a large number of badly matched points present in those images).
Because of this, it is often important to check the quality of the initial solution first by enabling the Solve for initial solution only option, before attempting to solve the entire scene.
Once a good initial solution has been found, further solves can be accelerated by un-ticking the Set initial photos automatically option which will avoid the need to re-calculate initial photos for every solve.
Solved photos will be displayed in the Photo Navigation bar using green indicators.
Once the solve has finished, points may be manually edited as described below, and the solution refined to account for these changes by clicking the Refine button.
Generally, the 3D position of most points in the cloud can be estimated accurately. These are shown as green dots in the Cinema window when the Show solved points display option is enabled. Some points cannot be reconstructed accurately, however, and these so-called Outliers will be shown in red, along with a line connecting the 3D point position with the 2D point position that was detected in the image.
Note: the more outliers exist in the set of point matches, the harder it is to reconstruct the camera positions and 3D point cloud. This is particularly true for architectural scenes that contain repeated horizontal structure such as windows. In these cases it can become difficult to distinguish one window from another and therefore care must be taken to ensure that the majority of point matches are correct in situations such as this. This is especially the case when the position from which the images are taken lies in a plane parallel to these repeated structures (for example, taking all images of the windows from head height whilst walking along the ground) because incorrect matches between different windows can still be triangulated accurately even though they should not form part of the final point cloud.
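The line drawn between the 3D point position and the detected 2D position can be thought of as a reprojection error. The Python sketch below shows this idea for a simple pinhole camera located at the origin and looking along the positive Z axis. The camera model, the function names and the 2 pixel threshold are all assumptions made for illustration. The criterion PFTrack uses to classify outliers is not documented here.

```python
import math


def project(point_3d: tuple[float, float, float], focal_px: float,
            principal_point: tuple[float, float]) -> tuple[float, float]:
    """Project a 3D point (in camera coordinates) into the image."""
    x, y, z = point_3d
    if z <= 0.0:
        raise ValueError("point must be in front of the camera")
    cx, cy = principal_point
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def reprojection_error(point_3d, detected_2d, focal_px, principal_point) -> float:
    """Pixel distance between the projected 3D point and the detected 2D point."""
    u, v = project(point_3d, focal_px, principal_point)
    return math.hypot(u - detected_2d[0], v - detected_2d[1])


def is_outlier(point_3d, detected_2d, focal_px, principal_point,
               threshold_px: float = 2.0) -> bool:
    """Classify a point as an outlier using an assumed pixel threshold."""
    return reprojection_error(point_3d, detected_2d, focal_px, principal_point) > threshold_px


error = reprojection_error((1.0, 0.0, 10.0), (1063.0, 544.0), 1000.0, (960.0, 540.0))
print(f"reprojection error: {error:.1f} pixels")
```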
Once a 3D point cloud has been constructed, the number of points displayed (and passed down-stream for further processing) can be controlled using the Min Confidence % edit box. This specifies the minimum confidence in the 3D position that a point must have for it to be included in the point cloud. Increasing this value will remove low-confidence points from the cloud and provide a clearer (but more sparse) view of the reconstructed scene.
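The effect of the Min Confidence % value can be pictured as a simple filter over the point cloud. In the Python sketch below, each point is assumed to carry a confidence value between 0 and 100. The data layout and function name are hypothetical and are not how PFTrack stores or exposes its point cloud.

```python
def filter_by_confidence(points: list[dict], min_confidence: float) -> list[dict]:
    """Keep only points whose confidence (in percent) meets the minimum."""
    return [p for p in points if p["confidence"] >= min_confidence]


# A tiny made-up cloud: positions are arbitrary, confidences are percentages.
cloud = [
    {"position": (0.0, 1.0, 5.0), "confidence": 95.0},
    {"position": (2.0, 0.5, 7.0), "confidence": 40.0},
    {"position": (1.0, 3.0, 6.0), "confidence": 5.0},
]

print(len(filter_by_confidence(cloud, 0.0)))   # default: every point is kept
print(len(filter_by_confidence(cloud, 50.0)))  # clearer but sparser cloud
```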
Points can be selected in either the Cinema or Viewer windows by clicking the Marquee button in the Point Editing controls and then clicking and dragging with the left mouse button to draw a selection rectangle. Alternatively, if the Shift key is held, a selection lasso will be drawn. Holding the Ctrl key will allow multiple selections to be made. Selected points are coloured purple. Clicking the Delete button will delete all selected points. Single points can also be selected when the Marquee button is not enabled by clicking with the left mouse button.
When only one point is selected, the frames in which it was detected will be displayed in the scrub-bar as vertical yellow indicators. The K- and K+ buttons can be used to quickly move between these frames to check the position of the point. If required, the Centre View display option can be enabled to centre the Cinema view on the point in each frame.
Clicking the Remove button will allow point matches to be removed from the current frame. Moving the mouse over a point will highlight it in light-blue, and clicking the left mouse button will remove the point match from the current frame. This can be used to remove incorrectly matched points when the camera solver is failing to estimate camera pose correctly.
Alternatively, the Edit button can be clicked to allow point matches to be moved in the current frame. Again, hovering the mouse over a point will highlight it, and clicking and dragging with the left mouse button will move the detected point position.
To make it easier to see where a particular point exists in two frames at the same time, the Cinema view can be split by clicking the Dual View button. This will take the current frame and display it in the right-hand half of the Cinema window. The frame displayed in the left-hand half of the window can then be adjusted as usual using the frame controls (or cursor keys).
When Dual View is enabled (and, if the point cloud has already been solved, the Show solved points display option is disabled), the Connect button can be enabled, allowing existing points to be matched between the two frames, or entirely new points to be created and matched. Hovering the mouse over an existing point will highlight it in blue. Clicking the left mouse button will then create a connecting line, allowing a point in the other frame to be matched by clicking again with the left mouse button. Alternatively, if the left mouse button is clicked without hovering over an existing point, a new point will be created. These features can be used to edit or create point matches in areas where none could be found automatically, and help generate a better quality 3D point cloud.
The camera controls contain information about the camera used to capture all the images in the clip.
Focal length: The camera focal length for the current image.
Focal range: When photos with different focal lengths are being used, this displays the minimum and maximum focal length over the dataset.
Focal adjust: This value can be used to specify a small adjustment that is allowed for each frame. This can often improve the accuracy of the solution when focal lengths reported by EXIF header data are not exact.
Field of view: The horizontal and vertical field of view for the current photo, measured in degrees.
Near/far planes: The near and far clipping plane for the camera.
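The field of view listed in the camera controls is related to the focal length and sensor size by the standard pinhole relationship, FOV = 2 * atan(sensor size / (2 * focal length)). The Python sketch below evaluates it for an assumed 36mm x 24mm sensor and a 35mm focal length. The sensor dimensions and function name are illustrative assumptions, not values read from PFTrack.

```python
import math


def field_of_view_degrees(focal_mm: float, sensor_size_mm: float) -> float:
    """Field of view along one sensor dimension for a pinhole camera, in degrees."""
    return math.degrees(2.0 * math.atan(sensor_size_mm / (2.0 * focal_mm)))


# Assumed full-frame sensor (36mm x 24mm) with a 35mm lens.
horizontal = field_of_view_degrees(35.0, 36.0)
vertical = field_of_view_degrees(35.0, 24.0)
print(f"horizontal: {horizontal:.2f} degrees, vertical: {vertical:.2f} degrees")
```

Because the field of view depends on the focal length, a small error in the EXIF focal length also changes the field of view slightly, which is one reason the Focal adjust control can improve a solve.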
When the camera is set to use automatic lens distortion correction in the Photo Input node, these controls can be used to indicate roughly how much distortion is present in the camera.
Range: This menu can be used to define the distortion range: Minimal, Moderate, Significant or Custom, where Minimal corresponds to a very small amount of distortion, and Significant corresponds to a wide-angle lens.
Lower and upper bounds on the distortion coefficient are provided on the right, which can be edited manually when in Custom mode.
Estimate: Enabling this option means the camera solver will attempt to estimate a suitable lens distortion coefficient during the camera solve.
Channels: Which of the red, green or blue channels will be used for point matching, and mask controls to adjust how masks are used.
Pick Threshold %: The detection threshold that will be used to pick points from the image. Decreasing this value will cause more points to be picked. The default value is 20%.
P: Display a preview of points picked in the current image in the Cinema window.
Matching mode: The way points are matched between photos. Options are Exhaustive, where points in one photo will be matched against points in all other photos; Ordered, where points in one photo will be matched only against those in the previous and next photos; and Ordered loop, which is similar to Ordered but also means the last photo will be matched to the first. Finally, From Metadata will use EXIF position and orientation information to accelerate the matching process.
Skip Photo: Skip (or un-skip) the current image when detecting and matching points.
Edit ROI: Allow the region of interest (ROI) for point detection to be adjusted in the Cinema window. Click and drag with the left mouse button to adjust edges of the ROI.
Reset: Reset the point detection and matching parameters to their default values.
Auto Match: Start the point detection and matching process. If the Shift key is held whilst clicking this button, the process will be performed in the background.
New Region: Create a new matching region.
To draw the region in the current image, make sure it is selected in the regions list and then click with the left mouse button in the Cinema window to place vertices around the area of the image you wish to constrain. Drawing can be aborted at any time by pressing the Escape key, and the region must be closed by clicking again on the first vertex. Once a region has been drawn in one image, you must also draw it in all other images that view the same part of the scene. To remove a region from the current image, double click with the left mouse button in the Cinema window.
Delete Region: Delete the currently selected matching region.
Dual View: Split the Cinema window down the middle, keeping the current image on the right-hand side. Navigating to another image will update the left-hand side of the split, allowing points to be manually matched between two different images.
Marquee: Allow multiple points to be selected in either the Cinema or Viewer windows. Selected points are shown in purple. Clicking and dragging with the left mouse button will draw a rectangular region for selection, and holding the Shift key will allow a lasso to be drawn. Holding the Ctrl key whilst drawing will ensure that previous selections are kept. Note that because the left mouse button is used to select from the Viewer windows, the Alt/Option key must be held to rotate the camera.
Delete: Delete all selected points.
Remove: With this button enabled, clicking on a point in the Cinema window with the left mouse button will remove that point from the current image.
Edit: With this button enabled, clicking and dragging a point in the Cinema window with the left mouse button will allow the point's position in the current image to be edited.
Connect: With this button enabled, new points can be created and matched, or matches between existing points can be created, by clicking with the left mouse button. This button is only available when the Cinema window is split and (if the point cloud has already been solved) the Show solved points option is disabled.
K- and K+: When a single point is selected, navigate to the previous or next keyframe where the point was matched.
The two labels at the top indicate the two initial photos used by the camera solver (Initial Photo A and Initial Photo B). Clicking the Set button will set either to the current photo. These buttons are only available when the Set initial photos automatically option is disabled.
Set initial photos automatically: When this option is enabled, the two initial photos used to construct the point cloud will be estimated automatically. Note that this will increase the time it takes to reconstruct the scene.
Solve for initial solution only: When this option is enabled, the reconstruction will stop once the camera position for the initial frames has been estimated. This can be used to check that the initial frames provide an accurate initial point cloud before the other frames are added to the scene.
Solve All: Start the solver to reconstruct the camera positions and the point cloud. Holding the Shift key whilst clicking this button will run the solver in the background.
Refine: Refine the current estimates of camera position and 3D point cloud. Holding the Shift key whilst clicking this button will perform a longer refinement.
Solve Photo: Solve for the current photo only, using the existing point cloud positions, and then solve for any additional points that are visible in the current frame.
Un-Solve Photo: Remove the current photo from the reconstruction and remove points from the point cloud that can no longer be positioned accurately once the current frame is removed.
Remove Outliers: Remove all point matches that are classed as outliers (i.e. those shown in red when the Show solved points option is enabled).
Min Confidence %: Specify the minimum confidence in the reconstructed 3D position that a point must have before it is included in the point cloud and passed down-stream to other nodes. The default value is 0.0% which means all points will be displayed and passed down-stream. Increasing this value will mean more points are removed from the solution because their 3D position cannot be estimated confidently.
Show ground: When this option is enabled, the ground plane will be displayed in the Viewer windows.
Show all cameras: When this option is enabled, all camera frustums will be displayed in the Viewer windows.
Show region boundaries: When this option is enabled, region matching boundaries will be displayed in the Cinema window.
Show solved points: When this option is enabled, only solved 3D point positions will be displayed in the Cinema window.
Show all cloud points: When this option is enabled, all cloud points will be displayed in the Cinema and Viewer windows. When this option is disabled, only those points that have been matched in the current frame will be displayed.
Show cloud colours: When this option is enabled, points displayed in the Viewer windows will be coloured using the pixel colour from the corresponding location in the source images. When disabled, the points will be coloured green, red or white according to their inlier status.
Show matches only: When enabled, only those points that have been matched between the left and right-hand images in the split window will be displayed. This option is only available when the Show solved points option is switched off and the Cinema window is split.
Show all match lines: When enabled, blue match lines will be drawn between each point that has been matched in the left and right-hand images of the split window. When this option is disabled and the Remove, Edit or Connect buttons are active, match lines will only be drawn for the point over which the mouse cursor is hovering. This option is only available when the Show solved points option is switched off and the Cinema window is split.
Show user matches only: When enabled, only those points that have been matched manually using the Connect button will be displayed. This option is only available when the Show solved points option is switched off and the Cinema window is split.
Show undistort grid: When this option is enabled, the distortion grid will be displayed as an overlay in the Cinema window (this option is only available when Auto-Undistort is enabled).
Centre View: When enabled, the Cinema window will be panned so the selected point is at the centre.
Keyboard shortcuts can be customised in the Preferences.
Show All Cameras
Show Region Boundaries
Show Solved Points
Show All Points
Show Cloud Colours
Show Matches Only
Show All Match Lines
Show User Matches Only