PFTrack Documentation Node Reference  

Photo Cloud


The Photo Cloud user interface

The Photo Cloud node can be used to generate a dense point cloud for a scene that has been surveyed using the Photo Survey node.

In addition to the Photo Survey, cameras from other sources can also be attached to the secondary inputs of the node. For example, the Match Camera node could be used to track a moving camera into the surveyed scene, and frames from that movie clip could then be used to help generate the dense point cloud.

Note that when attaching additional cameras, they must be placed within the same coordinate system as the Photo Survey connected to the first input.

The Photo Cloud node will take advantage of OpenCL if a suitable NVIDIA or AMD graphics card is available. OpenCL support can be checked by looking in the application log, displayed by clicking the log button at the top-right of the main window. A graphics card with a minimum of 2 GB of RAM is required (4 GB recommended).

For large scenes consisting of many photos, at least 16 GB of system RAM should be available when generating depth maps and meshes at high resolution. Processing will take advantage of multiple CPU cores whenever possible, in addition to OpenCL GPU acceleration.


The first input of the Photo Cloud node must be attached to the output of the Photo Survey node. Depth maps will be estimated for each photo, and then converted into a single dense point cloud.

To produce an accurate result, it is important to ensure that many photos are captured for each part of the scene. Generally, try to ensure that each part of the scene is visible in at least four different photos captured from similar nearby camera positions. If the separation between camera positions is too large, it may not be possible to identify enough corresponding points to construct an accurate depth map.
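The coverage rule above can be illustrated with a short sketch. This is a hypothetical example (the data layout and function name are not part of PFTrack) that flags sparse points seen in fewer than four photos:

```python
# Hypothetical sketch (not PFTrack API): flag sparse points that are not
# visible in enough photos, following the "at least four photos" guideline.
MIN_VIEWS = 4

def under_covered(visibility, min_views=MIN_VIEWS):
    """`visibility` maps a point id to the set of photos it appears in."""
    return [pid for pid, photos in visibility.items() if len(photos) < min_views]

vis = {0: {"a", "b", "c", "d"}, 1: {"a", "b", "c", "d", "e"}, 2: {"a", "b"}}
print(under_covered(vis))  # [2]
```

Point 2 is matched in only two photos, so it is flagged as under-covered; in practice, areas like this are where depth map construction is most likely to fail.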

If in doubt, it is better to capture more images than necessary than to capture too few.

The following image illustrates a number of different camera positions that were captured whilst walking around the base of a statue. On the left, camera images have been taken at sufficient density to ensure each part of the model is visible in many frames, whereas on the right the cameras are too widely separated to ensure good coverage of the scene.

Examples of good and bad camera positioning

The general workflow for building a dense point cloud is as follows:

1. Specify the scene bounding box to enclose the area of interest in the scene

2. Generate a depth map for each frame

3. Delete depth maps that are too noisy or don't contain enough accurate points, and/or adjust the Confidence % threshold to ensure the overall point set does not contain too much noise

4. Convert the depth maps into a single dense point cloud

Setting up the scene

In order to prepare for depth map construction, the scene bounding box should be adjusted to ensure it encloses the part of the scene that is of interest.

By default, the bounding box is initialised to enclose most of the input points, although depending on the initial configuration of points, it may require further adjustment.

Different views of the scene bounding box

Note that for the best results, the bounding box should be roughly aligned with the major axes of the scene. For this reason, it may be necessary to use an Orient Camera node before the Photo Cloud node to adjust the overall scene orientation, ensuring the vertical Y axis is pointing upwards, and either the X or Z axis is aligned in the major horizontal direction.

When generating a depth map for each frame, the near and far camera plane distances will be calculated to ensure that everything inside the scene bound is contained within these planes. Generally, this is sufficient to calculate depth maps covering the entire scene.
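As a rough illustration, the per-frame near/far planes can be thought of as the closest and farthest distances from the camera to the corners of the scene bound. The sketch below is a simplification under stated assumptions (it uses straight-line distances to the eight box corners rather than distances along the view axis, and none of the names are PFTrack API):

```python
import itertools
import math

def plane_distances(cam_pos, bound_min, bound_max):
    # The 8 corners of the axis-aligned scene bounding box.
    corners = itertools.product(*zip(bound_min, bound_max))
    dists = [math.dist(cam_pos, c) for c in corners]
    # Keep the near plane strictly in front of the camera.
    return max(min(dists), 1e-3), max(dists)

# Camera 10 units from the centre of a 2x2x2 bound.
near, far = plane_distances((0.0, 0.0, 10.0), (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0))
print(round(near, 2), round(far, 2))  # 9.11 11.09
```

This also shows why a large scene bound can hurt a close-up shot: the far plane is driven by the farthest bound corner, not by the content actually visible in the photo.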

For large scenes the near and far camera planes may be beyond what is required for each depth map. In fact, if the camera far plane is too far away with respect to the details in the photo, depth map construction can fail, or may produce inaccurate results.

Examples of good and bad camera near/far planes

In the screenshot above, one camera is viewing a close-up of a building wall. This part of the scene is much closer to the camera than the entire scene bound, and the default position of the far plane means that a poor quality depth map is generated (top). By reducing the far-plane distance so it better matches the content of the image, a much more complete depth map is generated (bottom).

Creating depth maps

Once the scene bound has been specified, depth maps can be created for each camera frame. As depth maps are generated, they are saved individually to disk in a binary file format and displayed in the Cinema and Viewer windows.

The resolution of the depth map can be specified as either low, medium or high resolution. Low resolution depth maps require less RAM to generate and can be calculated quicker than medium or high resolution depth maps. They also require less storage space on disk, but this comes at the cost of reduced accuracy.

As the depth map for a specific camera is generated, it uses information from several nearby cameras to help calculate a depth value for each pixel (this is the reason why it is important to ensure that each part of the scene is visible in several different nearby images, as described above). If a pixel is not visible in enough images, or cannot be located accurately enough due to significant perspective or illumination differences between the images, it will be ignored and a gap will appear in the depth map.

As depth maps are generated, the icons in the Photo Navigation bar will be updated to indicate success or failure as an indicator at the top-left of each photo thumbnail. These indicators will be coloured red for failure, and green for success.

Generation of a depth map for a particular photo can fail for several reasons, such as the lack of enough suitable nearby camera positions, or large perspective/illumination differences preventing enough pixels from being accurately matched between them.

Depth map calculations may also fail if the near and far camera planes are placed far from the area of interest. If this happens, those depth maps can be deleted, the near/far planes adjusted, and the Missing photos only option used to calculate new depth maps only for frames that do not already have one.

Using masks

If necessary, masks can be used to restrict the parts of each frame that are used to generate depth maps. This can often help in situations where, for example, the Photo Cloud node is being used to create a model of a set shot against green-screen, or when creating a model of a building exterior that contains large amounts of blue sky.

In these cases the parts of the image that should be ignored can be masked using a variety of tools. In the case of a green screen or blue sky, a Keyer mask can often provide a quick means of masking out the background and generating a more accurate set of depth maps.

Creating the dense point cloud

After depth maps have been generated, the final stage of the process is to convert them into a dense point cloud.

When points from each depth map are used to construct the dense point cloud, those with a confidence below the Confidence % level are ignored.

Whilst adjusting the Confidence % parameter upwards or downwards, depth maps displayed in the Cinema and Viewer windows will be trimmed accordingly to illustrate which pixels are to be included.
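The trimming behaviour can be sketched as a simple filter. This is a hedged illustration using a made-up pixel layout, not PFTrack's actual depth map format:

```python
# Hedged sketch of the Confidence % trim: each depth-map pixel carries a
# confidence percentage, and pixels below the threshold are dropped.
def trim(pixels, min_confidence=25.0):
    return [p for p in pixels if p["confidence"] >= min_confidence]

pixels = [
    {"depth": 2.5, "confidence": 90.0},
    {"depth": 2.6, "confidence": 10.0},  # below the default 25% threshold
    {"depth": 2.4, "confidence": 40.0},
]
print(len(trim(pixels)))  # 2
```

Raising the threshold keeps only the most reliable pixels at the cost of a sparser cloud; lowering it keeps more points but admits more noise.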

Low and medium resolution dense point clouds can be generated quicker than high resolution clouds, and require less RAM, but may not be able to represent all the details present in the depth maps.

The amount of RAM required to generate a high resolution cloud can increase significantly when many high resolution depth maps are being used. For this reason, it is recommended to free as much RAM as possible (for example, by clearing the RAM cache before processing) to avoid running out of memory.

As points from each depth map are used to construct the dense point cloud, they are checked against points from other nearby cameras. When depth map pixels from multiple frames are found to agree, they are combined into a single point in the dense point cloud. The Depth Similarity % and Colour Similarity % parameters affect this behaviour by adjusting how similar points must be (in terms of their distance from the camera and pixel colour) before they are merged.
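The merge test described above can be sketched as follows. This is an assumed interpretation of the two similarity parameters (depth agreement as a percentage of the depth itself, colour agreement as a percentage of the 0-255 channel range), not PFTrack's exact internal logic:

```python
def should_merge(d1, d2, c1, c2, depth_sim=1.0, colour_sim=20.0):
    # Depths must agree to within depth_sim % of the depth, and each
    # colour channel (0-255) to within colour_sim % of the full range.
    depth_ok = abs(d1 - d2) <= (depth_sim / 100.0) * max(d1, d2)
    colour_ok = all(abs(a - b) <= (colour_sim / 100.0) * 255.0
                    for a, b in zip(c1, c2))
    return depth_ok and colour_ok

# Depths differ by 0.4% and colours are close, so the points merge:
print(should_merge(5.00, 5.02, (200, 180, 160), (205, 182, 158)))  # True
```

Tightening either threshold keeps more points distinct (a denser but noisier cloud); loosening them merges more aggressively.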

Once the dense point cloud has been generated, it will be displayed in the Cinema and Viewer windows.



Controls
Bounding Box Edit: Allow the scene bounding box to be adjusted in the Viewer windows by clicking and dragging with left mouse button (hold the Ctrl key to adjust the furthest face of the bound instead of the nearest face).

Min. Near Plane: The minimum near-plane distance for the current frame or for all frames (default value is 0.5).

Max. Far Plane: The maximum far-plane distance for the current frame or for all frames (default value is 100.0).

Per-Frame: When enabled, the near and far plane values can be adjusted individually for each frame.

Reset: Remove all stored near and far plane values (or only the value for the current frame if Per-Frame is enabled).

Show Bound button: Toggle display of the scene bounding box in the Cinema and Viewer windows.

Show Input Cloud button: Toggle display of the sparse input point cloud in the Cinema and Viewer windows.

Show Ground Plane: Toggle display of the ground plane in the Cinema and Viewer windows.

Show Cameras: Toggle display of the cameras in the Viewer windows.

Depth Maps

Resolution: The resolution at which depth maps will be created. High resolution corresponds to the original image (up to a maximum size of 4K), whilst Medium and Low correspond to half and quarter proxy resolutions respectively.

Photo Spacing: The temporal sampling rate for generating depth maps. By default, this is set to 1 which indicates that a depth map should be generated for every photo. Increasing this to a value of 5 (for example) will generate a depth map for every 5th photo. This can be used to accelerate processing when very large numbers of photos have been captured.
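The sampling behaviour amounts to taking every Nth photo. A minimal sketch (stand-in photo names, not PFTrack data):

```python
# Minimal sketch of Photo Spacing: with a spacing of 5, a depth map is
# generated for every 5th photo only.
photos = [f"photo_{i:03d}" for i in range(12)]  # stand-in for 12 photos
spacing = 5
selected = photos[::spacing]
print(selected)  # ['photo_000', 'photo_005', 'photo_010']
```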

Verification: The amount of point verification that occurs when generating each depth map. Strong indicates that a point must be visible in several input images before it is created, whilst Weak (the default) will accept points that are visible in only one pair of frames, which can increase the number of points in each depth map at the cost of introducing additional noise or errors.

Current frame only: When enabled, clicking the Create button will create a depth map for the current frame only.

Missing photos only: When enabled, clicking the Create button will create depth maps for all frames that do not already have one. When both this option and the Current frame only option are disabled, depth maps will be created for all frames.

Create: Start the depth map creation process.

Delete: Delete the current depth map and remove the files from disk (Shift-click to delete all depth maps).

Show All Depth Maps: Toggle display of all depth maps in the Viewer windows.

Show Depth Maps: Toggle display of the depth map for the current frame in the Cinema and Viewer windows.

Show Colour: Toggle point colouring of the depth map for the current frame in the Cinema and Viewer windows.

Show Normals: Toggle shading of the depth map for the current frame in the Cinema and Viewer windows.

Show Confidence: Toggle display of point confidence of the depth map for the current frame, where red indicates low confidence and green high confidence.

Dense Point Cloud

Resolution: The resolution at which the cloud will be generated. Note that generating high resolution clouds may require significant amounts of system RAM and processing time.

Confidence %: The confidence threshold that a depth map pixel must have in order for it to be used when constructing the cloud (default value is 25%). As this value is adjusted, the depth map displayed in the Cinema and Viewer windows will be trimmed accordingly to show which points pass the confidence threshold.

Depth Similarity %: The amount of depth similarity that points from multiple photos must have before they are combined together in the dense point cloud (measured as a percentage of the distance from a point to the camera; default value is 1%).

Colour Similarity %: The colour similarity threshold used to distinguish different points when building the dense point cloud (default value is 20%). Points with colours above this threshold are considered distinct and are not combined together.

Create: Start the dense point cloud creation process.

Clear: Clicking this button will clear the dense point cloud and delete the corresponding data file from disk.

Show Dense Cloud: Toggle display of the dense point cloud.

Show Colour: Toggle point cloud colouring in the Viewer windows.

Show Shading: Toggle shading of the point cloud using lighting and point normals.

Default Keyboard Shortcuts

Edit Bound
Show Bound
Show Input Cloud
Show Ground
Show Cameras
Current Frame Only
Create Depth
Clear Frame
Show All Depth
Show Depth
Show Depth Colour
Show Depth Shading
Show Depth Confidence
Create Dense Point Cloud
Show Dense Point Cloud
Show Point Cloud Colour
Show Point Cloud Shading