Wiki

This is a wiki for Underwater Photogrammetric 3D Reconstruction Using Metashape. It is a resource containing best practices and problem-solving strategies for photogrammetric reconstruction with Metashape (version 1.7.4) for the underwater AUV/ROV use case. It is not a replacement for the user manual, as it is neither exhaustive nor does it aim to be.

1. Workflow Automation

Metashape can use batch processes to automate most of its calculation steps. Without batch processes, users must perform all calculations individually. Batch processes are very useful for datasets that require no manual intervention. Another advantage of a batch process is the standardization of settings: as these are fixed in the batch process files, there is less room for errors and consistency between projects increases.

However, this document focuses on manual processing, as users will likely only need it when something does not work as intended.

Figure 1.1: Metashape standard workflow. Left path: 3D Model. Right path: Photomosaic

2. Preprocessing

Sometimes it is beneficial to get rid of illumination artifacts in the images before starting the work in Metashape.

Tomato Tool

The Tomato Tool Image Normalization Cuda is a GEOMAR solution that uses a running average filter to equalize the illumination across images. The tool calculates the illumination cone, i.e. the average illumination per pixel based on neighboring images in time. By subtracting this average illumination, the images are normalized. This is especially useful for images taken from a constant flying height.
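In case the Tomato Tool is not at hand, the basic idea can be reproduced with a short script. The following is only a minimal sketch of the running-average approach, assuming NumPy and OpenCV are available and that the image paths are sorted in recording order; the actual Tomato Tool implementation may differ in detail:

import cv2
import numpy as np

def normalize_illumination(paths, window=15):
    # Subtract a per-pixel running-average illumination estimate from each frame.
    images = [cv2.imread(p).astype(np.float32) for p in sorted(paths)]
    half = window // 2
    for i, img in enumerate(images):
        neighbors = images[max(0, i - half):i + half + 1]
        illumination = np.mean(neighbors, axis=0)              # average illumination of neighboring frames
        corrected = img - illumination + illumination.mean()   # remove local trend, keep overall brightness
        cv2.imwrite("normalized_%04d.png" % i, np.clip(corrected, 0, 255).astype(np.uint8))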

Darktable

Darktable is a powerful open source photography workflow application. It works best with RAW images, but JPEG images can also be enhanced. The basic steps are the following: import all images with ‘add to library...’ in the lighttable mode (see Figure 2.3). Afterwards, switch to the darkroom mode and make changes to an image that is representative of the dive. The most relevant modules are ‘shadows and highlights’ and ‘color correction’. With ‘shadows and highlights’, dark parts can be made brighter and very bright parts darker. The ‘color correction’ module can be used to shift the overall color of the image towards a more realistic color, e.g. if the image is far too blue. If you are satisfied with the result, compress the history stack and create a style from the current history stack (see Figure 2.2). To apply the style to all images, switch to the lighttable mode, select all images (Ctrl-A) and click ‘apply’ in the styles window (Figure 2.1). To export the images, choose a folder and the export settings and click export.

For a detailed description of the software visit darktable.

Figure 2.1: Darktable batch processing and exporting.

Figure 2.2: Darktable modules and styles.

Figure 2.3: Darktable working modes and import.

Video to Images

Metashape works on images. In order to extract individual images from a video file, the Tomato Tool FrameExtract can be used.

Alternatively, one can use ffmpeg directly.

Example command:

ffmpeg -i videofile.mkv %04d_frame.png

This command will extract all frames from a video file. The naming scheme for the resulting image files uses a zero-padded four-digit frame number followed by “_frame.png”. The -r option can be used to reduce the frame rate, e.g. to 20 fps:

ffmpeg -i videofile.mkv -r 20.0 %04d_frame.png

The practical upper limit for the fps is determined by your camera settings during recording, which in turn are limited by the recording device; for most systems this can be significantly higher than 20 fps. The minimum value is determined by the overlap between images, which should not fall below 50%.

In practice, there are diminishing returns when increasing the number of images without increasing the number of camera angles and positions, and not only because computation time increases. Increasing the fps also increases the number of matches on marine snow. Even if the same particles are correctly matched between images, they are not stationary and therefore undesirable.

Noise Reduction

Too much noise in the images, e.g. caused by high camera gain settings, can prevent Metashape from properly finding features, even if the noise is hardly visible to the human eye.

A simple way to mitigate this effect is ImageMagick's -enhance option. For bulk editing use, e.g.:

mogrify -format jpg -path "/your/output_dir" -enhance "/your/input_dir/*.jpg"

3. Camera Settings

(fisheye vs. frame and Camera Calibration in general)

In Metashape, the camera settings can be found under “Tools” -> “Camera Calibration”. If you already have a camera calibration (for the underwater case!), you can load the corresponding file here or enter the values manually. If you do not have a camera calibration, Metashape will find a good solution in most cases using the “adaptive camera model fitting” option during image alignment (see section 6, Settings). In both cases you will need to set the camera type here! Most likely it will be either the classic “frame” model or “fisheye” (see Figure 3.1). This option describes the camera lens system. For the underwater case, the lens system is extended by an additional flat or dome port of the camera housing, which is not modeled by Metashape.

It is possible to save camera calibrations to a settings file. If you have multiple datasets from the same camera, you can reuse your camera calibration. If, however, camera settings are changed or the housing is opened between dives, it might be necessary to redo the entire calibration or to use the saved calibration only as a starting point. The difference between runs can be significant, and relying on an incorrect calibration may lead to warping of your model.

Something else to consider here is that Metashape assumes the pinhole camera model for its internal calculations. The camera calibrations added and shown in Figure 3.1 describe this model. In underwater optics there is a further distinction between camera models for dome port and flat port cameras, due to the additional water-glass-air interface of the housing. If this interface is a well calibrated dome port (Adjustment and Calibration of Dome Port Camera Systems for Underwater Vision), the pinhole camera model still holds true; for flat ports, however, the underlying assumption of the pinhole camera does not. Metashape cannot model this additional interface. In practice, this means that different distances between camera and object might require different calibrations, each of which only approximates the true calibration.
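For scripted workflows, a saved underwater calibration can also be loaded and the camera type set via the Metashape Python API. This is a minimal sketch under the assumption that the calibration was exported from Metashape in XML format; the file name and sensor index are placeholders:

import Metashape

chunk = Metashape.app.document.chunk
sensor = chunk.sensors[0]                      # the calibration group of your cameras
sensor.type = Metashape.Sensor.Type.Fisheye    # or Metashape.Sensor.Type.Frame

calib = Metashape.Calibration()
calib.load("underwater_calibration.xml", format=Metashape.CalibrationFormatXML)
sensor.user_calib = calib                      # use the loaded values as the initial calibration
sensor.fixed = False                           # set True to keep the calibration fixed during optimization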

Figure 3.1: Camera Calibration dialogue box with Camera Types section expanded and highlighted.

4. Masking

Especially when recording images using an ROV, parts of the vehicle might always be visible in the images. Metashape does a great job at ignoring constant elements such as vehicle frames and even camera overlays. We did not find a single case where we had to manually mask specific areas. However, the option is available. If you find a need for this option, feel free to share your experience and extend this section.

5. Navigation Data or Reference Data

The metadata from the image exif-tag is automatically read by Metashape. If the navigation data is stored in this way, no additional action is necessary.

Separate Image-Navigation File

Metashape can load navigation data from a csv-file. The columns should include a filename corresponding to the image file names Metashape uses in your project. Additionally, columns for latitude, longitude and altitude (positive upward in relation to the geoid reference, which means that values in this column should be negative below the sea surface) should be included. Optional columns include attitude data (roll, pitch, yaw) and uncertainties for both position and attitude data. If your data is a classic survey (a downward looking camera for Metashape), your pitch should be close to zero.

Figure 5.1: Example navigation data in csv file format.

The navigation files can be loaded using the reference tab (see Figure 5.1). Simply click the “Import Reference” button to open the corresponding dialogue.
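The same import can be scripted via the Metashape Python API. The following is a minimal sketch; file name, delimiter and column order are assumptions that have to be adapted to your actual csv-file:

import Metashape

chunk = Metashape.app.document.chunk
chunk.importReference("navigation.csv",
                      format=Metashape.ReferenceFormatCSV,
                      delimiter=",",
                      columns="nxyz",   # n = file name, x/y/z = longitude/latitude/altitude; add "abc" for yaw/pitch/roll
                      skip_rows=1)      # skip the header line
chunk.updateTransform()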

Setting Values Manually

As shown in Figure 5.2, it is possible to manually edit the reference data for individual images or all of them simultaneously. This can come in handy to add estimated attitude data (e.g. 0° pitch). The accuracy values can be changed too.

Figure 5.2: By right-clicking within the Camera panel of the reference tab and selecting modify, the reference data of the images can be edited manually.

6. Align Photos

Metashape calculates feature points for each image (i.e. camera). It then finds correspondences between image pairs based on these feature points. If enough matches are found, the relative orientation and position between the camera centers is calculated. By incorporating the individual pairwise results into a global solution, the relative orientations and positions of all cameras are determined. If the original images are georeferenced, the result is automatically georeferenced, too.

This is the crucial step in the processing pipeline. If something goes wrong, the reason is likely a misalignment that occurred during this processing step. Therefore, the sparse feature point cloud generated by this step should be checked for issues (compare section Issues). A local solution for a specific area indirectly affects other parts. Accordingly, this process is an iterative one.

The Camera position optimization that happens as part of this step can be executed individually by running the “Optimize Cameras” procedure from the “Reference” tab.

Sometimes, finding these problems using the sparse point cloud is not sufficient. It is significantly easier to find misaligned parts using the dense point cloud or the digital elevation model (DEM) (see section 9. Creating Derivative Products). Calculating these is unfortunately time-consuming, and identifying a problem in these later stages means starting over from the Align Photos step (i.e. before producing the sparse point cloud). As far as we know, there is no way to update the data products (derivatives); instead, they have to be calculated anew.

Settings

Usually, the default settings are serviceable. “Accuracy” can be increased to obtain slightly better results, at the cost of longer computation time. Likewise, lower accuracy can be significantly faster. Medium settings are appropriate for most tasks. Arguably the most important setting is the type of camera (Frame vs. Fisheye), which has to be known (compare section 3. Camera Settings).

Figure 6.1 shows the available settings for the Align Photos processing step. While accuracy is the value that determines the quality vs. computation speed trade-off, some other parameters are important, too.

The “Reset current alignment” checkbox is irrelevant for the first execution. However, on subsequent runs this option deletes all previous matches between images. While the calculated key points on the individual images remain available, the whole process of matching images pairwise has to be repeated if this option is set. Use with caution.

Figure 6.1: The Align Photos dialogue box. These settings are used if the camera model is unknown and no navigation data is available to guide the matching process.

The “Adaptive camera model fitting” option essentially allows Metashape to find the “best” camera model for the given cameras. This is useful if you are uncertain about your camera calibration or lack one entirely. If you do have a good underwater camera calibration, do not use this option, as it will overwrite it.
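For completeness, the same step can be run from the Metashape Python API. The sketch below mirrors the dialogue in Figure 6.1; parameter names follow our reading of the 1.7 API, where the accuracy setting is expressed as a downscale factor (1 = High, 2 = Medium, 4 = Low) and the limits correspond to the key/tie point limits of the dialogue:

import Metashape

chunk = Metashape.app.document.chunk
chunk.matchPhotos(downscale=2,                   # 2 = Medium accuracy
                  generic_preselection=True,
                  reference_preselection=True,   # use navigation data to guide matching, if available
                  keypoint_limit=40000,
                  tiepoint_limit=4000)
chunk.alignCameras(adaptive_fitting=False,       # set True only if no good calibration exists
                   reset_alignment=False)        # True would discard all previous matches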

Issues

This is a non-exhaustive list of issues that can occur during the alignment phase.

Not All Images are Aligned

Figure 6.2: After camera alignment, not all images are aligned. They are marked as "NA" in the Workspace camera view. If navigation data is available, they will appear as misaligned blue dots in the Model view (line of points to the right of the main point cloud).

This is a common issue and easily identified. Check the Cameras list in the workspace tab. Select the chunk you are working on and expand the list of cameras. The “NA” indicator in this list means “Not Aligned”. If only individual cameras are not aligned, this is not an issue. However, if a whole area of the mosaic is unaligned, you might want to try to fix it manually (see section Align Images Locally). Sometimes there is literally no overlap between cameras. In this case, split the cameras into two or more parts by dividing them into separate chunks and treat them as separate surveys. This way you will produce more than one photogrammetric model, but that is preferable to one larger, yet incorrectly aligned solution.

Tilted Survey

Figure 6.3: The survey is relatively consistently matched; however, the whole thing is tilted towards the seafloor.

This issue prominently occurs when no navigation information is included. With no navigation, every orientation of the pattern is just as valid as the next one. There are two solutions for this problem. The first one is placing multiple markers on objects inside the images that have a known location and setting these markers' positions manually in the “Reference” tab. In most cases this information is not known and the second option is more feasible, which is adding navigation. Most of the time, simply adding the positions of cameras, even if they are just rough estimates based on USBL data, yields the desired outcome. Sometimes it is necessary to add attitude (orientation) data. If this does not exist, it is possible to add a pseudo navigation by right clicking in the “Reference” tab image list and modifying the value for pitch to set it to 0 (= downward looking camera). Metashape requires the other attitude and position fields to be set, too. Make sure that the checkboxes for the attitude data in the reference pane are actually checked!
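A quick way to add such a pseudo navigation to all cameras at once is a short script. This is only a sketch under the assumption that the reference rotation in the Metashape Python API is given as (yaw, pitch, roll) and that the position fields are already filled; the accuracy values are placeholders:

import Metashape

chunk = Metashape.app.document.chunk
for camera in chunk.cameras:
    camera.reference.rotation = Metashape.Vector([0, 0, 0])                # yaw, pitch, roll for a downward looking camera
    camera.reference.rotation_accuracy = Metashape.Vector([180, 10, 10])   # loose yaw, tighter pitch/roll (placeholder values)
    camera.reference.rotation_enabled = True                               # same as ticking the attitude checkbox in the Reference pane
chunk.updateTransform()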

Discontinued Survey Lines

Figure 6.4: Discontinuity within the survey in a lawn mowing pattern. The blue rectangles represent camera positions. If the platform takes photos continuously, the resulting cameras should form continuous lines. In this case the solution proposed by Metashape ends in two dead ends (red).

In a structured survey pattern, AUV lines are continuous. If a line of cameras suddenly breaks even though consecutive cameras are still aligned, it is a clear sign that the line is discontinued. This problem comes in pairs, as there ought to be a discontinued line coming from the opposite direction (see Figure 6.4). Even if there is no obvious mismatch in the point clouds, this is an error and needs to be fixed.

To fix this issue, simply select the last 2-3 images from each end and place markers according to section Placing Markers. After the next “Optimize Cameras” run, this issue should be resolved. Sometimes other areas have to be fixed first before a satisfying solution for a particular area is found.

In other cases, some images within a line are not aligned to either side. By incrementally setting markers, starting from one end of the aligned part and continuing onto the unaligned images, it is possible to add the unaligned images and bridge the gap between the two ends (see section Align Images Locally for details).

Mismatched Point Clouds

A common problem is the misalignment of point clouds. Sometimes this issue is easily identified by checking the sparse point cloud for “layers” (see Figure 6.5). These layers occur between neighboring lines if they are not matched by the algorithm, yet are still indirectly related, either through existing navigation or through matches further down the line.

Figure 6.5: Mismatched lines lead to layers within the point cloud (red rectangles). The arrow marks a point of interest that is visible in multiple point clouds and can potentially serve as a starting point to realign the lines.

To identify the issue using the sparse cloud, tilt the view of the survey so that the parallel lines lead into the screen. The smaller the vertical displacement, the harder it is to spot the issue.

At the latest, this issue becomes very prominent when looking at the DEM, as the discontinuity between the lines is directly visible there. It is, however, not very feasible to calculate the DEM solely for this purpose, as the issue needs to be fixed in the Align Photos phase. Accordingly, all calculations for the dense point cloud and DEM would have to be redone.

The issue can be resolved if the layers actually cover the same area (i.e., there is overlap between them). If at a given location there is no overlap, but further down the line there is, the region can be fixed indirectly by solving the issue there. If two lines have no overlap at all, it might be prudent to divide the areas into separate chunks.

In case there is overlap, you can proceed by setting ~7 markers (see Placing Markers) in at least three images on one side and another ~7 markers on images of the other side (side meaning the collection of images responsible for a specific point cloud layer; for most surveys this will be equivalent to two neighboring survey lines). Make sure to place the markers on each side separately. Then find the corresponding points for all ~14 markers on both sides and run the camera optimization.

Placing Markers

A solution for several issues (e.g. unaligned images, discontinued survey lines or mismatched point clouds) involves setting markers manually to help Metashape align images. They are basically the manual intervention tool to put some sense into Metashape's automated solutions. The goal is to place markers on corresponding pixels in multiple images and effectively force Metashape to recognize them as identical. If done correctly, this is a powerful tool to fix models; if markers do not mark the same spot on two or more images, however, the resulting solution might mess up your model instead of fixing it!

Therefore, it is important to make sure markers correspond to the same locations. Double check if need be. Do this by selecting a marker in the model view, right-clicking and filtering cameras by markers. You will get a list, and the images at the top are the ones you set manually. Make sure ALL of them point to the same target.

This is how you set “relative” markers, the only markers relevant in this document:

  1. Go to the Reference panel

  2. Right click in the marker section and “add marker”

  3. Open the first image and right click->Place Marker->Your Marker

  4. Open the second image and right click->Place Marker->Your Marker

Do NOT create a new marker directly on an image! This is tempting but will create a different kind of marker that we do not want to use for this. It takes the current estimated absolute position and fixes it, even if it is incorrect.

You can check how a marker was set in the reference panel. If X, Y and Z values are present, it is an absolute marker (see Figure 6.6). You do not want that anywhere on your images. Delete such markers immediately! You should only use relative markers.

The effect of these markers is not immediate. Only after running the “Align Images” step will the markers be used to optimize the camera positions. Alternatively, the “Optimize Cameras” button in the “Reference” tab will recalculate the camera positions and orientations without the need to match images again.
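The distinction between the two marker kinds is also visible in the Python API: a relative marker has projections in two or more images but no reference location. The following is a minimal sketch; the pixel coordinates and camera indices are placeholders only:

import Metashape

chunk = Metashape.app.document.chunk
marker = chunk.addMarker()                         # same as “add marker” in the Reference panel

cam_a, cam_b = chunk.cameras[0], chunk.cameras[1]  # placeholders: the two images showing the same spot
marker.projections[cam_a] = Metashape.Marker.Projection(Metashape.Vector([1024, 768]), True)
marker.projections[cam_b] = Metashape.Marker.Projection(Metashape.Vector([980, 702]), True)
# marker.reference.location stays unset, so this remains a relative marker;
# setting it would turn the marker into an absolute one (see Figure 6.6).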

Figure 6.6: Two different kinds of markers. Relative marker on top (point 1), absolute marker on the bottom (point 2). The latter is not desirable.

Align Images Locally

While the “Align Images” step aligns all images, sometimes it is beneficial to only work on a subset of images at a time. The “Align images locally” method allows just that.

Select a subset of images in the workspace view or in the photos list. Right click -> “Align Selected Cameras”. In contrast to the workflow variant, this will align only the selection of images instead of the whole chunk and is significantly faster. This is especially useful when integrating new images to an already existing base. In combination with adding markers, this is the best way of adding previously unaligned images.
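The local variant is also available in the Python API via the cameras argument of alignCameras. A minimal sketch, assuming the subset of interest has been selected in the GUI beforehand:

import Metashape

chunk = Metashape.app.document.chunk
subset = [cam for cam in chunk.cameras if cam.selected]    # the cameras currently selected in the Photos/Workspace pane
chunk.alignCameras(cameras=subset, reset_alignment=False)  # aligns only the subset against the existing solution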

7. Dense Point Cloud Calculation

The second step is the dense point cloud calculation. Using the Workflow menu, simply run it with the default settings. We strongly recommend enabling the “Calculate point confidence” option (see Figure 7.1), which is disabled by default. Later on, this will make it easier to discriminate noise.
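In the 1.7 Python API this step consists of building depth maps first and then the dense cloud. A minimal sketch with medium-quality settings; the parameter values are assumptions and only roughly correspond to the dialogue defaults:

import Metashape

chunk = Metashape.app.document.chunk
chunk.buildDepthMaps(downscale=4,                          # 4 = Medium quality
                     filter_mode=Metashape.MildFiltering)
chunk.buildDenseCloud(point_confidence=True)               # equivalent to “Calculate point confidence”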

Figure 7.1: Dense Point Cloud Calculation with "Calculate point confidence" enabled.

8. Filtering and Data Cleaning

The dense point cloud serves as the basis for either 3D models or the DEM. Naturally, we want to get rid of noise here, so that the resulting models can be as accurate as possible. There are two ways of doing so. The first one is to simply select unwanted points in the model view while the dense point cloud is active and delete them by pressing the delete key (see Figure 8.1). This method is simple to execute but can be very time consuming.

Figure 8.1: Deleting points from the dense point cloud is a matter of simply selecting them in the model view and pressing the delete key.

Alternatively, we can automatically filter the dense point cloud using the previously calculated confidence (see section 7. Dense Point Cloud Calculation). In order to view the point confidence instead of the point color, change the value of the 3x3 dot box in the toolbar at the top from Dense Cloud to Dense Cloud Confidence (see Figure 8.2). Low confidence means that a point is supported by fewer cameras and is thus considered less reliable. Under Tools->Dense Cloud->Filter by Confidence you can select a confidence range. By applying a range from 0-1, only points with low confidence are shown. Select and delete all of them to get rid of these points for good. Afterwards, reset the filter to the full range of 0-255 (all points). By increasing the range, e.g. to 0-3, you will delete more points. Determining the confidence under which to delete points is a trade-off between point cloud density and point confidence. At different sections of the model, different levels of confidence may be appropriate.
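The same filtering can be scripted. The sketch below follows the commonly used approach of temporarily showing only low-confidence points and deleting whatever is visible; the removePoints call with the full list of point classes is an assumption based on published example scripts:

import Metashape

chunk = Metashape.app.document.chunk
dense = chunk.dense_cloud
dense.setConfidenceFilter(0, 1)         # show only points with confidence 0-1
dense.removePoints(list(range(128)))    # delete all currently visible points, regardless of class
dense.resetFilters()                    # back to the full 0-255 confidence range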

Figure 8.2: Dense Point Cloud confidence view.

9. Creating Derivative Products

The dense point cloud is the basis for different products you might want to create, the most common ones for our use cases being the orthomosaic and the digital elevation model (DEM). This appears to only be an option if the model is georeferenced, which is usually achieved automatically by adding navigation to the cameras, even if the navigation data is relatively imprecise. If no navigation exists, one can set the second kind of marker (absolute marker) on an image and set the coordinates manually. If physical markers or similar are absent, however, this is more akin to a workaround and we do not recommend it, as setting absolute markers technically requires knowing their absolute positions.

In order to create the data products, run “Build DEM” and subsequently “Build Orthomosaic” from the workflow menu (compare Figure 9.1 and Figure 9.2). The settings in those figures are sensible for most cases; however, you might want to adjust a few of them.

Figure 9.1: Build DEM dialogue box.

Figure 9.2: Build Orthomosaic dialogue box.
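Both build steps can also be run from the Python API. A minimal sketch, using the dense cloud as source for the DEM and the DEM as surface for the orthomosaic, matching the dialogues in Figure 9.1 and Figure 9.2:

import Metashape

chunk = Metashape.app.document.chunk
chunk.buildDem(source_data=Metashape.DenseCloudData,
               interpolation=Metashape.EnabledInterpolation)
chunk.buildOrthomosaic(surface_data=Metashape.ElevationData,
                       blending_mode=Metashape.MosaicBlending)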

Afterwards, right click the corresponding data products in the workspace view and click “Export DEM” -> “Export TIFF/BIL/XYZ” or “Export Orthomosaic” -> “Export JPEG/TIFF/PNG”, respectively. (Hint: Orthomosaic, not Orthophotos – Orthophotos are the original images projected onto your modelled surface, i.e. not a single piece of data but a collection).

Figure 9.3: Export DEM dialogue box.

Figure 9.4: Export Orthomosaic dialogue box.

Make sure that the selected coordinate system is your target coordinate system. There might be a problem with the GeoTIFFs generated here when changing the coordinate system from the internally used one during export. If that is the case, you might have to redo the DEM and mosaic calculation entirely, not just the export; this is, however, not certain.

It is recommended to set the “Write tiled TIFF” checkbox for both exports, as it eases later use for example in the Digital Earth Viewer. The “Write BigTIFF File” option is recommended, too. Otherwise there might be problems with GIS systems later.

It makes sense to export at the same resolution as the original data product.
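Scripted, the export with tiled and BigTIFF output looks roughly as follows. The ImageCompression flags reflect our understanding of the 1.7 API, and the file names are placeholders:

import Metashape

chunk = Metashape.app.document.chunk
compression = Metashape.ImageCompression()
compression.tiff_tiled = True        # “Write tiled TIFF”
compression.tiff_big = True          # “Write BigTIFF file”

chunk.exportRaster("dem.tif",
                   source_data=Metashape.ElevationData,
                   image_compression=compression)
chunk.exportRaster("orthomosaic.tif",
                   source_data=Metashape.OrthomosaicData,
                   image_compression=compression)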

10. 3D Reconstruction

The focus of this section is the creation of 3D models of individual target objects. The description above was mainly aimed at the generation of photomosaics of areas. Because the usage of Metashape remains generally the same, we will not repeat these points here, but instead reference the respective sections. Only parts that are specific to 3D reconstructions are discussed here.

Data Acquisition

In contrast to photo surveys of the seabed, acquiring data for a detailed 3D target reconstruction requires images of one object from different angles. While this can be partially achieved using a wide field of view camera and conducting tight, multiply overlapping survey lines with a downward-facing camera (e.g. on an AUV), the resulting camera positions are still limited compared to ROV or diver operations. Therefore, we assume that the latter method is used.

To reconstruct objects in air and in a controlled environment, it is possible to position objects on a pedestal, so that the influence of the environment is minimal. The object can be viewed from every angle and at different distances. Since the pedestal is of no interest, it can be virtually “removed” by physically rotating the object and taking images of the bottom from another view (see Figure 10.1).

Figure 10.1: 3D Model of a UXO object (from in air data) with camera positions (in blue) surrounding the target from all sides.

For the underwater case, especially with larger targets or unexploded ordnance (UXO) that should not be manipulated, this option is not available. The angles from which the object can be inspected are naturally limited by the seafloor surface around it. A full reconstruction is impossible this way, since parts of the target will always be occluded. The goal is, therefore, to maximize the surface that can be reconstructed.

ROVs are restricted in their movement and camera position relative to their frame. A front mounted camera is good for orientation during diving, but not necessarily for navigating to specific positions with a given rotation. Instead of taking photos, we recommend recording a video. The frame rate can be below 30 fps. There is no need for a higher frequency, as having more images is not necessarily an advantage and reinforces problems with particles in the water column (see section Align Images Issues).

Given the uncertainty of movement under water, using video is preferable since the target does not need to be in frame at a specific point in time. While sudden movements could ruin a photo, a video might only suffer from motion blur while the target is still in sight.

These restrictions are less problematic for divers. But even here video is advantageous since the number of extracted images, even with low frame rate, will be larger than would be the case if individual photos were taken. In the past we had problems with divers not taking enough footage and while this can be avoided with better instructions, a video circumvents the problem altogether. Divers have their own shortcomings, especially in areas with high current velocity and at greater depths.

Instead of a fixed number of positions and angles as in the in-air case, it is prudent to define survey lines or a survey pattern instead. Optimally, the target should always be within the field of view of the camera, not only to increase the total number of images, but also because, as long as the target is in sight, it will be relatively easy to match consecutive images. During post-processing, this can be helpful to spot registration errors (compare section Mismatched Point Clouds).

The distance to the target should be held constant. For the in-air case, the distance is varied on purpose to a degree. For the underwater case, images taken from too far away can potentially be harmful:

  1. Attenuation changes the color of your target (see Figure 10.2).

  2. Scattering will introduce more noise.

  3. For flat port camera systems, the camera calibration is optimized for a given distance.

Similarly, taking images from too close has its pitfalls:

  1. Disturbing flora and fauna on the target, which can lead to changes in consecutive images that increase the difficulty of matching (see Figure 10.3).

  2. Disturbing sediment on and around the target, which can make a continuation of the survey impossible for a certain amount of time (see Figure 10.3).

  3. UXOs are dangerous/contaminated and a safety distance is recommended.

  4. External light sources can create undesired illuminations and shadow effects.

Distances outside a range of 50 cm to 3 m should therefore be avoided during the survey, or the corresponding images should be removed in post-processing.

Figure 10.2: The same object from different distances. The color of the object changes due to varying illumination and attenuation.

Figure 10.3: Images from an ROV dive. Top: undisturbed environment. Middle: the ROV gets too close to the target. A sediment cloud is created and the force of the thrusters catapults a starfish into the water column. Bottom: the sediment cloud has dispersed.

Data Pre-Processing

In order to reduce processing time, it is recommended to reduce the number of images by removing the transects to and from the target or other longer sections in which the target is not visible. While footage of the surroundings is of less interest, it can still be helpful for localization purposes. So even if the target is not directly visible in small subsets of the data, as long as the number of images is relatively small, they should still be included in the project.

Reconstruction

Proceed with your data according to the previous sections: load the images into a project, make sure your camera settings are correct (section 3. Camera Settings), add, if available, navigation data (section 5. Navigation Data or Reference Data) and start the process of aligning images (section 6. Align Photos). Similar to the regular survey case, the Align Photos part of the processing pipeline is the most crucial one, since it is during this part that manual intervention might be required.

Align Images Issues

Noise in the Water Column

Weather, seasonal plankton growth and other factors determine turbidity in the water column. Disturbances of the seafloor due to, e.g., ROV manoeuvring increase the number of particles during the operation. Especially with artificial light, these particles are observed by the camera and can be quite prominent. These particles often produce undesirable matches within image pairs. This happens particularly often for consecutive frames and for higher frame rates. Since they are virtually indistinguishable from one another, matches between different particles can be observed, too. Since the particles are in constant flux due to currents and the recording vehicle/diver, the resulting matches are found everywhere within the water column without providing correct information about the relative camera poses. As a result, the sparse point cloud contains these features outside of the actual “solid” target, which might lead to misalignments. Figure 10.4 shows such a sparse point cloud.

Figure 10.4: The untreated sparse point cloud of a UXO. Most points correspond directly to the actual target. A significant number of points, however, are located in the water column and are likely the result of particles in the water.

The relative camera positions are calculated based on all points in the sparse point cloud. In order to minimize errors due to the particle matches, it is best to remove said matches by selecting and deleting the points, as shown in Figure 10.5. To do this, the view has to be rotated multiple times. For each projection, points that are clearly outside of the target boundaries can be selected and removed with the delete key. This is an iterative process, and it is unlikely to be complete in the sense that every erroneous particle-matched point can be selected. Nevertheless, this method allows eliminating most of the key points in the water column. By optimizing the camera positions (using the “Optimize Cameras” button in the Reference tab), the new solution will not consider the previously deleted points, and the resulting model should be improved.

Figure 10.5: Sparse point cloud with the key points within the water column selected.

Sometimes it is not straightforward to tell whether certain key points should be part of the model or not. In this case, it might be a good idea to calculate a temporary dense point cloud and use it to get an overview of the target first.

Registration Errors

Section Discontinued Survey Lines details a problem that occurs when the solution Metashape finds leads to sudden jumps in the navigation data. While ROV or diver movement can be more sporadic, the problem is still easily identifiable for continuous recordings by considering the trajectory of camera positions. Fixing misaligned trajectories is done the same way as for the case with regular survey patterns.

Missing Areas in Sparse Point Cloud

Some targets, or at least parts thereof, might simply not have enough distinguishable features to be reconstructed properly. While this problem originates from the image alignment phase and is therefore already present in the sparse point cloud, it often only becomes prominent in a cleaned dense point cloud. In this case, parts of the target's surface will simply not be represented as points even though cameras are facing that side. This problem often manifests itself first as unaligned cameras. While it is possible to use markers to force the cameras into an alignment, the problem still persists, as the number of key points on the target is simply too low. The key points can be inspected on an image basis by enabling the show key points option (see Figure 10.6). Especially if the distance to the target is larger, this can happen for all available images of a given surface. Unfortunately, there is no way to fix this during post-processing. Manually aligning all available images might improve the quality to an extent; however, the amount of work necessary may be disproportionate. It is therefore good practice to check the key points in an image sequence before trying to manually align the images. If, for example, the images look like the bottom part of Figure 10.6, it is well worth including them.

Figure 10.6: Two images with vastly different keypoint distributions. Top: Keypoints are mostly in seemingly unremarkable locations and actually correspond mostly to marine snow. Bottom: Keypoints are centered at the object of interest and mostly correspond to features related to the object.

One reason for limiting the range between cameras and targets is attenuation. Another one is the level of visible detail. For intricate structures, many details might only be visible from close up. If a spot is observed from multiple camera clusters at different distances, it can happen that fine details such as small crevices are smoothed over by the camera clusters that were recorded from farther away. Figure 10.7 is an example of this. In many ways this problem resembles the classic registration error; however, the fix is different. The cameras that create the low detailed surface have to be found and reset (“Reset camera alignment”). It is possible to filter the images from the point cloud (select points, context menu, filter by points), but one has to be careful to identify the correct cameras. Matching the color of the points and looking at the point cloud through different cameras can help here.

Figure 10.7: Two surfaces are visible in the point cloud: one high resolution with relatively bright colors and an overlaying lower resolution surface with a darker hue. The latter is a result of matches from cameras farther away and obstructs details of the object.

Perspective Changes

While shadows and changes in perspective do play a role in classic image surveys, they are far more prominent and problematic in the 3D case, simply because changes in perspective are desirable during data acquisition and are, thus, larger. Objects can be viewed from opposing directions, and more often than not key points are not recognizable in such cases. Even for small changes in the camera position, features might be hidden or warped due to occlusions. Figure 10.9 shows the same part of an object on the left and on the right side. The right side, however, was recorded from farther away; as a result, some prominent features are not visible. Finding correspondences for such an image pair is very challenging. Often it is a good idea to look at a temporary dense point cloud, recognize markers based on that and find the correspondences in the image (see Figure 10.8). Generally speaking, it is not necessary to find such global correspondences. However, they help to align otherwise independent parts of the survey and avoid repeating surfaces.

Figure 10.8: Image pair with a small displacement in camera position. The green flags are valid markers and correspond to the same position in both images.

Figure 10.9: Image pair with large displacement. Again the green flags represent valid markers, however they might not be easily identified as such. Marker 88 in the centre is a good starting point to understand the relation between these images.

Figure 10.10: The left side shows a dense point cloud viewed from the camera position of the image to the right.

Issues with Dense Point clouds

Similar to the sparse point cloud (see Noise in the Water Column), the dense point cloud might contain errors in the form of point clusters outside of the target and inside the water column (see Figure 10.11). The reason for these errors is the same as for the sparse point cloud, and while cleaning the sparse point cloud reduces this effect, it is recommended to clean the dense point cloud, too. This time not in order to optimize camera locations, but to avoid creating wrong surfaces. In fact, optimizing cameras after cleaning the dense point cloud will undo all your work, since you will have to recalculate the dense point cloud altogether.

Figure 10.11: Dense point cloud of the target. Errors in the form of clusters inside the water column are visible (two examples are highlighted).

Some of these errors can be cleaned just like for the sparse point cloud. In addition, it is useful to look at the point confidence. Points that are seen by only a few cameras (fewer than seven) can be filtered automatically. This is a great option to reduce the number of points that correspond to particles, as they are usually not observed by many cameras. The process itself is described in 8. Filtering and Data Cleaning.

If the point cloud's quality is sufficient, simply build the model based on the dense point cloud and the corresponding texture (see section 9. Creating Derivative Products).
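As a sketch in the Python API: a mesh is built from the dense cloud, followed by a generic UV mapping and a mosaic-blended texture. Parameter names and values follow our reading of the 1.7 API and should be adapted to the target:

import Metashape

chunk = Metashape.app.document.chunk
chunk.buildModel(source_data=Metashape.DenseCloudData,
                 surface_type=Metashape.Arbitrary,
                 interpolation=Metashape.EnabledInterpolation)
chunk.buildUV(mapping_mode=Metashape.GenericMapping)
chunk.buildTexture(blending_mode=Metashape.MosaicBlending, texture_size=4096)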

Figure 10.12: Top to bottom: a) unfiltered dense point cloud. b) Confidence of the unfiltered point cloud. c) Dense point cloud filtered to include only points visible by 1-7 cameras. d) Point cloud delta between a) and c).

Glossary

Terms used in Metashape:

Align images: Arguably the most important processing step in the Metashape workflow. This step includes feature detection and matching. Based on that, the relative positions between all cameras are calculated.
Key point/feature: A pixel or a group of pixels identified by Metashape in the Align images step to be matched across images.
Fisheye lens: A lens that, generally speaking, has a wider field of view compared to others at the cost of more distortion.

Terms used in underwater photogrammetry (and not necessarily in Metashape):

Scattering: Particles in the water column interact with light. As a result, some photons are diverted from their original path. This reduces the amount of light reaching the camera and adds noise, as photons originating from incorrect locations are diverted into the camera.
Attenuation: Light is absorbed in water depending on its frequency. As a result, the color balance of an object changes depending on its distance.
AUV: Autonomous underwater vehicle. The platform can be equipped with sensors to investigate the seafloor or the water column.
ROV: Remotely operated vehicle. A direct wire connection to an operator allows livestreaming data such as video or sonar.
Pinhole camera model: The classic ideal model of photogrammetry with an infinitely small aperture. In practice, a more complex lens system is used, which allows gathering more light.
Flat Port: In order to bring cameras underwater, some kind of housing is needed in addition to the camera itself, to protect against the water and the pressure. The shape of the interface between housing and water can be flat (hence flat port); this, however, results in an additional refraction effect according to Snell’s Law.
Dome Port: In contrast to a flat port, a dome port uses a spherical interface. If the camera center is exactly at the center of the dome port, the rays pass through the air-housing-water interfaces orthogonally and no additional refraction takes place.

jmohrmann@geomar.de