Vision Parameter Adjustment Guide for Ordered Loading/Unloading of Planar Workpieces (Parallelized)
About 13281 wordsAbout 44 min
Getting Started
Background
The "Planar Workpiece Ordered Loading/Unloading (Parallelized)" scene has been added as a vision acceleration solution for ordered planar workpieces. The key feature of this scene is the improved vision computation cycle time: by using parallel computation to reduce processing overhead, grasp points are generated directly.
Unlike the way visual acceleration mode is enabled in other scenes, the planar ordered scene is activated by creating a new job.
Job Scene Selection
In the current planar ordered scene, there are two available job scenes. A comparison is shown in the table below.
| Workflow | Planar Ordered Loading/Unloading | Planar Ordered Loading/Unloading (Parallelized) | Note |
|---|---|---|---|
| Accuracy | High — depends on point cloud density and scene point cloud consistency | Relatively high — depends on image features, point cloud density, and scene point cloud consistency | / |
| Speed | Relatively fast (single instance) | Fast (multiple instances) | Only the parallelized mode can output all valid results in a scene |
| Parameter tuning | Moderate — requires experience with matching parameter tuning | Simple — relatively straightforward and fixed (similar to general workpieces) | Some Advanced-class parameters in the parallelized mode will be hidden in future updates |
| Applicability | Strong — applicable to all planar workpieces | Relatively weak — current version supports inconsistent incoming workpiece orientations | Multi-template mode has been added for parallelized processing to support multiple incoming orientations |
| Template creation difficulty | Moderate — software-assisted | More complex — current version requires a script | Template creation will be integrated into PickWiz in a future parallelized update |
| Template characteristics | Reference | Select a complete scene instance point cloud that is relatively centered in the camera field of view | / |
Setting Up a Project
(1) Create a new Planar Workpiece Ordered Loading/Unloading (Parallelized) project (the project name and path can be customized; the project name must not contain Chinese characters)
Workpiece type: Planar workpiece (not round, cylindrical, or quadrilateral, and with minimal difference between front and back faces)
(2) Configure the camera and robot
(3) Add a workpiece

- Workpiece Information
The workpiece name can be customized. The workpiece type defaults to "Standard Workpiece" and cannot be changed. The workpiece ID can be customized and is used to automatically switch workpieces during robot grasping.
Point cloud file: The workpiece point cloud template. The point cloud file for the Planar Workpiece Ordered Loading/Unloading (Parallelized) scene is special — see 2.2.1 Template File Path for instructions on how to create it.
Fine matching point cloud template: Used for fine matching.
Camera parameters: Not required.
- Model Information
Vision model: The 2D recognition solution used for planar workpieces is CAD-based synthetic data training (One-Click Connect). The vision model for each planar workpiece application must be trained via One-Click Connect.



Mesh file: Typically the workpiece CAD file is uploaded. To eliminate noise, the mesh file must be normalized; normalization can also be done during point cloud template creation.
Workpiece properties: Elongated, symmetric, highly reflective, low solidity.
Incoming form: Custom incoming form — enter the incoming form; tightly packed — enter the number of workpieces per row and column.
Work environment: Load an environment file. The environment used for data generation in One-Click Connect will be automatically replaced with the loaded environment to improve recognition accuracy.
Workpiece texture: Load workpiece texture. During One-Click Connect model training, the loaded texture will be used for data augmentation to improve recognition accuracy.
Mix unordered scene data: When enabled, One-Click Connect training will simultaneously generate synthetic data for both unordered and ordered scenes to improve recognition accuracy.
Maximum model detection count: Default is 20; adjust according to scene requirements.
- Grasp point: Set grasp points based on the workpiece.

Absolute coordinate system: Uses the initial point as the origin; the initial point comes from the workpiece point cloud and CAD.
Grasp point coordinate system (offset): Uses the current grasp point as the origin.
(4) Add end-of-arm tool, eye-hand calibration, and ROI.
(5) Optional features: Instance optimization, collision detection, visual classification, point cloud noise removal, bin/workpiece collision detection.
Instance optimization: Optimizes the instances generated by the model by processing instance masks.
Collision detection: Detects collisions between the end-of-arm tool and the container, and filters out grasp poses that may cause collisions. Collision Detection Guide
Visual classification: Used to recognize different textures, orientations, and other features of the same workpiece. Visual Classification Guide
Point cloud noise removal: Load the workpiece's point cloud template to filter noise from instance workpiece point clouds. Point Cloud Noise Removal Guide
Bin/workpiece collision detection: Detects collisions between the end-of-arm tool and the scene (bin, scene point cloud), and filters out grasp poses that may cause collisions. Bin/Workpiece Collision Detection Guide
Vision Parameters
- 2D Recognition: Recognizes and segments instances from the actual scene.
Preprocessing: Processes the 2D image before instance segmentation (commonly used: fill depth map holes & edge enhancement & extract top-layer texture & remove image background outside ROI 3D).
Instance segmentation: Segments instances (scale ratio & confidence lower threshold & auto augmentation). Uncheck to disable acceleration and return masks.
Point cloud generation: The method for generating instance point clouds — use the segmented instance mask or bounding box / use the filtered instance mask or bounding box to generate instance point clouds.
Instance filtering: Filters the segmented instances.
Instance sorting: Sorts the instances.
- 3D Computation: Computes the pose of instances in the camera coordinate system and generates grasp points.
Preprocessing: Processes 3D point clouds before computing grasp points.
Pose estimation: Computes the pose of instances in the camera coordinate system (coarse matching, fine matching) and generates grasp points.
- Grasp Point Processing: Filters, adjusts, and sorts grasp points.
Grasp point filtering: Filters grasp points.
Grasp point adjustment: Adjusts grasp points.
Grasp point sorting: Sorts grasp points.
1. 2D Recognition
This section mainly explains the preprocessing, instance segmentation, instance filtering, and instance sorting related functions that affect 2D image recognition results, along with parameter tuning suggestions
1.1 Preprocessing
Preprocessing for 2D recognition processes the 2D image before instance segmentation.
1.1.1 Bilateral Filtering
- Function
Image smoothing based on bilateral filtering
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Maximum Depth Difference | Maximum depth difference for bilateral filtering | 0.03 | [0.01, 1] |
| Filter Kernel Size | Convolution kernel size for bilateral filtering | 7 | [1, 3000] |
1.1.2 Depth to Normal Map
- Function
Compute pixel normals from the depth map and convert the image into a normal map
1.1.3 Image Enhancement
- Function
Common image enhancement operations such as saturation, contrast, brightness, and sharpness
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Image Enhancement Type | Enhances a specific element of the image | Contrast | Saturation, Contrast, Brightness, Sharpness |
| Image Enhancement Threshold | How much a specific element of the image is enhanced | 1.5 | [0.1, 100] |
1.1.4 Histogram Equalization
- Function
Improve image contrast
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Local Mode | Local or global histogram equalization. When selected, local histogram equalization is used; when cleared, global histogram equalization is used. | Selected | / |
| Contrast Threshold | Contrast threshold | 3 | [1,1000] |
1.1.5 Filter Depth by HSV
- Function
Filter the depth map according to color values
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Fill Kernel Size | Size of color filling | 3 | [1,99] |
| Filter Depth by HSV - Maximum Color Range Value | Maximum color value | [180,255,255] | [[0,0,0],[255,255,255]] |
| Filter Depth by HSV - Minimum Color Range Value | Minimum color value | [0,0,0] | [[0,0,0],[255,255,255]] |
| Keep Regions Within the Color Range | If selected, regions within the color range are kept; if cleared, regions outside the color range are kept. | / | / |
1.1.6 Gamma Correction
- Function
Gamma correction changes image brightness
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Gamma Compensation Coefficient | When this value is less than 1, the image becomes darker; when it is greater than 1, the image becomes brighter. | 1 | [0.1,100] |
| Gamma Correction Coefficient | When this value is less than 1, the image becomes darker and is suitable for images that are too bright; when it is greater than 1, the image becomes brighter and is suitable for images that are too dark. | 2.2 | [0.1,100] |
1.1.7 Fill Depth Hole
- Function
Fill hole regions in the depth map and smooth the filled depth map
- Use Case
Due to issues such as obstruction caused by the Target Object structure itself or uneven lighting, parts of the Target Object may be missing in the depth map
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Fill Kernel Size | Hole filling size | 3 | [1,99] |
Fill kernel size can only be an odd number
- Parameter Tuning
Adjust according to the detection results. If filling is excessive, reduce the parameter; if filling is insufficient, increase the parameter.
- Example





1.1.8 Edge Enhancement
- Function
Set the texture edge areas in the image to the background color or to a color with a large contrast from the background color, so that the edge information of the Target Object is highlighted
- Use Case
Target Objects occlude or overlap each other, resulting in unclear edges
- Parameter Description
| Parameter | Description | Default Value | Range | Tuning Recommendation |
|---|---|---|---|---|
| Normal Z-Direction Filtering Threshold | Filtering threshold for the angle between the normal vector corresponding to each point in the depth map and the positive Z-axis direction of the camera coordinate system. If the angle between a point's normal vector and the positive Z-axis direction of the camera coordinate system is greater than this threshold, the color at the corresponding position in the 2D image is set to the Background color or to a color with a large contrast from the Background color. | 30 | [0,180] | For flat Target Object surfaces, this threshold can be stricter. For curved surfaces such as sacks, increase it appropriately according to the surface inclination of the Target Object. |
| Background | RGB color threshold of the background color | 128 | [0,255] | |
| Automatically Adjust Contrast Background | Selected After Automatically Adjust Contrast Background is enabled, the colors of points in the 2D image whose angles are greater than the filtering threshold are set to a color with a large contrast from the Background color. If it is not selected, the colors of points in the 2D image whose angles are greater than the filtering threshold are set to the color corresponding to the Background color. | Cleared | / |
- Example
In a scene with a pile of sacks, the sacks occlude each other. Enable Edge Enhancement to distinguish the edges of individual sacks, as shown below.








1.1.9 Extract Top by Depth
- Function
Extract the texture of the topmost or bottommost Target Object, while setting other regions to the background color or to a color with a large contrast from the background color.
- Use Case
Applicable to single-carton depalletizing scenarios. Factors such as poor lighting conditions, similar color textures, tight stacking, interleaved stacking, or occlusion may make it difficult for the model to distinguish texture differences between the upper and lower cartons, which easily leads to false detections.
- Parameter Description
| Parameter | Description | Default Value | Range | Unit | Tuning Recommendation |
|---|---|---|---|---|---|
| Distance Threshold (mm) | If the distance from a point to the topmost plane (or bottommost plane) is lower than this threshold, the point is regarded as lying on the topmost plane (or bottommost plane) and should be retained. Otherwise, it is regarded as a point on the lower layer (or upper layer), and the color of the lower-layer (or upper-layer) point is set to the background color or to a color with a large contrast from the background color. | 50 | [0.1, 1000] | mm | Generally set to 1/2 of the carton height |
| Cluster Point Cloud Quantity | Expected number of points participating in clustering, that is, the number of sampled point clouds within the ROI 3D area | 10000 | [1,10000000] | / | The larger the Cluster Point Cloud Quantity, the slower the model inference speed but the higher the accuracy; the smaller the Cluster Point Cloud Quantity, the faster the inference speed but the lower the accuracy. |
| Minimum Category Point Quantity | Minimum number of points used for category filtering | 1000 | [1, 10000000] | / | / |
| Automatically Compute Contrast Background | Selected After Automatically Compute Contrast Background is enabled, regions outside the topmost (or bottommost) layer in the 2D image are set to a color with a large contrast from the background color threshold. If it is not selected, regions outside the topmost (or bottommost) layer in the 2D image are set to the color corresponding to the background color threshold. | Selected | / | / | / |
| Background Color RGB Threshold | RGB color threshold of the background color | 128 | [0, 255] | / | / |
- Example



1.1.10 Extract ROI 3D RGB
- Function
Remove the background outside the ROI3D area in the 2D image
- Use Case
Excessive image background noise affects detection results
- Parameter Description
| Parameter Name | Description | Default Value | Range |
|---|---|---|---|
| Fill Kernel Size | Size of hole filling | 5 | [1,99] |
| Iteration Count | Number of image dilation iterations | 1 | [1,99] |
| Automatically Compute Contrast Background | Selected After Automatically Compute Contrast Background is enabled, the area outside the ROI in the 2D image is set to a color with a large contrast from the background color threshold. If it is not selected, the area outside the ROI in the 2D image is set to the color corresponding to the background color threshold. | Selected | |
| Background Color Threshold | RGB color threshold of the background color | 128 | [0,255] |
Fill kernel size can only be an odd number
- Parameter Tuning
If you need to remove more background noise from the image, reduce the Fill Kernel Size
- Example





1.2 Instance Segmentation
1.2.1 Scaling Ratio
- Function
Improve the accuracy and recall of 2D recognition by scaling the original image proportionally before inference.
- Use Case
If the detection result is poor (for example, no instances are detected, instances are missed, a bounding box covers multiple instances, or a bounding box does not fully cover an instance), this function should be adjusted.
Parameter Description
Default Value: 1.0
Range: [0.01, 3.00]
Step Size: 0.01
Parameter Tuning
- Run with the default value and view the detection results in the visualization window. If no instances are detected, instances are missed, a bounding box covers multiple instances, or a bounding box does not fully cover an instance, this function should be adjusted.
In 2D recognition, the percentage shown on an instance is the Confidence score, and the number is the Instance ID (the recognition order of the instance).
In 2D recognition, the colored shadow on an instance is the mask, and the rectangular box surrounding the instance is the bounding box.
If good detection results still cannot be obtained after trying all scaling ratios, you can adjust the ROI area.
As shown below, when the scaling ratio is 0.7, the detection result improves significantly. Therefore, 0.7 can be determined as the lower bound of the scaling ratio range.



When the scaling ratio is 1.8, the detection result becomes significantly worse. Therefore, 1.8 can be determined as the upper bound of the scaling ratio range.



1.2.2 Confidence Lower Threshold
- Function
Keep only recognition results whose deep learning model confidence scores are higher than the lower confidence threshold
- Use Case
This function can be adjusted when the instances enclosed by the detection result do not match expectations
- Parameter Description
Default Value: 0.5
Range: [0.01, 1.00]
Parameter Tuning
- If too few instances are detected, reduce this threshold. If the value is too small, the accuracy of image recognition may be affected.



- If an excessively small lower confidence threshold causes incorrect instances to be detected and those incorrect instances need to be removed, increase this threshold. If the value is too large, the retained detection results may become zero, resulting in no output.
Example
When Lower Confidence Threshold is set to 0.5, the retained instance detection results are shown in the left figure below. When Lower Confidence Threshold is set to 0.8, the retained instance detection results are shown in the right figure below. The instances with scores of 77.60%, 74.61%, and 69.77% (lower than 80%) are filtered out.


1.2.3 Auto Augment
- Function
All input values of scaling ratios and rotation angles are combined for inference. All results greater than the set lower confidence threshold after the combinations are returned, which can improve model inference accuracy, but increases processing time.
- Use Case
A single scaling ratio cannot meet the requirements of the actual scene, causing incomplete detection, or the object placement tilt is relatively large.
- Example
If Automatic Enhancement - Scaling Ratio is set to [0.8, 0.9, 1.0] and Automatic Enhancement - Rotation Angle is set to [0, 90.0] , then the values of the scaling ratios and rotation angles are combined pairwise. The model automatically generates 6 images internally for inference, and finally merges these 6 inference results together, outputting the results greater than the lower confidence threshold.
Auto Augment - Scaling Ratio
- Function
Scale the original image multiple times and perform inference multiple times, then output the combined inference result
- Use Case
Incomplete detection because a single scaling ratio cannot satisfy actual scene requirements
- Parameter Description
Default Value: [1.0]
Range: the range of each scaling ratio is [0.1, 3.0]
Multiple scaling ratios can be set, separated by English commas
- Parameter Tuning
Fill in multiple scaling ratios from 1.2.1 Scaling Ratio that produce good detection results
Auto Augment - Rotation Angle
- Function
Rotate the original image multiple times and perform inference multiple times, then output the combined inference result
- Use Case
Used when the object placement deviates significantly from the coordinate axes
- Parameter Description
Default Value: [0.0]
Range: the range of each rotation angle is [0, 360]
Multiple rotation angles can be set, separated by English commas
- Parameter Tuning
Adjust Automatic Enhancement - Rotation Angle according to the object angle in the actual scene. The tilt angle can be judged from the sack pattern and bag opening shape, or from the carton edges and brand logo.
1.3 Point Cloud Generation
In depalletizing scenarios, instance point clouds are generally generated using Mask Form (After Segmentation) or Mask Form (After Filtering).
| Instance Point Cloud Generation Mode | Mask(Segmented) | — | Use the segmented instance mask to generate the point cloud |
| Bounding Box (Segmented) | Bounding Box Scaling Ratio (Segmented) | Use the segmented instance bounding box to generate the point cloud | |
| Whether Color Is Required When Generating the Point Cloud (Segmented) | Whether the generated instance point cloud needs attached color | ||
| Mask(Filtered) | — | Use the filtered instance mask to generate the point cloud | |
| Bounding Box (Filtered) | Bounding Box Scaling Ratio (Filtered) | Use the filtered instance bounding box to generate the point cloud | |
| Whether Color Is Required When Generating the Point Cloud (Filtered) | Whether the generated instance point cloud needs attached color |
If acceleration is not required, there is no need to use the Instance Filtering function. Use Mask (Segmented) to generate the instance point cloud, which can be viewed under the project storage folder \Project Name\data\PickLight\Historical Data Timestamp\Builder\pose\input folder containing the generated instance point cloud.

If acceleration is required, you can use the Instance Filtering function to filter instances and use Mask (Filtered) to generate the instance point cloud, which can be viewed in the generated instance point cloud under the project storage folder \Project Name\data\PickLight\Historical Data Timestamp\Builder\pose\input folder containing the generated instance point cloud

1.4 Instance Filtering
1.4.1 Filter by BBox Area
- Function Introduction
Filter according to the pixel area of the bounding box of the detected instance.
- Use Case
Applicable to scenarios where instance bounding box areas differ greatly. By setting upper and lower limits for the bounding box area, image noise can be filtered out to improve image recognition accuracy and avoid extra processing time caused by noise in subsequent processing.
- Parameter Description
| Parameter | Description | Default Value | Range | Unit |
|---|---|---|---|---|
| Minimum Area (Pixels) | This parameter sets the minimum filtering area for the bounding box. Instances whose bounding box area is lower than this value are filtered out. | 1 | [1, 10000000] | pixels |
| Maximum Area (Pixels) | This parameter sets the maximum filtering area for the bounding box. Instances whose bounding box area is higher than this value are filtered out. | 10000000 | [2, 10000000] | pixels |
- Example
Run with the default values. You can view the bounding box area of each instance in the log, as shown below.


Adjust Minimum Area and Maximum Area according to the bounding box area of each instance. For example, setting Minimum Area to 20000 and Maximum Area to 30000 will filter out instances whose pixel area is less than 20000 or greater than 30000. The instance filtering process can be viewed in the log.


1.4.2 Filter Based on Bounding Box Aspect Ratio
- Function Introduction
Instances whose bounding box aspect ratios are outside the specified range are filtered out
- Use Case
Applicable to scenarios where instance bounding box aspect ratios differ greatly
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Minimum Aspect Ratio | Minimum bounding box aspect ratio. Instances whose bounding box aspect ratio is lower than this value are filtered out. | 0 | [0, 10000000] |
| Maximum Aspect Ratio | Maximum bounding box aspect ratio. Instances whose bounding box aspect ratio is higher than this value are filtered out. | 10000000 | [0, 10000000] |
| Use X/Y Axis Side Length as the Aspect Ratio | By default, this option is cleared, and the ratio of the longer side to the shorter side of the bounding box is used as the aspect ratio, which is suitable when the lengths of the longer and shorter sides differ greatly. After selection, the ratio of the side length on the X-axis to the side length on the Y-axis in the pixel coordinate system is used as the aspect ratio, which is suitable when the ratios of the longer side to the shorter side of most normal instance bounding boxes are similar, but some abnormal recognized instance bounding boxes differ greatly in their X-axis length / Y-axis length ratio. | Cleared | / |
1.4.3 Filter by Category ID
- Function Introduction
Filter according to the instance category
- Use Case
Applicable to scenarios where multiple types of Target Objects are supplied
- Parameter Description
| Parameter | Description | Default Value |
|---|---|---|
| Retained Category IDs | Retain instances whose category IDs are in the list. Instances whose category IDs are not in the list are filtered out. | [0] |
- Example
1.4.4 Filter Instance Edge
- Function Introduction
Filter according to the long side and short side of the instance point cloud
- Use Case
Applicable to scenarios where the distances of the instance point cloud on the X-axis or Y-axis differ greatly. By setting the distance range of the instance point cloud, image noise can be filtered out, image recognition accuracy can be improved, and extra processing time caused by noise in subsequent processing can be avoided.
- Parameter Description
| Parameter | Description | Default Value | Range | Unit |
|---|---|---|---|---|
| Short Side Length Range (mm) | Side length range of the short side of the point cloud | [0, 10000] | [0, 10000] | mm |
| Long Side Length Range (mm) | Side length range of the long side of the point cloud | [0, 10000] | [0, 10000] | mm |
| Lower Edge Denoising Limit (%) | Extract the lower percentage limit of X/Y values (camera coordinate system) in the instance point cloud, and remove point clouds outside the upper and lower limits to avoid noise affecting length calculation | 5 | [0, 100] | / |
| Upper Edge Denoising Limit (%) | Extract the upper percentage limit of X/Y values (camera coordinate system) in the instance point cloud, and remove point clouds outside the upper and lower limits to avoid noise affecting length calculation | 95 | [0, 100] | / |
| Side Length Type | Filter according to the long side and short side of the instance point cloud. Instances whose long side or short side lengths are outside the range are filtered out. | Instance Point Cloud Short Side | Instance Point Cloud Short Side; Instance Point Cloud Long Side; Instance Point Cloud Long Side and Short Side | / |
- Example
1.4.5 Filter Based on Classifier Category ID
- Function Introduction
Filter instances based on the category ID from the classifier. Instances not in the reference categories are filtered out.
- Use Case
In multi-category Target Object scenarios, the vision model may detect multiple types of Target Objects, but the actual task may require only one specific category. In this case, this function can be used to filter out unnecessary Target Objects.
- Parameter Description
The default value is [0], which means that instances with category ID 0 are retained by default. Instances whose category IDs are not in the list are filtered out.
1.4.6 Filter by Color Range
- Function Introduction
Instances can be filtered out by three-channel color thresholds (HSV or RGB).
- Use Case
Cases where incorrect instances and correct instances have obvious color differences.
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Maximum Color Range Value | Maximum color value | [180,255,255] | [[0,0,0],[255,255,255]] |
| Minimum Color Range Value | Minimum color value | [0,0,0] | [[0,0,0],[255,255,255]] |
| Filtering Percentage Threshold | Color pass-rate threshold | 0.05 | [0,1] |
| Reverse Filtering | If selected, instances whose proportion outside the color range is lower than the threshold are removed. If cleared, instances whose proportion inside the color range in the instance image is lower than the threshold are removed. | Cleared | / |
| Color Mode | Color space selected for color filtering | HSV Color Space | RGB Color SpaceHSV Color Space |
- Example

1.4.7 Filter by Confidence
- Function Introduction
Filter according to the confidence score of the instance
- Use Case
Applicable to scenarios where confidence scores of instances differ greatly
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Reference Confidence | Retain instances whose confidence is greater than the threshold, and filter out instances whose confidence is less than the threshold. | 0.5 | [0,1] |
| Reverse Filtering Result | After reversal, retain instances whose visibility confidence is less than the threshold, and filter out instances whose confidence is greater than the threshold. | Cleared | / |
- Example
1.4.8 Filter by Instance PCD Quantity
- Function Introduction
Filter according to the number of downsampled instance point clouds
- Use Case
The instance point cloud contains a large amount of noise
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Minimum Point Cloud Quantity | Minimum point cloud quantity | 3500 | [1, 10000000] |
| Maximum Point Cloud Quantity | Maximum point cloud quantity | 8500 | [2, 10000000] |
| Filter Instances Whose Quantity Falls Within the Interval | If selected, instances whose point cloud quantity is between the minimum and maximum values are filtered out. If cleared, instances whose point cloud quantity is not within the interval are filtered out. | Cleared | / |
1.4.9 Filter by Mask Area
- Function Introduction
Filter image masks according to the sum of mask pixels (that is, the pixel area) of the detected instances.
- Use Case
Applicable to scenarios where instance mask areas differ greatly. By setting upper and lower limits for mask area, noise in image masks can be filtered out to improve image recognition accuracy and avoid extra processing time caused by noise in subsequent processing.
- Parameter Setting Description
| Parameter Name | Description | Default Value | Range | Unit |
|---|---|---|---|---|
| Reference Minimum Area | This parameter sets the minimum filtering area for the mask. Instances whose mask area is lower than this value are filtered out. | 1 | [1, 10000000] | pixels |
| Reference Maximum Area | This parameter sets the maximum filtering area for the mask. Instances whose mask area is higher than this value are filtered out. | 10000000 | [2, 10000000] | pixels |
- Example
1.4.10 Filter Based on Visibility
- Function Introduction
Filter according to the visibility score of the instance
- Use Case
Applicable to scenarios where instance visibility differs greatly
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Reference Visibility Threshold | Retain instances whose visibility is greater than the threshold, and filter out instances whose visibility is less than the threshold. Visibility is used to determine how visible an instance is in the image. The more the Target Object is occluded, the lower the visibility. | 0.5 | [0,1] |
| Reverse Filtering Result | After reversal, retain instances whose visibility is less than the threshold, and filter out instances whose visibility is greater than the threshold. | Cleared | / |
1.4.11 Remove Overlapping Instances
- Function Introduction
Filter instances whose bounding boxes intersect and overlap
- Use Case
Applicable to scenarios where instance bounding boxes intersect each other
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Bounding Box Overlap Ratio Threshold | Threshold for the ratio of the intersecting area of bounding boxes to the area of the instance bounding box | 0.05 | [0, 1] |
| Filter the Instance with the Larger Bounding Box Area | If selected, the instance with the larger area among two intersecting bounding boxes is filtered out. If cleared, the instance with the smaller area among two intersecting bounding boxes is filtered out. | Selected | / |
- Example

Added Filter enclosed instances. Run with the default values and view bounding box intersections of instances in the log. After instance filtering, 2 instances remain.

According to the log, 12 instances are filtered out because their bounding boxes intersect, leaving 2 instances whose bounding boxes do not intersect.

Set Bounding Box Overlap Ratio Threshold to 0.1 and select Whether to Filter Larger Instances. View the instance filtering process in the log. Nine instances are filtered out because the ratio of the intersection area of their bounding boxes to the instance bounding box area is greater than 0.1. Three instances are retained because the ratio of the intersection area of their bounding boxes to the instance bounding box area is less than 0.1. Two instances have non-intersecting bounding boxes.


Set Bounding Box Overlap Ratio Threshold to 0.1 and clear Whether to Filter Larger Instances. View the instance filtering process in the log. For 9 instances, the ratio of the intersection area of the bounding box to the instance bounding box area is greater than 0.1, but 2 of them are retained because their bounding box areas are smaller than the intersecting instances. Therefore, 7 instances are filtered out. Three instances are retained because the ratio of the intersection area of their bounding boxes to the instance bounding box area is less than 0.1. Two instances have non-intersecting bounding boxes.


1.4.12 [Master] Filter Instances with Concave/Convex Masks Based on Mask / Mask Bounding Polygon Area Ratio
- Function Introduction
Calculate the area ratio of the mask to the circumscribed polygon of the mask. If it is smaller than the set threshold, the instance is filtered out.
- Use Case
Applicable when the Target Object mask has jagged or concave/convex shapes.
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Area Ratio Threshold | Mask / convex hull area ratio threshold. If it is less than the set threshold, the instance is filtered out. | 0.1 | [0,1] |
1.4.13 [Master] Filter Based on Point Cloud Average Distance
- Function Introduction
Filter based on the average distance from points in the point cloud to the fitted plane, removing uneven instance point clouds
- Use Case
Applicable to scenarios where planar Target Object point clouds are bent
- Parameter Description
| Parameter | Description | Default Value | Range | Unit |
|---|---|---|---|---|
| Plane Segmentation Distance Threshold (mm) | Extract a plane from the bent instance point cloud. Points whose distance to the plane is less than this threshold are regarded as points on the plane. | 10 | [-1000, 1000] | mm |
| Average Distance Threshold (mm) | Average distance from points in the instance point cloud to the extracted plane | 20 | [-1000, 1000] | mm |
| Remove Instances Whose Average Distance Is Smaller Than the Threshold | If selected, instances whose average distance from points to the extracted plane is less than the average distance threshold are filtered out. If cleared, instances whose average distance from points to the extracted plane is greater than the average distance threshold are filtered out. | Cleared | / | / |
1.4.14 [Master] Filter Occluded Instances Based on Mask / Bounding Box Area Ratio
- Function Introduction
Calculate the area ratio of the mask to the bounding box. Instances whose ratios are outside the minimum and maximum range are filtered out.
- Use Case
Used to filter instances of occluded Target Objects
- Parameter Description
| Parameter | Description | Default Value | Range |
|---|---|---|---|
| Minimum Area Ratio | Lower limit of the mask / bounding box area ratio range. The smaller the ratio, the more severely the instance is occluded. | 0.1 | [0,1] |
| Maximum Area Ratio | Upper limit of the mask / bounding box area ratio range. The closer the ratio is to 1, the less the instance is occluded. | 1.0 | [0,1] |
1.4.15 [Master] Determine if Top-Layer Instances are fully detected
- Function Introduction
One of the foolproof mechanisms. It determines whether all instances on the topmost layer have been fully detected. If there are top-layer instances that have not been detected, an error is reported and the workflow ends.
- Use Case
Applicable to scenarios where one image is taken and multiple picks are performed, or where picking must be done in sequence, to prevent missed picks from affecting subsequent operations due to incomplete instance detection
- Parameter Description
| Parameter | Description | Default Value | Range | Unit | Parameter Tuning |
|---|---|---|---|---|---|
| Distance Threshold | Used to determine the topmost Target Object. If the distance between a point and the highest point of the Target Object point cloud is smaller than the distance threshold, the point is regarded as belonging to the topmost point cloud. Otherwise, it is regarded as not belonging to the topmost point cloud. | 5 | [0.1, 1000] | mm | It should be smaller than the height of the Target Object |
1.5 Instance Sequence
- Function Introduction
Group, sort, and extract instances according to the selected strategy
- Use Case
Common to depalletizing, random picking, and ordered loading/unloading scenarios
If sorting is not required, there is no need to configure a specific strategy.
1.5.1 Base Coords
- Function Introduction
Set a unified coordinate system for all instances to group and sort instances
- Use Case
Common to depalletizing, random picking, and ordered loading/unloading scenarios
A reference coordinate system should be set before using coordinate-related strategies
- Parameter Description
| Parameter | Description | Illustration |
|---|---|---|
| Camera Coords | The origin of the coordinate system is above the object, and the positive Z-axis points downward; XYZ values are the values of the object center point in this coordinate system. | ![]() |
| ROI Coords | The origin of the coordinate system is approximately at the center of the stack, and the positive Z-axis points upward; XYZ values are the values of the object center point in this coordinate system. | ![]() |
| Robot Coords | The origin of the coordinate system is on the Robot itself, and the positive Z-axis generally points upward; XYZ values are the values of the object center point in this coordinate system. | ![]() |
| Pixel Coords | The origin of the coordinate system is at the top-left vertex of the RGB image, and it is a 2D plane coordinate system; X and Y are the x value and y value of the bbox detection box, and Z is 0. | ![]() |
1.5.2 Picking Strategy
- Parameter Description
| Parameter | Description | Default Value |
|---|---|---|
| Strategy | Select which value is used for grouping and sorting, and how to sort. Multiple criteria can be stacked, including XYZ coordinates of the instance point cloud center, bounding box aspect ratio, distance between the instance point cloud center and the ROI center, and so on. They are executed in sequence. | Instance Point Cloud Center X Coordinate from Small to Large (mm) |
| Grouping Step Size | According to the selected strategy, instances are divided into several groups based on the step size. The grouping step size is the interval between two groups. For example, if the strategy is "Instance Point Cloud Center Z Coordinate from Large to Small (mm)", then the Z coordinates of all instance point cloud centers are first sorted from large to small and then grouped by the step size. The corresponding instances are also divided into several groups. | / |
| Extract First Several Groups | After grouping and sorting, how many groups of instances need to be retained | 10000 |
| Strategy Name | Description | Grouping Step Size | Extract First Several Groups | |
|---|---|---|---|---|
| Default Value | Range | Default Value | ||
| Instance Point Cloud Center XYZ Coordinate from Large to Small / from Small to Large (mm) | Use the XYZ coordinates of each instance point cloud center for grouping and sorting The reference coordinate system should be set before using this strategy for sorting | 200.000 | (0, 10000000] | 10000 |
| From the Middle to Both Sides of the XY Coordinate Axis of the Instance Point Cloud Center / from Both Sides to the Middle of the XY Coordinate Axis of the Instance Point Cloud Center (mm) | Use the XY coordinate values of each instance point cloud center and perform grouping and sorting in the direction of "from the middle to both sides" or "from both sides to the middle" The reference coordinate system should be set before using this strategy for sorting | 200.000 | (0, 10000000] | 10000 |
| Bounding Box Aspect Ratio from Large to Small / from Small to Large | Use the ratio of the longer side to the shorter side of the bounding box for grouping and sorting | 1 | (0, 10000] | 10000 |
| Mask Area from Large to Small / from Small to Large | Use the mask area of each instance for grouping and sorting | 10000 | [1, 10000000] | 10000 |
| Distance from the Instance Point Cloud Center to the ROI Center from Near to Far / from Far to Near (mm) | Use the distance from each instance point cloud center to the center of the ROI coordinate system for grouping and sorting | 200.000 | (0, 10000] | 10000 |
- Example
1.5.3 Custom Grasping Strategy
(1) Function Description
Switch Grasping Strategy to Custom Grasping Strategy, then click Add to add a custom grasping strategy.
Customize the picking order for each Target Object. If it is difficult to achieve picking with the General Grasping Strategy, or if it is difficult to tune suitable parameters because of point cloud noise and other issues, you can consider using the Custom Grasping Strategy.
The Custom Grasping Strategy is applicable to depalletizing and ordered loading/unloading scenarios, but not to random picking scenarios, because the Target Objects used with the Custom Grasping Strategy must be ordered (that is, the order of the Target Objects is fixed).
The Custom Grasping Strategy can only be combined with a single General Grasping Strategy, and the strategy can only be selected as Z coordinate from small to large.
(2) Parameter Description
| Parameter | Description | Default Value | Range | Parameter Tuning |
|---|---|---|---|---|
| IOU Threshold | Represents the overlap threshold between the annotated bbox and the detected bbox. The overlap is used to determine which image's sorting method should be selected for the current Target Object instance sorting. | 0.7 | [0,1] | The larger the threshold, the stricter the matching and the worse the anti-interference capability. Minor shape or position changes may lead to matching failure, or the wrong custom strategy may be matched, resulting in sorting in the wrong order. |
| Pixel Distance Threshold | Represents the size difference between the matchable bbox and the detected bbox. | 100 | [0,1000] | The smaller the threshold, the stricter the matching and the better the anti-interference capability. If the placement of Target Objects between different layers is relatively similar, the wrong custom strategy may still be matched, resulting in an incorrect sorting order. |
(3) Select the Reference Coordinate System
When using the Custom Grasping Strategy, only the camera coordinate system or the pixel coordinate system can be selected
If there are multiple layers of Target Objects, select the camera coordinate system; if there is only one layer of Target Objects, select the pixel coordinate system
(4) Strategy, Grouping Step Size, Extract First Several Groups
| Parameter | Description | Default Value |
|---|---|---|
| Strategy | Only Instance Point Cloud Center Z Coordinate from Large to Small / from Small to Large (mm) can be selected | / |
| Grouping Step Size | According to the strategy of sorting Z coordinates from small to large, the Z coordinates of instances are sorted from small to large, and the instances are divided into several groups according to the step size | 10000 |
| Extract First Several Groups | After grouping and sorting, how many groups of instances need to be retained | 10000 |
(5) Take Photo / Add Local Image
Click Take Photo to acquire an image from the currently connected camera, or click Add Local Image to import an image locally. For each layer or each different placement form of Target Objects, you need to take a photo or add a local image to obtain one corresponding image. If every layer is the same, only one image is needed. Right-click the image to delete it.
On the acquired image, click and hold the left mouse button while dragging to annotate a bbox. The DELETE key can be used to delete annotated bboxes one by one.
2. 3D Computation

2.1 Preprocessing
The Preprocessing step for 3D computation processes the 3D point cloud before performing pose estimation on instances and generating grasp points. The Planar Workpiece Ordered Loading/Unloading (Parallelized) scene does not require 3D point cloud processing.
2.2 Point Cloud Matching Pose Estimation
2.2.1 Template File Path

- Function
Upload the point cloud template to match against the instance point cloud from the scene
- Applicable Scenarios
Planar Workpiece Ordered Loading/Unloading (Parallelized) scene
- Parameter Tuning Instructions
The point cloud template must be created using the template generation script. The procedure is as follows:
- Copy the Scene's 2D Image, Depth Map, and Point Cloud
(1)Select the timestamp of the historical data to copy
Select a timestamp folder from the PickLight historical data folder (e.g. /home/xxx/PickLight/20240718201333036), and copy its full path for later use.
(2) Download the template generation script
Note:
The template generation script must be downloaded in the version corresponding to the software version; otherwise, version incompatibility will cause point cloud template generation to fail.
The download directory for the template generation script must not contain Chinese characters or special characters. It is recommended to store it in the default download directory C:
\Users\dex\Downloads
- Template Generation Script
import argparse
import json
import math
import os
import shutil
from copy import deepcopy
import re
import cv2
import numpy as np
import open3d as o3d
from tqdm import tqdm
from PickLight.Utils.Convertor import generate_mask_from_points
from PickLight.Utils.Convertor import get_ratio_from_mask
from PickLight.Utils.Utility import FileOperation
try:
import glia
if not glia.__version__ >= "0.2.4":
raise RuntimeError("请将 glia 版本升级到 0.2.4 或者更高. 目前版本为 {}".format(glia.__version__))
from glia.dl.models.superglue import SuperGlueMatcher
except ImportError as e:
print(f"警告: {e}")
SuperGlueMatcher = None
try:
import torch
except ImportError:
torch = None
def file_transfer(args):
"""文件复制功能,从两个脚本合并而来"""
input_dir = args.data_dir
output_dir = args.output_dir
os.makedirs(output_dir, exist_ok=True)
# RGB+D 图像
try:
for root, dirs, files in os.walk(input_dir):
target_path = os.path.join(root, 'Builder', 'foreground', 'input')
if os.path.exists(target_path):
for file in os.listdir(target_path):
if file.endswith('.png') or file.endswith('.tiff'):
full_file_path = os.path.join(target_path, file)
shutil.copy(full_file_path, output_dir)
print(f'Copied: {file}')
except Exception as e:
raise ValueError(f"检查{input_dir}路径下是否存在Builder/foreground/input文件夹, {e}")
# PCD 点云文件
try:
for root, dirs, files in os.walk(input_dir):
target_path = os.path.join(root, 'Builder', 'foreground', 'output')
if os.path.exists(target_path):
for file in os.listdir(target_path):
if file.endswith('.ply'):
full_file_path = os.path.join(target_path, file)
shutil.copy(full_file_path, output_dir)
print(f'Copied: {file}')
except Exception as e:
# 尝试第二个脚本的路径
try:
for root, dirs, files in os.walk(input_dir):
target_path = os.path.join(root, 'Builder', 'foreground', 'input')
if os.path.exists(target_path):
for file in os.listdir(target_path):
if file.endswith('.ply'):
full_file_path = os.path.join(target_path, file)
shutil.copy(full_file_path, output_dir)
print(f'Copied: {file}')
except Exception as e2:
raise ValueError(f"检查{input_dir}路径下是否存在Builder/foreground/output或input文件夹, {e2}")
# JSON 配置文件
try:
for root, dirs, files in os.walk(input_dir):
target_path = os.path.join(root, 'Builder', 'foreground', 'input')
if os.path.exists(target_path):
for file in os.listdir(target_path):
# 兼容两个脚本的不同要求
if file.endswith('.json') and (args.type in ['superglue', 'both'] or 'camera_param' in file):
full_file_path = os.path.join(target_path, file)
shutil.copy(full_file_path, output_dir)
print(f'Copied: {file}')
except Exception as e:
raise ValueError(f"检查{input_dir}路径下是否存在JSON文件, {e}")
def _list_indices_by_json(dir_path):
"""列出目录中已有的 model_info_{i}.json 的索引列表"""
if not os.path.isdir(dir_path):
return []
pat = re.compile(r"model_info_(\d+)\.json$")
idxs = []
for f in os.listdir(dir_path):
m = pat.match(f)
if m:
idxs.append(int(m.group(1)))
return sorted(idxs)
def _get_next_index(dir_path):
"""返回可用的下一个索引"""
idxs = _list_indices_by_json(dir_path)
if not idxs:
return 0
return max(idxs) + 1
def _ensure_template_id_in_dir(dir_path, template_id_value):
"""为目录内所有 model_info_{i}.json 写入 template_id(若缺失则补齐)"""
for i in _list_indices_by_json(dir_path):
p = os.path.join(dir_path, f"model_info_{i}.json")
try:
d = json.load(open(p, "r", encoding="utf-8"))
except Exception:
continue
if "template_id" not in d:
d["template_id"] = int(template_id_value)
with open(p, "w", encoding="utf-8") as fw:
json.dump(d, fw, ensure_ascii=False, indent=4)
def _is_superglue_template_dir(dir_path):
"""粗略校验是否为superglue模板目录"""
if not os.path.isdir(dir_path):
return False
js = _list_indices_by_json(dir_path)
if not js:
return False
# 至少有 temp_0.png 或 depth_model_0.tiff
has_temp = any(os.path.exists(os.path.join(dir_path, f"temp_{i}.png")) for i in js)
has_depth = any(os.path.exists(os.path.join(dir_path, f"depth_model_{i}.tiff")) for i in js)
return has_temp or has_depth
def _copy_one_template_block(src_dir, src_idx, dst_dir, dst_idx, template_id):
"""将一套 index 对应的模板文件从src复制到dst,并改名为新索引;同时写入 template_id"""
patterns = [
("temp_{i}.png", "temp_{j}.png"),
("depth_model_{i}.tiff", "depth_model_{j}.tiff"),
("model_{i}.ply", "model_{j}.ply"),
("model_{i}_downsampled.ply", "model_{j}_downsampled.ply"),
("keypoints_{i}.ply", "keypoints_{j}.ply"),
("temp_vis_{i}.jpg", "temp_vis_{j}.jpg"),
("depth_mask_uint8_{i}.png", "depth_mask_uint8_{j}.png"),
]
os.makedirs(dst_dir, exist_ok=True)
for src_pat, dst_pat in patterns:
s = os.path.join(src_dir, src_pat.format(i=src_idx))
d = os.path.join(dst_dir, dst_pat.format(j=dst_idx))
if os.path.exists(s):
shutil.copy(s, d)
# 处理 model_info
src_info = os.path.join(src_dir, f"model_info_{src_idx}.json")
dst_info = os.path.join(dst_dir, f"model_info_{dst_idx}.json")
if os.path.exists(src_info):
try:
data = json.load(open(src_info, "r", encoding="utf-8"))
except Exception:
data = {}
data["template_id"] = int(template_id)
with open(dst_info, "w", encoding="utf-8") as f:
json.dump(data, f, ensure_ascii=False, indent=4)
def merge_superglue_folders(target_dir, source_dirs):
"""功能1:合并多个superglue模板目录到target_dir,顺序追加并写入template_id"""
if not _is_superglue_template_dir(target_dir):
raise RuntimeError(f"目标路径不是有效的superglue模板目录: {target_dir}")
for s in source_dirs:
if not _is_superglue_template_dir(s):
raise RuntimeError(f"来源路径不是有效的superglue模板目录: {s}")
# 目标已有模板写入 template_id=0(若缺失)
_ensure_template_id_in_dir(target_dir, 0)
next_idx = _get_next_index(target_dir)
# 从1开始给后续目录分配 template_id
for folder_tid, src in enumerate(source_dirs, start=1):
src_indices = _list_indices_by_json(src)
for i in src_indices:
_copy_one_template_block(src, i, target_dir, next_idx, folder_tid)
next_idx += 1
print(f"合并完成,输出在: {target_dir}")
def _detect_group_dirs(root_dir):
"""功能2:检测 root_dir 下是否存在多组数据(按子目录作为一组)"""
group_dirs = []
for name in sorted(os.listdir(root_dir)):
p = os.path.join(root_dir, name)
if not os.path.isdir(p):
continue
# 需要同时包含 ply + 任一rgb + tiff + json(camera_param)
has_ply = any(fn.endswith(".ply") for fn in os.listdir(p))
has_img = any(fn.lower().endswith((".png", ".jpg", ".jpeg", ".bmp")) for fn in os.listdir(p))
has_tiff = any(fn.lower().endswith(".tiff") for fn in os.listdir(p))
has_json = False
for fn in os.listdir(p):
if fn.lower().endswith(".json"):
try:
d = json.load(open(os.path.join(p, fn), "r", encoding="utf-8"))
if "camera_param" in d:
has_json = True
break
except Exception:
pass
if has_ply and has_img and has_tiff and has_json:
group_dirs.append(p)
return group_dirs
def search_depth_values(indices, depth_mask, search_radius=3):
"""
在附近搜索一个非零的深度值并填充到深度掩码中
参数:
- indices: 关键点的索引数组,形状为 (N, 2)
- depth_mask: 初始深度掩码,形状为 (H, W)
- search_radius: 搜索半径,默认为 3
返回:
- depth_values: 每个关键点搜索到的深度值
"""
depth_values = np.full(indices.shape[0], -1, dtype=np.float32)
for idx, (x, y) in enumerate(indices):
if depth_mask[y, x] == 0:
xmin, xmax = max(0, x - search_radius), min(depth_mask.shape[1], x + search_radius + 1)
ymin, ymax = max(0, y - search_radius), min(depth_mask.shape[0], y + search_radius + 1)
search_area = depth_mask[ymin:ymax, xmin:xmax]
non_zero = search_area[search_area != 0]
if non_zero.size > 0:
depth_values[idx] = non_zero[0]
else:
depth_values[idx] = depth_mask[y, x]
return depth_values
def rotate_pcd(pcd, angle, original_center=None):
"""旋转点云"""
if original_center is None:
original_center = pcd.get_center()
t_1 = np.array(
[[1, 0, 0, -original_center[0]], [0, 1, 0, -original_center[1]], [0, 0, 1, -original_center[2]], [0, 0, 0, 1]]
)
theta = np.radians(angle)
cos_theta = np.cos(theta)
sin_theta = np.sin(theta)
t_2 = np.array([[cos_theta, -sin_theta, 0, 0], [sin_theta, cos_theta, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
t_3 = np.array(
[[1, 0, 0, original_center[0]], [0, 1, 0, original_center[1]], [0, 0, 1, original_center[2]], [0, 0, 0, 1]]
)
transform_all = np.dot(t_3, np.dot(t_2, t_1))
pcd_final = deepcopy(pcd).transform(transform_all)
return pcd_final, transform_all
def project_pcd_to_rgb(
input_dir,
output_dir,
angle,
auto_scale,
mask_kernel_size,
edge_kernel_size,
index=1,
use_edge=False,
background_color=0,
foreground_color=255,
template_id=None,
base_index=0,
):
"""将点云投影到RGB图像(用于SuperGlue模板)"""
os.makedirs(output_dir, exist_ok=True)
# 使用 base_index 读取基准模板
cam_k_path = os.path.join(input_dir, f"model_info_{base_index}.json")
cam_k = json.load(open(cam_k_path))
cam_k = cam_k["camera_param"]
cam_k = np.asarray(cam_k).reshape(3, 3).astype(np.float32)
pcd_path = os.path.join(input_dir, f"model_{base_index}.ply")
pcd = o3d.io.read_point_cloud(pcd_path)
temp_img_path = os.path.join(input_dir, f"temp_{base_index}.png")
temp_img = cv2.imread(temp_img_path)
project_img = np.zeros(temp_img.shape, dtype=np.uint8)
project_depth = np.zeros(temp_img.shape[:2], dtype=np.float32)
aabb = pcd.get_axis_aligned_bounding_box()
aabb_center = (aabb.get_min_bound() + aabb.get_max_bound()) / 2
pcd_transformed, transform_all = rotate_pcd(pcd, angle, aabb_center)
o3d.io.write_point_cloud(os.path.join(output_dir, f"model_{index}.ply"), pcd_transformed)
f_json_data = open(os.path.join(output_dir, f"model_info_{index}.json"), "w+")
json_saved_data = {}
json_saved_data["camera_param"] = cam_k.tolist()
json_saved_data["transform"] = transform_all.tolist()
json_saved_data["angle"] = angle
json_saved_data["mask_kernel_size"] = mask_kernel_size
json_saved_data["edge_kernel_size"] = edge_kernel_size
json_saved_data["use_edge"] = use_edge
if template_id is not None:
json_saved_data["template_id"] = int(template_id)
json.dump(json_saved_data, f_json_data, indent=4)
f_json_data.close()
rvec = np.array([0.0, 0.0, 0.0])
tvec = np.array([0.0, 0.0, 0.0])
distortion_zeros = np.zeros((5, 1), dtype=np.float32)
pcd_np = np.array(pcd_transformed.points)
points_2d, _ = cv2.projectPoints(pcd_np, rvec, tvec, cam_k, distortion_zeros)
points_2d = points_2d.squeeze(1).reshape(-1, 2)
color_bgr = np.asarray(pcd_transformed.colors)[:, ::-1] * 255
for i, pt in enumerate(points_2d):
project_img[round(pt[1]), round(pt[0]), :] = color_bgr[i]
project_depth[round(pt[1]), round(pt[0])] = pcd_np[i][2]
mask = np.any(project_img != 0, axis=2)
project_img = cv2.cvtColor(project_img, cv2.COLOR_BGR2GRAY)
project_img[project_img == 0] = background_color
if use_edge:
project_img[mask] = foreground_color
project_img = cv2.morphologyEx(
project_img, cv2.MORPH_CLOSE, np.ones((mask_kernel_size, mask_kernel_size), np.uint8)
)
project_img = cv2.cvtColor(project_img, cv2.COLOR_GRAY2BGR)
cv2.imwrite(os.path.join(output_dir, f"depth_model_{index}.tiff"), project_depth)
cv2.imwrite(os.path.join(output_dir, f"temp_{index}.png"), project_img)
concat_img = cv2.hconcat([project_img, temp_img])
cv2.imwrite(os.path.join(output_dir, f"project_img_concat_{index}.jpg"), concat_img)
# 输出 keypoints.ply
if SuperGlueMatcher is not None and torch is not None:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
matching = SuperGlueMatcher().eval().to(device)
matching.set_edge_mode(use_edge)
matching.set_edge_kernel_size(edge_kernel_size)
matching.register(project_img, resolution=(project_img.shape[0], project_img.shape[1]))
keypoints_model_2d = matching.temp_data['keypoints0'][0].cpu().numpy().astype(np.int32)
Keypoint_3D_model = np.zeros((len(keypoints_model_2d), 3), dtype=np.float32)
depth_values = search_depth_values(keypoints_model_2d, project_depth, 5)
def get_rainbow_color(index, total_points):
hsv = np.array([[[int(255 * index / total_points), 255, 255]]], dtype=np.uint8)
bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)[0][0]
return tuple(map(int, bgr))
temp_vis = deepcopy(project_img)
for i in range(len(keypoints_model_2d)):
x, y = keypoints_model_2d[i]
z = depth_values[i]
if z == -1:
Keypoint_3D_model[i] = [0, 0, 0]
continue
Keypoint_3D_model[i] = np.linalg.inv(cam_k) @ (np.array([x, y, 1]) * z)
color = get_rainbow_color(i, len(keypoints_model_2d))
center = (int(round(x)), int(round(y)))
cv2.circle(temp_vis, center, radius=1, color=color, thickness=-1)
cv2.imwrite(os.path.join(output_dir, f"temp_vis_{index}.jpg"), temp_vis)
rotated_model_kpts = o3d.geometry.PointCloud()
rotated_model_kpts.points = o3d.utility.Vector3dVector(Keypoint_3D_model)
rotated_model_kpts.colors = o3d.utility.Vector3dVector([[0, 1, 0] for _ in range(Keypoint_3D_model.shape[0])])
o3d.io.write_point_cloud(os.path.join(output_dir, f"keypoints_{index}.ply"), rotated_model_kpts)
if torch.cuda.is_available():
torch.cuda.empty_cache()
return True
def generate_ism_template(args):
"""生成ISM模板(第一个脚本的功能)"""
data_dir = args.data_dir
rotation_range = args.rot_range
rotation_interval = args.rot_interval
rotation_range_xy = args.rot_range_xy
depth_range = args.depth_range
depth_interval = args.depth_interval
template_path = None
ply_path = None
pcd = None
# 查找所需文件
for f in os.listdir(data_dir):
template_path = os.path.join(data_dir, "template")
os.makedirs(template_path, exist_ok=True)
f_path = os.path.join(data_dir, f)
if f.endswith('.ply'):
ply_path = f_path
pcd = o3d.io.read_point_cloud(ply_path)
pcd.paint_uniform_color([0, 1.0, 0])
o3d.io.write_point_cloud(f"{template_path}/model.ply", pcd)
if f.endswith('.png') or f.endswith('.jpg') or f.endswith('.bmp'):
rgb = cv2.imread(f_path)
if len(rgb.shape) == 2:
rgb = cv2.cvtColor(rgb, cv2.COLOR_GRAY2RGB)
h, w, _ = rgb.shape
if f.endswith('.tiff'):
depth = cv2.imread(f_path, -1)
if f.endswith('.json'):
resource_manager = FileOperation.load_json(f_path)
cam_k = resource_manager["camera_param"]
cam_k = np.asarray(cam_k).reshape(3, 3).astype(np.float32)
# 检查文件是否存在
if ply_path is None:
raise ValueError(f"错误! 未在 {data_dir} 路径下找到 点云 文件, 支持后缀为 ply.\n")
if pcd is None:
raise ValueError(f"错误! {ply_path} 路径的点云文件读取失败.\n")
try:
rgb
except NameError:
raise ValueError(f"错误! 未能在 {data_dir} 路径下正确读取 彩色图像 文件, 支持后缀为 png/jpg/bmp.\n")
try:
depth
except NameError:
raise ValueError(f"错误! 未能在 {data_dir} 路径下正确读取 深度图像 文件, 支持后缀为 tiff.\n")
try:
cam_k
except NameError:
raise ValueError(f"错误! 未在 {data_dir} 路径下找到 配置参数 文件, 支持后缀为 json.\n")
# ISM prompt
pcd_center = pcd.get_center()
angles_z = range(-rotation_range, rotation_range, rotation_interval)
mask = generate_mask_from_points(np.array(pcd.points), cam_k, h, w, auto_scale=1)
# ISM prompt
mask_uint8 = cv2.normalize(mask, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
rgb_mask = deepcopy(rgb)
rgb_mask[mask == 0] = 0
# 将 mask 与 rgb 的前景质心平移到图像中心
coords = np.column_stack(np.where(mask_uint8 > 0))
if coords.size > 0:
cy, cx = coords.mean(axis=0)
center_x = w / 2.0
center_y = h / 2.0
dx = center_x - cx
dy = center_y - cy
T = np.float32([[1, 0, dx], [0, 1, dy]])
rgb_mask = cv2.warpAffine(
rgb_mask, T, (w, h), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT, borderValue=(0, 0, 0)
)
mask_uint8 = cv2.warpAffine(
mask_uint8, T, (w, h), flags=cv2.INTER_NEAREST, borderMode=cv2.BORDER_CONSTANT, borderValue=0
)
if template_path is not None:
center = (w / 2, h / 2)
new_w = int(math.sqrt(w**2 + h**2))
new_h = new_w
main_i = 0
for angle in tqdm(angles_z, desc="生成ISM旋转模板"):
M = cv2.getRotationMatrix2D(center, angle, 1.0)
M[0, 2] += (new_w - w) / 2
M[1, 2] += (new_h - h) / 2
# 生成rgb
rotated_rgb = cv2.warpAffine(
rgb_mask, M, (new_w, new_h), borderMode=cv2.BORDER_REPLICATE, borderValue=(127, 127, 127)
)
save_name_rgb = f"rgb_{main_i}.png"
save_path_rgb = os.path.join(template_path, save_name_rgb)
cv2.imwrite(save_path_rgb, rotated_rgb)
# 生成mask
rotated_mask = cv2.warpAffine(
mask_uint8, M, (new_w, new_h), borderMode=cv2.BORDER_REPLICATE, borderValue=(127, 127, 127)
)
save_name_mask = f"mask_{main_i}.png"
save_path_mask = os.path.join(template_path, save_name_mask)
cv2.imwrite(save_path_mask, rotated_mask)
main_i += 1
angles_xy = range(-rotation_range_xy, rotation_range_xy + 1, rotation_range_xy) if rotation_range_xy > 0 else [0]
# 设置深度变化范围 (mm to meters)
if depth_interval > 0:
depth_shifts = np.arange(-depth_range, depth_range + depth_interval, depth_interval) / 1000.0
else:
depth_shifts = [0]
geometric_features_model = []
variants_meta = []
i = 0
for angle_z in tqdm(angles_z, desc="生成ISM几何特征"):
for angle_x in angles_xy:
for angle_y in angles_xy:
pcd_rotated = deepcopy(pcd)
# Z轴旋转
Rz = pcd_rotated.get_rotation_matrix_from_xyz((0, 0, np.deg2rad(angle_z)))
pcd_rotated.rotate(Rz, center=pcd_center)
# X轴小角度旋转
if angle_x != 0:
Rx = pcd_rotated.get_rotation_matrix_from_xyz((np.deg2rad(angle_x), 0, 0))
pcd_rotated.rotate(Rx, center=pcd_center)
# Y轴小角度旋转
if angle_y != 0:
Ry = pcd_rotated.get_rotation_matrix_from_xyz((0, np.deg2rad(angle_y), 0))
pcd_rotated.rotate(Ry, center=pcd_center)
for depth_shift in depth_shifts:
pcd_final = deepcopy(pcd_rotated)
if depth_shift != 0:
pcd_final.translate((0, 0, depth_shift))
# 从旋转和平移后的点云生成mask
mask = generate_mask_from_points(np.array(pcd_final.points), cam_k, h, w, auto_scale=2)
mask_uint8 = cv2.normalize(mask, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
obb_ratio = get_ratio_from_mask(mask_uint8)
mask_area = int(np.count_nonzero(mask_uint8 > 0))
if obb_ratio > 0 and mask_area > 0:
geometric_features_model.append([float(obb_ratio), float(mask_area)])
variants_meta.append(
{
"idx": i,
"angle_z": angle_z,
"angle_x": angle_x,
"angle_y": angle_y,
"depth_shift_m": float(depth_shift),
"obb_ratio": float(obb_ratio),
"mask_area": mask_area,
}
)
i += 1
meta = {"geometric_features_model": geometric_features_model, "variants": variants_meta, "num_variants": i}
with open(os.path.join(template_path, "prompt_meta.json"), "w", encoding="utf-8") as f:
json.dump(meta, f, ensure_ascii=False, indent=2)
print(f"完成ISM模板生成: 元数据写入 {template_path}/prompt_meta.json")
return template_path
def generate_superglue_template(args, start_index=0, template_id=None):
"""生成SuperGlue模板(第二个脚本的功能)"""
if SuperGlueMatcher is None:
raise ImportError("无法导入SuperGlueMatcher,请确保glia版本正确")
if torch is None:
raise ImportError("无法导入torch,请安装PyTorch")
data_dir = args.data_dir
auto_scale = args.auto_scale
mask_kernel_size = args.mask_kernel_size
edge_kernel_size = args.edge_kernel_size
background_color = args.background_color
foreground_color = args.foreground_color
voxel_size = args.down_sample
ply_path = None
pcd = None
# 查找所需文件
for f in os.listdir(data_dir):
f_path = os.path.join(data_dir, f)
if f.endswith('.ply'):
ply_path = f_path
pcd = o3d.io.read_point_cloud(ply_path)
if f.endswith('.png') or f.endswith('.jpg') or f.endswith('.bmp'):
rgb = cv2.imread(f_path)
h, w = rgb.shape[0], rgb.shape[1]
if f.endswith('.tiff'):
depth = cv2.imread(f_path, -1)
if f.endswith('.json'):
resource_manager = json.load(open(f_path))
cam_k = resource_manager["camera_param"]
cam_k = np.asarray(cam_k).reshape(3, 3).astype(np.float32)
# 检查文件是否存在
if ply_path is None:
raise ValueError(f"错误! 未在 {data_dir} 路径下找到 点云 文件, 支持后缀为 ply.\n")
if pcd is None:
raise ValueError(f"错误! {ply_path} 路径的点云文件读取失败.\n")
try:
rgb
except NameError:
raise ValueError(f"错误! 未能在 {data_dir} 路径下正确读取 彩色图像 文件, 支持后缀为 png/jpg/bmp.\n")
try:
depth
except NameError:
raise ValueError(f"错误! 未能在 {data_dir} 路径下正确读取 深度图像 文件, 支持后缀为 tiff.\n")
try:
cam_k
except NameError:
raise ValueError(f"错误! 未在 {data_dir} 路径下找到 配置参数 文件, 支持后缀为 json.\n")
mask = generate_mask_from_points(
np.array(pcd.points), cam_k, h, w, kernel_size=mask_kernel_size, auto_scale=auto_scale
)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((mask_kernel_size, mask_kernel_size), np.uint8))
if rgb.shape[-1] == 3:
rgb = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
rgb_mask = deepcopy(rgb)
rgb_mask[mask == 0] = background_color
if args.use_edge:
rgb_mask[mask != 0] = foreground_color
depth_mask = deepcopy(depth)
# 裁剪图像
expand_pixels = 10
coords = cv2.findNonZero(mask)
if coords is None or len(coords) == 0:
raise ValueError("mask 中没有非零点,无法进行裁剪")
x_bbox, y_bbox, w_bbox, h_bbox = cv2.boundingRect(coords)
x_expanded = max(0, x_bbox - expand_pixels)
y_expanded = max(0, y_bbox - expand_pixels)
w_expanded = min(w - x_expanded, w_bbox + 2 * expand_pixels)
h_expanded = min(h - y_expanded, h_bbox + 2 * expand_pixels)
rgb_mask = rgb_mask[y_expanded : y_expanded + h_expanded, x_expanded : x_expanded + w_expanded]
depth_mask = depth_mask[y_expanded : y_expanded + h_expanded, x_expanded : x_expanded + w_expanded]
diameter = math.ceil(np.sqrt(w_expanded * w_expanded + h_expanded * h_expanded))
x_pad = (diameter - w_expanded) // 2
y_pad = (diameter - h_expanded) // 2
rgb_mask_paded = np.zeros([diameter, diameter], dtype=np.uint8)
depth_mask_paded = np.zeros([diameter, diameter], dtype=np.float32)
rgb_mask_paded[y_pad : y_pad + h_expanded, x_pad : x_pad + w_expanded] = rgb_mask
depth_mask_paded[y_pad : y_pad + h_expanded, x_pad : x_pad + w_expanded] = depth_mask
rgb_mask = deepcopy(rgb_mask_paded)
depth_mask = deepcopy(depth_mask_paded)
# 更新相机内参矩阵
cam_k[0, 2] = cam_k[0, 2] - x_expanded + x_pad
cam_k[1, 2] = cam_k[1, 2] - y_expanded + y_pad
# 统一输出目录:优先 args.output_superglue_dir,否则 data_dir/superglue
superglue_path = getattr(args, "output_superglue_dir", None) or os.path.join(data_dir, "superglue")
os.makedirs(superglue_path, exist_ok=True)
index = int(start_index)
base_index = index
use_depth = args.use_depth
use_edge = args.use_edge
# 输出 model.ply
o3d.io.write_point_cloud(f"{superglue_path}/model_{index}.ply", pcd)
if voxel_size is not None:
pcd_down_sampled = deepcopy(pcd).voxel_down_sample(voxel_size=voxel_size)
o3d.io.write_point_cloud(f"{superglue_path}/model_{index}_downsampled.ply", pcd_down_sampled)
# 注册模版输出 temp.png keypoints.ply
temp = deepcopy(rgb_mask)
if use_depth:
temp = deepcopy(depth_mask)
temp = cv2.normalize(temp, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imwrite(f"{superglue_path}/depth_mask_uint8_{index}.png", temp)
cv2.imwrite(f"{superglue_path}/depth_model_{index}.tiff", depth_mask)
cv2.imwrite(f"{superglue_path}/temp_{index}.png", temp)
# 输出 keypoints.ply
device = 'cuda' if torch.cuda.is_available() else 'cpu'
matching = SuperGlueMatcher().eval().to(device)
if use_edge:
matching.set_edge_mode(True)
matching.set_edge_kernel_size(edge_kernel_size)
matching.register(temp, resolution=(temp.shape[0], temp.shape[1]))
keypoints_model_2d = matching.temp_data['keypoints0'][0].cpu().numpy().astype(np.int32)
Keypoint_3D_model = np.zeros((len(keypoints_model_2d), 3), dtype=np.float32)
depth_values = search_depth_values(keypoints_model_2d, depth_mask, 5)
temp_vis = deepcopy(rgb_mask)
temp_vis = cv2.cvtColor(temp_vis, cv2.COLOR_GRAY2BGR)
def get_rainbow_color(index, total_points):
hsv = np.array([[[int(255 * index / total_points), 255, 255]]], dtype=np.uint8)
bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)[0][0]
return tuple(map(int, bgr))
for i in range(len(keypoints_model_2d)):
x, y = keypoints_model_2d[i]
z = depth_values[i]
if z == -1:
Keypoint_3D_model[i] = [0, 0, 0]
continue
Keypoint_3D_model[i] = np.linalg.inv(cam_k) @ (np.array([x, y, 1]) * z)
color = get_rainbow_color(i, len(keypoints_model_2d))
center = (int(round(x)), int(round(y)))
cv2.circle(temp_vis, center, radius=1, color=color, thickness=-1)
cv2.imwrite(f"{superglue_path}/temp_vis_{index}.jpg", temp_vis)
pcd_model_keypoints = o3d.geometry.PointCloud()
pcd_model_keypoints.points = o3d.utility.Vector3dVector(Keypoint_3D_model)
pcd_model_keypoints.colors = o3d.utility.Vector3dVector([[0, 1, 0] for _ in range(Keypoint_3D_model.shape[0])])
o3d.io.write_point_cloud(f"{superglue_path}/keypoints_{index}.ply", pcd_model_keypoints)
f_json_data = open(os.path.join(superglue_path, f"model_info_{index}.json"), "w+")
json_saved_data = {}
json_saved_data["camera_param"] = cam_k.tolist()
json_saved_data["transform"] = np.eye(4).tolist()
json_saved_data["angle"] = 0.0
json_saved_data["mask_kernel_size"] = mask_kernel_size
json_saved_data["edge_kernel_size"] = edge_kernel_size
json_saved_data["use_edge"] = args.use_edge
if template_id is not None:
json_saved_data["template_id"] = int(template_id)
json.dump(json_saved_data, f_json_data, indent=4)
f_json_data.close()
if args.multi_temp:
for k, angle in enumerate(args.angle):
project_pcd_to_rgb(
superglue_path,
superglue_path,
angle,
args.auto_scale,
mask_kernel_size,
edge_kernel_size,
index=base_index + 1 + k,
use_edge=args.use_edge,
background_color=background_color,
foreground_color=foreground_color,
template_id=template_id,
base_index=base_index,
)
# 组别的第0个模版点云另存为 multi_model_{template_id}.ply(若启用多模版)
if template_id is not None:
src = os.path.join(superglue_path, f"model_{base_index}.ply")
dst = os.path.join(superglue_path, f"multi_model_{template_id}.ply")
if os.path.exists(src):
shutil.copy(src, dst)
if voxel_size is not None:
# 生成 multi_model_{template_id}_downsampled.ply
pcd0 = o3d.io.read_point_cloud(src)
pcd0_ds = deepcopy(pcd0).voxel_down_sample(voxel_size=voxel_size)
o3d.io.write_point_cloud(
os.path.join(superglue_path, f"multi_model_{template_id}_downsampled.ply"), pcd0_ds
)
if torch.cuda.is_available():
torch.cuda.empty_cache()
print(f"完成SuperGlue模板生成: 文件保存在 {superglue_path}")
# 返回下一个可用索引(使用 base_index 计算)
end_next_index = base_index + (len(args.angle) + 1 if args.multi_temp else 1)
return superglue_path, end_next_index
def main():
parser = argparse.ArgumentParser(description="融合脚本:生成ISM和/或SuperGlue模板")
parser.add_argument(
"--output_superglue_dir",
type=str,
default=None,
help="指定SuperGlue模板统一输出目录;不指定则写入 data_dir/superglue",
)
# 基础参数
parser.add_argument("data_dir", help="输入数据目录")
parser.add_argument(
"--type",
choices=["ism", "superglue", "both"],
default="both",
help="选择生成的模板类型: ism, superglue, 或 both",
)
parser.add_argument("--file_transfer", action="store_true", default=False, help="执行文件复制操作")
parser.add_argument("--output_dir", help="文件复制时的输出目录")
# ISM参数
parser.add_argument("--rot_range", type=int, default=60, help="Z-axis rotation range (degrees)")
parser.add_argument("--rot_interval", type=int, default=10, help="Z-axis rotation interval (degrees)")
parser.add_argument("--rot_range_xy", type=int, default=5, help="XY small rotation range (degrees)")
parser.add_argument("--depth_range", type=int, default=200, help="Z translation range (mm)")
parser.add_argument("--depth_interval", type=int, default=200, help="Z translation interval (mm)")
# SuperGlue参数
parser.add_argument("--auto_scale", type=float, default=0.1, help="点云到掩码的缩放比例")
parser.add_argument("--use_depth", action="store_true", default=False, help="使用深度图")
parser.add_argument("--use_edge", action="store_true", default=False, help="使用边缘模式")
parser.add_argument("--edge_kernel_size", type=int, default=3, help="边缘核大小")
parser.add_argument("--mask_kernel_size", type=int, default=5, help="形态学操作核大小")
parser.add_argument("--multi_temp", action="store_true", default=False, help="生成多角度模板")
parser.add_argument("--angle", type=float, nargs="+", default=[90, 180, 270], help="旋转角度")
parser.add_argument("--down_sample", type=float, default=None, help="点云下采样体素大小(米), 设置为None禁用")
parser.add_argument("--background_color", type=int, default=0, help="边缘模式背景颜色")
parser.add_argument("--foreground_color", type=int, default=255, help="边缘模式前景颜色")
# 新增:多模版模式
parser.add_argument("--multi_template", action="store_true", default=False, help="启用SuperGlue多模版模式")
parser.add_argument(
"--superglue_dirs",
nargs="+",
default=[],
help="多个superglue模板目录(功能1:合并多个已生成的superglue模板目录)",
)
args = parser.parse_args()
# 执行文件复制
if args.file_transfer:
if not args.output_dir:
raise ValueError("执行文件复制时需要指定 --output_dir 参数")
print("开始文件复制...")
file_transfer(args)
print(f"文件复制完成,文件保存在: {args.output_dir}")
return
# 生成ISM
arg_ism = deepcopy(args)
if args.type in ["ism", "both"]:
def _first_subdir(root: str) -> str:
try:
subs = sorted([e.path for e in os.scandir(root) if e.is_dir()])
return subs[0] if subs else root
except Exception:
return root
if getattr(args, "multi_template", False):
arg_ism.data_dir = _first_subdir(arg_ism.data_dir)
print("开始生成ISM模板...")
try:
ism_path = generate_ism_template(arg_ism)
print(f"ISM模板生成完成,保存在: {ism_path}")
except Exception as e:
print(f"生成ISM模板时出错: {e}")
if arg_ism.type == "ism":
raise
# 生成/处理SuperGlue
if args.type in ["superglue", "both"]:
print("开始生成/处理SuperGlue模板...")
try:
if args.multi_template:
# 功能1:合并多个superglue目录
if len(args.superglue_dirs) >= 2:
target_dir = args.superglue_dirs[0]
source_dirs = args.superglue_dirs[1:]
merge_superglue_folders(target_dir, source_dirs)
print(f"SuperGlue多目录合并完成,输出: {target_dir}")
else:
# 功能2:单目录多组 -> 统一写入一个目录,并从末尾连续编号
group_dirs = _detect_group_dirs(args.data_dir)
if len(group_dirs) >= 1:
out_dir = args.output_superglue_dir or os.path.join(args.data_dir, "superglue")
os.makedirs(out_dir, exist_ok=True)
start_idx = _get_next_index(out_dir)
next_idx = start_idx
for gid, gdir in enumerate(group_dirs):
args_one = deepcopy(args)
args_one.data_dir = gdir
args_one.output_superglue_dir = out_dir # 统一输出目录
_, next_idx = generate_superglue_template(args_one, start_index=next_idx, template_id=gid)
print(f"SuperGlue多组生成完成,输出: {out_dir}")
else:
# 常规单组生成,但写入 template_id=0,也支持统一输出目录
out_dir = args.output_superglue_dir or os.path.join(args.data_dir, "superglue")
os.makedirs(out_dir, exist_ok=True)
_, _ = generate_superglue_template(
args,
start_index=_get_next_index(out_dir),
template_id=0,
)
else:
# 原有逻辑:单组生成(不写template_id字段)
_ = generate_superglue_template(args) # 返回值未使用
print("SuperGlue模板生成完成")
except Exception as e:
print(f"生成/处理SuperGlue模板时出错: {e}")
if args.type == "superglue":
raise
if args.type == "both":
print("所有模板生成完成!")
print(f"ISM模板保存在: {os.path.join(arg_ism.data_dir, 'template')}")
print(f"SuperGlue模板保存在: {os.path.join(args.data_dir, 'superglue')}")
if __name__ == "__main__":
main()import argparse
import json
import math
import os
import shutil
from copy import deepcopy
import re
import cv2
import numpy as np
import open3d as o3d
from tqdm import tqdm
from PickLight.Utils.Convertor import generate_mask_from_points
from PickLight.Utils.Convertor import get_ratio_from_mask
from PickLight.Utils.Utility import FileOperation
try:
import glia
if not glia.__version__ >= "0.2.4":
raise RuntimeError("请将 glia 版本升级到 0.2.4 或者更高. 目前版本为 {}".format(glia.__version__))
from glia.dl.models.superglue import SuperGlueMatcher
except ImportError as e:
print(f"警告: {e}")
SuperGlueMatcher = None
try:
import torch
except ImportError:
torch = None
def file_transfer(args):
"""文件复制功能,从两个脚本合并而来"""
input_dir = args.data_dir
output_dir = args.output_dir
os.makedirs(output_dir, exist_ok=True)
# RGB+D 图像
try:
for root, dirs, files in os.walk(input_dir):
target_path = os.path.join(root, 'Builder', 'foreground', 'input')
if os.path.exists(target_path):
for file in os.listdir(target_path):
if file.endswith('.png') or file.endswith('.tiff'):
full_file_path = os.path.join(target_path, file)
shutil.copy(full_file_path, output_dir)
print(f'Copied: {file}')
except Exception as e:
raise ValueError(f"检查{input_dir}路径下是否存在Builder/foreground/input文件夹, {e}")
# PCD 点云文件
try:
for root, dirs, files in os.walk(input_dir):
target_path = os.path.join(root, 'Builder', 'foreground', 'output')
if os.path.exists(target_path):
for file in os.listdir(target_path):
if file.endswith('.ply'):
full_file_path = os.path.join(target_path, file)
shutil.copy(full_file_path, output_dir)
print(f'Copied: {file}')
except Exception as e:
# 尝试第二个脚本的路径
try:
for root, dirs, files in os.walk(input_dir):
target_path = os.path.join(root, 'Builder', 'foreground', 'input')
if os.path.exists(target_path):
for file in os.listdir(target_path):
if file.endswith('.ply'):
full_file_path = os.path.join(target_path, file)
shutil.copy(full_file_path, output_dir)
print(f'Copied: {file}')
except Exception as e2:
raise ValueError(f"检查{input_dir}路径下是否存在Builder/foreground/output或input文件夹, {e2}")
# JSON 配置文件
try:
for root, dirs, files in os.walk(input_dir):
target_path = os.path.join(root, 'Builder', 'foreground', 'input')
if os.path.exists(target_path):
for file in os.listdir(target_path):
# 兼容两个脚本的不同要求
if file.endswith('.json') and (args.type in ['superglue', 'both'] or 'camera_param' in file):
full_file_path = os.path.join(target_path, file)
shutil.copy(full_file_path, output_dir)
print(f'Copied: {file}')
except Exception as e:
raise ValueError(f"检查{input_dir}路径下是否存在JSON文件, {e}")
def _list_indices_by_json(dir_path):
"""列出目录中已有的 model_info_{i}.json 的索引列表"""
if not os.path.isdir(dir_path):
return []
pat = re.compile(r"model_info_(\d+)\.json$")
idxs = []
for f in os.listdir(dir_path):
m = pat.match(f)
if m:
idxs.append(int(m.group(1)))
return sorted(idxs)
def _get_next_index(dir_path):
"""返回可用的下一个索引"""
idxs = _list_indices_by_json(dir_path)
if not idxs:
return 0
return max(idxs) + 1
def _ensure_template_id_in_dir(dir_path, template_id_value):
"""为目录内所有 model_info_{i}.json 写入 template_id(若缺失则补齐)"""
for i in _list_indices_by_json(dir_path):
p = os.path.join(dir_path, f"model_info_{i}.json")
try:
d = json.load(open(p, "r", encoding="utf-8"))
except Exception:
continue
if "template_id" not in d:
d["template_id"] = int(template_id_value)
with open(p, "w", encoding="utf-8") as fw:
json.dump(d, fw, ensure_ascii=False, indent=4)
def _is_superglue_template_dir(dir_path):
"""粗略校验是否为superglue模板目录"""
if not os.path.isdir(dir_path):
return False
js = _list_indices_by_json(dir_path)
if not js:
return False
# 至少有 temp_0.png 或 depth_model_0.tiff
has_temp = any(os.path.exists(os.path.join(dir_path, f"temp_{i}.png")) for i in js)
has_depth = any(os.path.exists(os.path.join(dir_path, f"depth_model_{i}.tiff")) for i in js)
return has_temp or has_depth
def _copy_one_template_block(src_dir, src_idx, dst_dir, dst_idx, template_id):
"""将一套 index 对应的模板文件从src复制到dst,并改名为新索引;同时写入 template_id"""
patterns = [
("temp_{i}.png", "temp_{j}.png"),
("depth_model_{i}.tiff", "depth_model_{j}.tiff"),
("model_{i}.ply", "model_{j}.ply"),
("model_{i}_downsampled.ply", "model_{j}_downsampled.ply"),
("keypoints_{i}.ply", "keypoints_{j}.ply"),
("temp_vis_{i}.jpg", "temp_vis_{j}.jpg"),
("depth_mask_uint8_{i}.png", "depth_mask_uint8_{j}.png"),
]
os.makedirs(dst_dir, exist_ok=True)
for src_pat, dst_pat in patterns:
s = os.path.join(src_dir, src_pat.format(i=src_idx))
d = os.path.join(dst_dir, dst_pat.format(j=dst_idx))
if os.path.exists(s):
shutil.copy(s, d)
# 处理 model_info
src_info = os.path.join(src_dir, f"model_info_{src_idx}.json")
dst_info = os.path.join(dst_dir, f"model_info_{dst_idx}.json")
if os.path.exists(src_info):
try:
data = json.load(open(src_info, "r", encoding="utf-8"))
except Exception:
data = {}
data["template_id"] = int(template_id)
with open(dst_info, "w", encoding="utf-8") as f:
json.dump(data, f, ensure_ascii=False, indent=4)
def merge_superglue_folders(target_dir, source_dirs):
"""功能1:合并多个superglue模板目录到target_dir,顺序追加并写入template_id"""
if not _is_superglue_template_dir(target_dir):
raise RuntimeError(f"目标路径不是有效的superglue模板目录: {target_dir}")
for s in source_dirs:
if not _is_superglue_template_dir(s):
raise RuntimeError(f"来源路径不是有效的superglue模板目录: {s}")
# 目标已有模板写入 template_id=0(若缺失)
_ensure_template_id_in_dir(target_dir, 0)
next_idx = _get_next_index(target_dir)
# 从1开始给后续目录分配 template_id
for folder_tid, src in enumerate(source_dirs, start=1):
src_indices = _list_indices_by_json(src)
for i in src_indices:
_copy_one_template_block(src, i, target_dir, next_idx, folder_tid)
next_idx += 1
print(f"合并完成,输出在: {target_dir}")
def _detect_group_dirs(root_dir):
"""功能2:检测 root_dir 下是否存在多组数据(按子目录作为一组)"""
group_dirs = []
for name in sorted(os.listdir(root_dir)):
p = os.path.join(root_dir, name)
if not os.path.isdir(p):
continue
# 需要同时包含 ply + 任一rgb + tiff + json(camera_param)
has_ply = any(fn.endswith(".ply") for fn in os.listdir(p))
has_img = any(fn.lower().endswith((".png", ".jpg", ".jpeg", ".bmp")) for fn in os.listdir(p))
has_tiff = any(fn.lower().endswith(".tiff") for fn in os.listdir(p))
has_json = False
for fn in os.listdir(p):
if fn.lower().endswith(".json"):
try:
d = json.load(open(os.path.join(p, fn), "r", encoding="utf-8"))
if "camera_param" in d:
has_json = True
break
except Exception:
pass
if has_ply and has_img and has_tiff and has_json:
group_dirs.append(p)
return group_dirs
def search_depth_values(indices, depth_mask, search_radius=3):
"""
在附近搜索一个非零的深度值并填充到深度掩码中
参数:
- indices: 关键点的索引数组,形状为 (N, 2)
- depth_mask: 初始深度掩码,形状为 (H, W)
- search_radius: 搜索半径,默认为 3
返回:
- depth_values: 每个关键点搜索到的深度值
"""
depth_values = np.full(indices.shape[0], -1, dtype=np.float32)
for idx, (x, y) in enumerate(indices):
if depth_mask[y, x] == 0:
xmin, xmax = max(0, x - search_radius), min(depth_mask.shape[1], x + search_radius + 1)
ymin, ymax = max(0, y - search_radius), min(depth_mask.shape[0], y + search_radius + 1)
search_area = depth_mask[ymin:ymax, xmin:xmax]
non_zero = search_area[search_area != 0]
if non_zero.size > 0:
depth_values[idx] = non_zero[0]
else:
depth_values[idx] = depth_mask[y, x]
return depth_values
def rotate_pcd(pcd, angle, original_center=None):
"""旋转点云"""
if original_center is None:
original_center = pcd.get_center()
t_1 = np.array(
[[1, 0, 0, -original_center[0]], [0, 1, 0, -original_center[1]], [0, 0, 1, -original_center[2]], [0, 0, 0, 1]]
)
theta = np.radians(angle)
cos_theta = np.cos(theta)
sin_theta = np.sin(theta)
t_2 = np.array([[cos_theta, -sin_theta, 0, 0], [sin_theta, cos_theta, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
t_3 = np.array(
[[1, 0, 0, original_center[0]], [0, 1, 0, original_center[1]], [0, 0, 1, original_center[2]], [0, 0, 0, 1]]
)
transform_all = np.dot(t_3, np.dot(t_2, t_1))
pcd_final = deepcopy(pcd).transform(transform_all)
return pcd_final, transform_all
def project_pcd_to_rgb(
input_dir,
output_dir,
angle,
auto_scale,
mask_kernel_size,
edge_kernel_size,
index=1,
use_edge=False,
background_color=0,
foreground_color=255,
template_id=None,
base_index=0,
):
"""将点云投影到RGB图像(用于SuperGlue模板)"""
os.makedirs(output_dir, exist_ok=True)
# 使用 base_index 读取基准模板
cam_k_path = os.path.join(input_dir, f"model_info_{base_index}.json")
cam_k = json.load(open(cam_k_path))
cam_k = cam_k["camera_param"]
cam_k = np.asarray(cam_k).reshape(3, 3).astype(np.float32)
pcd_path = os.path.join(input_dir, f"model_{base_index}.ply")
pcd = o3d.io.read_point_cloud(pcd_path)
temp_img_path = os.path.join(input_dir, f"temp_{base_index}.png")
temp_img = cv2.imread(temp_img_path)
project_img = np.zeros(temp_img.shape, dtype=np.uint8)
project_depth = np.zeros(temp_img.shape[:2], dtype=np.float32)
aabb = pcd.get_axis_aligned_bounding_box()
aabb_center = (aabb.get_min_bound() + aabb.get_max_bound()) / 2
pcd_transformed, transform_all = rotate_pcd(pcd, angle, aabb_center)
o3d.io.write_point_cloud(os.path.join(output_dir, f"model_{index}.ply"), pcd_transformed)
f_json_data = open(os.path.join(output_dir, f"model_info_{index}.json"), "w+")
json_saved_data = {}
json_saved_data["camera_param"] = cam_k.tolist()
json_saved_data["transform"] = transform_all.tolist()
json_saved_data["angle"] = angle
json_saved_data["mask_kernel_size"] = mask_kernel_size
json_saved_data["edge_kernel_size"] = edge_kernel_size
json_saved_data["use_edge"] = use_edge
if template_id is not None:
json_saved_data["template_id"] = int(template_id)
json.dump(json_saved_data, f_json_data, indent=4)
f_json_data.close()
rvec = np.array([0.0, 0.0, 0.0])
tvec = np.array([0.0, 0.0, 0.0])
distortion_zeros = np.zeros((5, 1), dtype=np.float32)
pcd_np = np.array(pcd_transformed.points)
points_2d, _ = cv2.projectPoints(pcd_np, rvec, tvec, cam_k, distortion_zeros)
points_2d = points_2d.squeeze(1).reshape(-1, 2)
color_bgr = np.asarray(pcd_transformed.colors)[:, ::-1] * 255
for i, pt in enumerate(points_2d):
project_img[round(pt[1]), round(pt[0]), :] = color_bgr[i]
project_depth[round(pt[1]), round(pt[0])] = pcd_np[i][2]
mask = np.any(project_img != 0, axis=2)
project_img = cv2.cvtColor(project_img, cv2.COLOR_BGR2GRAY)
project_img[project_img == 0] = background_color
if use_edge:
project_img[mask] = foreground_color
project_img = cv2.morphologyEx(
project_img, cv2.MORPH_CLOSE, np.ones((mask_kernel_size, mask_kernel_size), np.uint8)
)
project_img = cv2.cvtColor(project_img, cv2.COLOR_GRAY2BGR)
cv2.imwrite(os.path.join(output_dir, f"depth_model_{index}.tiff"), project_depth)
cv2.imwrite(os.path.join(output_dir, f"temp_{index}.png"), project_img)
concat_img = cv2.hconcat([project_img, temp_img])
cv2.imwrite(os.path.join(output_dir, f"project_img_concat_{index}.jpg"), concat_img)
# 输出 keypoints.ply
if SuperGlueMatcher is not None and torch is not None:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
matching = SuperGlueMatcher().eval().to(device)
matching.set_edge_mode(use_edge)
matching.set_edge_kernel_size(edge_kernel_size)
matching.register(project_img, resolution=(project_img.shape[0], project_img.shape[1]))
keypoints_model_2d = matching.temp_data['keypoints0'][0].cpu().numpy().astype(np.int32)
Keypoint_3D_model = np.zeros((len(keypoints_model_2d), 3), dtype=np.float32)
depth_values = search_depth_values(keypoints_model_2d, project_depth, 5)
def get_rainbow_color(index, total_points):
hsv = np.array([[[int(255 * index / total_points), 255, 255]]], dtype=np.uint8)
bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)[0][0]
return tuple(map(int, bgr))
temp_vis = deepcopy(project_img)
for i in range(len(keypoints_model_2d)):
x, y = keypoints_model_2d[i]
z = depth_values[i]
if z == -1:
Keypoint_3D_model[i] = [0, 0, 0]
continue
Keypoint_3D_model[i] = np.linalg.inv(cam_k) @ (np.array([x, y, 1]) * z)
color = get_rainbow_color(i, len(keypoints_model_2d))
center = (int(round(x)), int(round(y)))
cv2.circle(temp_vis, center, radius=1, color=color, thickness=-1)
cv2.imwrite(os.path.join(output_dir, f"temp_vis_{index}.jpg"), temp_vis)
rotated_model_kpts = o3d.geometry.PointCloud()
rotated_model_kpts.points = o3d.utility.Vector3dVector(Keypoint_3D_model)
rotated_model_kpts.colors = o3d.utility.Vector3dVector([[0, 1, 0] for _ in range(Keypoint_3D_model.shape[0])])
o3d.io.write_point_cloud(os.path.join(output_dir, f"keypoints_{index}.ply"), rotated_model_kpts)
if torch.cuda.is_available():
torch.cuda.empty_cache()
return True
def generate_ism_template(args):
"""生成ISM模板(第一个脚本的功能)"""
data_dir = args.data_dir
rotation_range = args.rot_range
rotation_interval = args.rot_interval
rotation_range_xy = args.rot_range_xy
depth_range = args.depth_range
depth_interval = args.depth_interval
template_path = None
ply_path = None
pcd = None
# 查找所需文件
for f in os.listdir(data_dir):
template_path = os.path.join(data_dir, "template")
os.makedirs(template_path, exist_ok=True)
f_path = os.path.join(data_dir, f)
if f.endswith('.ply'):
ply_path = f_path
pcd = o3d.io.read_point_cloud(ply_path)
pcd.paint_uniform_color([0, 1.0, 0])
o3d.io.write_point_cloud(f"{template_path}/model.ply", pcd)
if f.endswith('.png') or f.endswith('.jpg') or f.endswith('.bmp'):
rgb = cv2.imread(f_path)
if len(rgb.shape) == 2:
rgb = cv2.cvtColor(rgb, cv2.COLOR_GRAY2RGB)
h, w, _ = rgb.shape
if f.endswith('.tiff'):
depth = cv2.imread(f_path, -1)
if f.endswith('.json'):
resource_manager = FileOperation.load_json(f_path)
cam_k = resource_manager["camera_param"]
cam_k = np.asarray(cam_k).reshape(3, 3).astype(np.float32)
# 检查文件是否存在
if ply_path is None:
raise ValueError(f"错误! 未在 {data_dir} 路径下找到 点云 文件, 支持后缀为 ply.\n")
if pcd is None:
raise ValueError(f"错误! {ply_path} 路径的点云文件读取失败.\n")
try:
rgb
except NameError:
raise ValueError(f"错误! 未能在 {data_dir} 路径下正确读取 彩色图像 文件, 支持后缀为 png/jpg/bmp.\n")
try:
depth
except NameError:
raise ValueError(f"错误! 未能在 {data_dir} 路径下正确读取 深度图像 文件, 支持后缀为 tiff.\n")
try:
cam_k
except NameError:
raise ValueError(f"错误! 未在 {data_dir} 路径下找到 配置参数 文件, 支持后缀为 json.\n")
# ISM prompt
pcd_center = pcd.get_center()
angles_z = range(-rotation_range, rotation_range, rotation_interval)
mask = generate_mask_from_points(np.array(pcd.points), cam_k, h, w, auto_scale=1)
# ISM prompt
mask_uint8 = cv2.normalize(mask, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
rgb_mask = deepcopy(rgb)
rgb_mask[mask == 0] = 0
# 将 mask 与 rgb 的前景质心平移到图像中心
coords = np.column_stack(np.where(mask_uint8 > 0))
if coords.size > 0:
cy, cx = coords.mean(axis=0)
center_x = w / 2.0
center_y = h / 2.0
dx = center_x - cx
dy = center_y - cy
T = np.float32([[1, 0, dx], [0, 1, dy]])
rgb_mask = cv2.warpAffine(
rgb_mask, T, (w, h), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT, borderValue=(0, 0, 0)
)
mask_uint8 = cv2.warpAffine(
mask_uint8, T, (w, h), flags=cv2.INTER_NEAREST, borderMode=cv2.BORDER_CONSTANT, borderValue=0
)
if template_path is not None:
center = (w / 2, h / 2)
new_w = int(math.sqrt(w**2 + h**2))
new_h = new_w
main_i = 0
for angle in tqdm(angles_z, desc="生成ISM旋转模板"):
M = cv2.getRotationMatrix2D(center, angle, 1.0)
M[0, 2] += (new_w - w) / 2
M[1, 2] += (new_h - h) / 2
# 生成rgb
rotated_rgb = cv2.warpAffine(
rgb_mask, M, (new_w, new_h), borderMode=cv2.BORDER_REPLICATE, borderValue=(127, 127, 127)
)
save_name_rgb = f"rgb_{main_i}.png"
save_path_rgb = os.path.join(template_path, save_name_rgb)
cv2.imwrite(save_path_rgb, rotated_rgb)
# 生成mask
rotated_mask = cv2.warpAffine(
mask_uint8, M, (new_w, new_h), borderMode=cv2.BORDER_REPLICATE, borderValue=(127, 127, 127)
)
save_name_mask = f"mask_{main_i}.png"
save_path_mask = os.path.join(template_path, save_name_mask)
cv2.imwrite(save_path_mask, rotated_mask)
main_i += 1
angles_xy = range(-rotation_range_xy, rotation_range_xy + 1, rotation_range_xy) if rotation_range_xy > 0 else [0]
# 设置深度变化范围 (mm to meters)
if depth_interval > 0:
depth_shifts = np.arange(-depth_range, depth_range + depth_interval, depth_interval) / 1000.0
else:
depth_shifts = [0]
geometric_features_model = []
variants_meta = []
i = 0
for angle_z in tqdm(angles_z, desc="生成ISM几何特征"):
for angle_x in angles_xy:
for angle_y in angles_xy:
pcd_rotated = deepcopy(pcd)
# Z轴旋转
Rz = pcd_rotated.get_rotation_matrix_from_xyz((0, 0, np.deg2rad(angle_z)))
pcd_rotated.rotate(Rz, center=pcd_center)
# X轴小角度旋转
if angle_x != 0:
Rx = pcd_rotated.get_rotation_matrix_from_xyz((np.deg2rad(angle_x), 0, 0))
pcd_rotated.rotate(Rx, center=pcd_center)
# Y轴小角度旋转
if angle_y != 0:
Ry = pcd_rotated.get_rotation_matrix_from_xyz((0, np.deg2rad(angle_y), 0))
pcd_rotated.rotate(Ry, center=pcd_center)
for depth_shift in depth_shifts:
pcd_final = deepcopy(pcd_rotated)
if depth_shift != 0:
pcd_final.translate((0, 0, depth_shift))
# 从旋转和平移后的点云生成mask
mask = generate_mask_from_points(np.array(pcd_final.points), cam_k, h, w, auto_scale=2)
mask_uint8 = cv2.normalize(mask, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
obb_ratio = get_ratio_from_mask(mask_uint8)
mask_area = int(np.count_nonzero(mask_uint8 > 0))
if obb_ratio > 0 and mask_area > 0:
geometric_features_model.append([float(obb_ratio), float(mask_area)])
variants_meta.append(
{
"idx": i,
"angle_z": angle_z,
"angle_x": angle_x,
"angle_y": angle_y,
"depth_shift_m": float(depth_shift),
"obb_ratio": float(obb_ratio),
"mask_area": mask_area,
}
)
i += 1
meta = {"geometric_features_model": geometric_features_model, "variants": variants_meta, "num_variants": i}
with open(os.path.join(template_path, "prompt_meta.json"), "w", encoding="utf-8") as f:
json.dump(meta, f, ensure_ascii=False, indent=2)
print(f"完成ISM模板生成: 元数据写入 {template_path}/prompt_meta.json")
return template_path
def generate_superglue_template(args, start_index=0, template_id=None):
"""生成SuperGlue模板(第二个脚本的功能)"""
if SuperGlueMatcher is None:
raise ImportError("无法导入SuperGlueMatcher,请确保glia版本正确")
if torch is None:
raise ImportError("无法导入torch,请安装PyTorch")
data_dir = args.data_dir
auto_scale = args.auto_scale
mask_kernel_size = args.mask_kernel_size
edge_kernel_size = args.edge_kernel_size
background_color = args.background_color
foreground_color = args.foreground_color
voxel_size = args.down_sample
ply_path = None
pcd = None
# 查找所需文件
for f in os.listdir(data_dir):
f_path = os.path.join(data_dir, f)
if f.endswith('.ply'):
ply_path = f_path
pcd = o3d.io.read_point_cloud(ply_path)
if f.endswith('.png') or f.endswith('.jpg') or f.endswith('.bmp'):
rgb = cv2.imread(f_path)
h, w = rgb.shape[0], rgb.shape[1]
if f.endswith('.tiff'):
depth = cv2.imread(f_path, -1)
if f.endswith('.json'):
resource_manager = json.load(open(f_path))
cam_k = resource_manager["camera_param"]
cam_k = np.asarray(cam_k).reshape(3, 3).astype(np.float32)
# 检查文件是否存在
if ply_path is None:
raise ValueError(f"错误! 未在 {data_dir} 路径下找到 点云 文件, 支持后缀为 ply.\n")
if pcd is None:
raise ValueError(f"错误! {ply_path} 路径的点云文件读取失败.\n")
try:
rgb
except NameError:
raise ValueError(f"错误! 未能在 {data_dir} 路径下正确读取 彩色图像 文件, 支持后缀为 png/jpg/bmp.\n")
try:
depth
except NameError:
raise ValueError(f"错误! 未能在 {data_dir} 路径下正确读取 深度图像 文件, 支持后缀为 tiff.\n")
try:
cam_k
except NameError:
raise ValueError(f"错误! 未在 {data_dir} 路径下找到 配置参数 文件, 支持后缀为 json.\n")
mask = generate_mask_from_points(
np.array(pcd.points), cam_k, h, w, kernel_size=mask_kernel_size, auto_scale=auto_scale
)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((mask_kernel_size, mask_kernel_size), np.uint8))
if rgb.shape[-1] == 3:
rgb = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
rgb_mask = deepcopy(rgb)
rgb_mask[mask == 0] = background_color
if args.use_edge:
rgb_mask[mask != 0] = foreground_color
depth_mask = deepcopy(depth)
# 裁剪图像
expand_pixels = 10
coords = cv2.findNonZero(mask)
if coords is None or len(coords) == 0:
raise ValueError("mask 中没有非零点,无法进行裁剪")
x_bbox, y_bbox, w_bbox, h_bbox = cv2.boundingRect(coords)
x_expanded = max(0, x_bbox - expand_pixels)
y_expanded = max(0, y_bbox - expand_pixels)
w_expanded = min(w - x_expanded, w_bbox + 2 * expand_pixels)
h_expanded = min(h - y_expanded, h_bbox + 2 * expand_pixels)
rgb_mask = rgb_mask[y_expanded : y_expanded + h_expanded, x_expanded : x_expanded + w_expanded]
depth_mask = depth_mask[y_expanded : y_expanded + h_expanded, x_expanded : x_expanded + w_expanded]
diameter = math.ceil(np.sqrt(w_expanded * w_expanded + h_expanded * h_expanded))
x_pad = (diameter - w_expanded) // 2
y_pad = (diameter - h_expanded) // 2
rgb_mask_paded = np.zeros([diameter, diameter], dtype=np.uint8)
depth_mask_paded = np.zeros([diameter, diameter], dtype=np.float32)
rgb_mask_paded[y_pad : y_pad + h_expanded, x_pad : x_pad + w_expanded] = rgb_mask
depth_mask_paded[y_pad : y_pad + h_expanded, x_pad : x_pad + w_expanded] = depth_mask
rgb_mask = deepcopy(rgb_mask_paded)
depth_mask = deepcopy(depth_mask_paded)
# 更新相机内参矩阵
cam_k[0, 2] = cam_k[0, 2] - x_expanded + x_pad
cam_k[1, 2] = cam_k[1, 2] - y_expanded + y_pad
# 统一输出目录:优先 args.output_superglue_dir,否则 data_dir/superglue
superglue_path = getattr(args, "output_superglue_dir", None) or os.path.join(data_dir, "superglue")
os.makedirs(superglue_path, exist_ok=True)
index = int(start_index)
base_index = index
use_depth = args.use_depth
use_edge = args.use_edge
# 输出 model.ply
o3d.io.write_point_cloud(f"{superglue_path}/model_{index}.ply", pcd)
if voxel_size is not None:
pcd_down_sampled = deepcopy(pcd).voxel_down_sample(voxel_size=voxel_size)
o3d.io.write_point_cloud(f"{superglue_path}/model_{index}_downsampled.ply", pcd_down_sampled)
# 注册模版输出 temp.png keypoints.ply
temp = deepcopy(rgb_mask)
if use_depth:
temp = deepcopy(depth_mask)
temp = cv2.normalize(temp, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imwrite(f"{superglue_path}/depth_mask_uint8_{index}.png", temp)
cv2.imwrite(f"{superglue_path}/depth_model_{index}.tiff", depth_mask)
cv2.imwrite(f"{superglue_path}/temp_{index}.png", temp)
# 输出 keypoints.ply
device = 'cuda' if torch.cuda.is_available() else 'cpu'
matching = SuperGlueMatcher().eval().to(device)
if use_edge:
matching.set_edge_mode(True)
matching.set_edge_kernel_size(edge_kernel_size)
matching.register(temp, resolution=(temp.shape[0], temp.shape[1]))
keypoints_model_2d = matching.temp_data['keypoints0'][0].cpu().numpy().astype(np.int32)
Keypoint_3D_model = np.zeros((len(keypoints_model_2d), 3), dtype=np.float32)
depth_values = search_depth_values(keypoints_model_2d, depth_mask, 5)
temp_vis = deepcopy(rgb_mask)
temp_vis = cv2.cvtColor(temp_vis, cv2.COLOR_GRAY2BGR)
def get_rainbow_color(index, total_points):
hsv = np.array([[[int(255 * index / total_points), 255, 255]]], dtype=np.uint8)
bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)[0][0]
return tuple(map(int, bgr))
for i in range(len(keypoints_model_2d)):
x, y = keypoints_model_2d[i]
z = depth_values[i]
if z == -1:
Keypoint_3D_model[i] = [0, 0, 0]
continue
Keypoint_3D_model[i] = np.linalg.inv(cam_k) @ (np.array([x, y, 1]) * z)
color = get_rainbow_color(i, len(keypoints_model_2d))
center = (int(round(x)), int(round(y)))
cv2.circle(temp_vis, center, radius=1, color=color, thickness=-1)
cv2.imwrite(f"{superglue_path}/temp_vis_{index}.jpg", temp_vis)
pcd_model_keypoints = o3d.geometry.PointCloud()
pcd_model_keypoints.points = o3d.utility.Vector3dVector(Keypoint_3D_model)
pcd_model_keypoints.colors = o3d.utility.Vector3dVector([[0, 1, 0] for _ in range(Keypoint_3D_model.shape[0])])
o3d.io.write_point_cloud(f"{superglue_path}/keypoints_{index}.ply", pcd_model_keypoints)
f_json_data = open(os.path.join(superglue_path, f"model_info_{index}.json"), "w+")
json_saved_data = {}
json_saved_data["camera_param"] = cam_k.tolist()
json_saved_data["transform"] = np.eye(4).tolist()
json_saved_data["angle"] = 0.0
json_saved_data["mask_kernel_size"] = mask_kernel_size
json_saved_data["edge_kernel_size"] = edge_kernel_size
json_saved_data["use_edge"] = args.use_edge
if template_id is not None:
json_saved_data["template_id"] = int(template_id)
json.dump(json_saved_data, f_json_data, indent=4)
f_json_data.close()
if args.multi_temp:
for k, angle in enumerate(args.angle):
project_pcd_to_rgb(
superglue_path,
superglue_path,
angle,
args.auto_scale,
mask_kernel_size,
edge_kernel_size,
index=base_index + 1 + k,
use_edge=args.use_edge,
background_color=background_color,
foreground_color=foreground_color,
template_id=template_id,
base_index=base_index,
)
# 组别的第0个模版点云另存为 multi_model_{template_id}.ply(若启用多模版)
if template_id is not None:
src = os.path.join(superglue_path, f"model_{base_index}.ply")
dst = os.path.join(superglue_path, f"multi_model_{template_id}.ply")
if os.path.exists(src):
shutil.copy(src, dst)
if voxel_size is not None:
# 生成 multi_model_{template_id}_downsampled.ply
pcd0 = o3d.io.read_point_cloud(src)
pcd0_ds = deepcopy(pcd0).voxel_down_sample(voxel_size=voxel_size)
o3d.io.write_point_cloud(
os.path.join(superglue_path, f"multi_model_{template_id}_downsampled.ply"), pcd0_ds
)
if torch.cuda.is_available():
torch.cuda.empty_cache()
print(f"完成SuperGlue模板生成: 文件保存在 {superglue_path}")
# 返回下一个可用索引(使用 base_index 计算)
end_next_index = base_index + (len(args.angle) + 1 if args.multi_temp else 1)
return superglue_path, end_next_index
def main():
parser = argparse.ArgumentParser(description="融合脚本:生成ISM和/或SuperGlue模板")
parser.add_argument(
"--output_superglue_dir",
type=str,
default=None,
help="指定SuperGlue模板统一输出目录;不指定则写入 data_dir/superglue",
)
# 基础参数
parser.add_argument("data_dir", help="输入数据目录")
parser.add_argument(
"--type",
choices=["ism", "superglue", "both"],
default="both",
help="选择生成的模板类型: ism, superglue, 或 both",
)
parser.add_argument("--file_transfer", action="store_true", default=False, help="执行文件复制操作")
parser.add_argument("--output_dir", help="文件复制时的输出目录")
# ISM参数
parser.add_argument("--rot_range", type=int, default=60, help="Z-axis rotation range (degrees)")
parser.add_argument("--rot_interval", type=int, default=10, help="Z-axis rotation interval (degrees)")
parser.add_argument("--rot_range_xy", type=int, default=5, help="XY small rotation range (degrees)")
parser.add_argument("--depth_range", type=int, default=200, help="Z translation range (mm)")
parser.add_argument("--depth_interval", type=int, default=200, help="Z translation interval (mm)")
# SuperGlue参数
parser.add_argument("--auto_scale", type=float, default=0.1, help="点云到掩码的缩放比例")
parser.add_argument("--use_depth", action="store_true", default=False, help="使用深度图")
parser.add_argument("--use_edge", action="store_true", default=False, help="使用边缘模式")
parser.add_argument("--edge_kernel_size", type=int, default=3, help="边缘核大小")
parser.add_argument("--mask_kernel_size", type=int, default=5, help="形态学操作核大小")
parser.add_argument("--multi_temp", action="store_true", default=False, help="生成多角度模板")
parser.add_argument("--angle", type=float, nargs="+", default=[90, 180, 270], help="旋转角度")
parser.add_argument("--down_sample", type=float, default=None, help="点云下采样体素大小(米), 设置为None禁用")
parser.add_argument("--background_color", type=int, default=0, help="边缘模式背景颜色")
parser.add_argument("--foreground_color", type=int, default=255, help="边缘模式前景颜色")
# 新增:多模版模式
parser.add_argument("--multi_template", action="store_true", default=False, help="启用SuperGlue多模版模式")
parser.add_argument(
"--superglue_dirs",
nargs="+",
default=[],
help="多个superglue模板目录(功能1:合并多个已生成的superglue模板目录)",
)
args = parser.parse_args()
# 执行文件复制
if args.file_transfer:
if not args.output_dir:
raise ValueError("执行文件复制时需要指定 --output_dir 参数")
print("开始文件复制...")
file_transfer(args)
print(f"文件复制完成,文件保存在: {args.output_dir}")
return
# 生成ISM
arg_ism = deepcopy(args)
if args.type in ["ism", "both"]:
def _first_subdir(root: str) -> str:
try:
subs = sorted([e.path for e in os.scandir(root) if e.is_dir()])
return subs[0] if subs else root
except Exception:
return root
if getattr(args, "multi_template", False):
arg_ism.data_dir = _first_subdir(arg_ism.data_dir)
print("开始生成ISM模板...")
try:
ism_path = generate_ism_template(arg_ism)
print(f"ISM模板生成完成,保存在: {ism_path}")
except Exception as e:
print(f"生成ISM模板时出错: {e}")
if arg_ism.type == "ism":
raise
# 生成/处理SuperGlue
if args.type in ["superglue", "both"]:
print("开始生成/处理SuperGlue模板...")
try:
if args.multi_template:
# 功能1:合并多个superglue目录
if len(args.superglue_dirs) >= 2:
target_dir = args.superglue_dirs[0]
source_dirs = args.superglue_dirs[1:]
merge_superglue_folders(target_dir, source_dirs)
print(f"SuperGlue多目录合并完成,输出: {target_dir}")
else:
# 功能2:单目录多组 -> 统一写入一个目录,并从末尾连续编号
group_dirs = _detect_group_dirs(args.data_dir)
if len(group_dirs) >= 1:
out_dir = args.output_superglue_dir or os.path.join(args.data_dir, "superglue")
os.makedirs(out_dir, exist_ok=True)
start_idx = _get_next_index(out_dir)
next_idx = start_idx
for gid, gdir in enumerate(group_dirs):
args_one = deepcopy(args)
args_one.data_dir = gdir
args_one.output_superglue_dir = out_dir # 统一输出目录
_, next_idx = generate_superglue_template(args_one, start_index=next_idx, template_id=gid)
print(f"SuperGlue多组生成完成,输出: {out_dir}")
else:
# 常规单组生成,但写入 template_id=0,也支持统一输出目录
out_dir = args.output_superglue_dir or os.path.join(args.data_dir, "superglue")
os.makedirs(out_dir, exist_ok=True)
_, _ = generate_superglue_template(
args,
start_index=_get_next_index(out_dir),
template_id=0,
)
else:
# 原有逻辑:单组生成(不写template_id字段)
_ = generate_superglue_template(args) # 返回值未使用
print("SuperGlue模板生成完成")
except Exception as e:
print(f"生成/处理SuperGlue模板时出错: {e}")
if args.type == "superglue":
raise
if args.type == "both":
print("所有模板生成完成!")
print(f"ISM模板保存在: {os.path.join(arg_ism.data_dir, 'template')}")
print(f"SuperGlue模板保存在: {os.path.join(args.data_dir, 'superglue')}")
if __name__ == "__main__":
main()(3) Use the template generation script to copy historical data
- In the download directory of the template generation script, right-click on an empty area to open the context menu, then click
Open in Terminalin the context menu to open theWindows PowerShellterminal, as shown below.


- In the terminal, run the
conda activate pickwiz_py39command to enter the pickwiz_py39 environment, as shown below.

- After the terminal enters the pickwiz_py39 environment, run the following command. You can modify the command according to the template generation script name, the path of the historical data timestamp to copy, and the output save path.
python generate_prompt_double_ism_feature.py #Call the Python script; modify based on the template generation script name
--data_dir "C:\Users\dex\kuawei_data\PickLight\20240617150557809" #Path of the historical data timestamp to copy; modify based on the historical data timestamp
--file_transfer --output_dir #File transfer and output command
"C:\Users\dex\Documents\PickWiz\new_project_22\superglue" #Output file save path; you can modify the save path
--type feat #Indicates generating feature matching templateExample: The template generation script name is "generate_prompt_superglue.py";
The timestamp path of the historical data to copy is "D:\Pickwiz\new_project\data\PickLight\20250411144909289";
The output file save path is "C:\Users\dex\Documents\PickWiz\new_project_1\data\PickLight".
python generate_prompt_double_ism_feature.py #Call the Python script“generate_prompt_superglue.py”
--data_dir "D:\Pickwiz\new_project\data\PickLight\20250411144909289" #Path of the historical data timestamp to copy
--file_transfer --output_dir #File transfer and output command
"C:\Users\dex\Documents\PickWiz\new_project_1\data\PickLight" #Output file save path
--type feat #Indicates generating feature matching templateModify the script name, historical data timestamp path, and output file save path when executing the command, as shown below.

(4) After executing the command, 4 files are generated in the save path: the scene's 2D image, the scene's depth map, the scene point cloud, and the camera Intrinsic Parameter

- Crop the Point Cloud
Open the scene point cloud .ply file in the meshlab software, crop away the noise from the scene point cloud until only the workpiece point cloud remains, then click to overwrite and save.
Be careful to retain the complete workpiece point cloud when cropping the point cloud.
| Before Cropping | After Cropping | Overwrite Save |
|---|---|---|
![]() | ![]() | ![]() |
Generate Point Cloud Template
In the terminal, run the
conda activate pickwiz_py39command to enter the pickwiz_py39 environment.

- In the pickwiz_py39 environment, run the following command. You can modify it based on workpiece characteristics.
The script provides the --use_edge and --multi_temp parameters for adjusting the template generation method. --use_edge indicates using edge detection; --multi_temp indicates generating multi-directional templates for scenes where the incoming workpiece direction is not fixed. These two parameters are not added by default, meaning multi-template and edge detection are disabled, and the point cloud template is generated using 2D images.
When the workpiece surface texture is not obvious, add the --use_edge parameter to enable edge detection to enhance image features and generate an edge-enhanced point cloud template.
python generate_prompt_double_ism_feature.py #Call the Python script
--data_dir "C:\Users\dex\Documents\lixin\unify_infer\superglue_model_gen\superglue" #Enter the save path from step 3 above
--type feat #Indicates generating feature matching templateExample: When the actual scene lighting conditions are unstable, the workpiece surface texture is not obvious, or the geometry is complex, add the --use_edge parameter. The script will first perform edge detection on the 2D image, replacing the original 2D image for matching. The matching will focus on the geometric edge features of the workpiece to generate a point cloud template.
python generate_prompt_double_ism_feature.py #Call the Python script
--data_dir "C:\Users\dex\Documents\PickWiz\new_project_1\data\PickLight" --use_edge #Enter the save path from step 3 above, and add the --use_edge edge detection parameter
--type feat #Indicates generating feature matching template
- After executing the command, a template folder "feature matching" is generated in the save path.

Check whether the following 4 files exist in that folder.

(Note: in grayscale mode the template image is a grayscale image; in edge mode it is a binary image)
| Grayscale Mode | ![]() |
|---|---|
| Edge Mode | ![]() |
- If the incoming workpieces arrive from multiple directions, an example of enabling the multi-template method is shown below, where
--anglespecifies the rotation angles relative to the main template. After running, templates rotated by several angles relative to the main template will be generated, as shown below
python generate_prompt_double_ism_feature.py --data_dir superglue-compressor --type feat --use_edge --multi_temp --angle 45 90 180 225 270as shown below

Import the Point Cloud Template and Select the Template File Path
Import the
model.plyfile from the "feature matching" folder in the save path into thePoint Cloud Filefield in the workpiece interface as the point cloud template**.**

- 在
模板文件路径选择保存路径下“feature matching”文件夹的路径(如C:\Users\dex\Documents\PickWiz\new_project_1\data\PickLight\feature matching)。

- Template Generation Script Parameter Description
| Parameter Name | Parameter Description | Recommended Value | Brief Description |
|---|---|---|---|
| data_dir | Prior folder path | \ | Same as the file path used by the previously generated data script When copying paths in Windows, single backslashes "\" may cause path errors; replace single backslashes with double backslashes or use forward slashes "/" as recommended |
| file_transfer | Whether to copy files | \ | \ |
| output_dir | Only takes effect when file_transfer is active | \ | \ |
| scale | Scale factor for generating Mask from point cloud projection | 0.1 | Scale factor for generating Mask from point cloud projection |
| use_edge | Whether to enable edge mode; if not enabled, grayscale mode is used | \ | When workpieces are stacked in multiple layers, the contours of lower-layer objects and upper-layer objects are difficult for the model to distinguish; grayscale mode should be used |
| use_depth | Whether to use a depth map template | \ | When enabled, the template image is generated by converting the depth map to a color image. Switch to depth map mode when the object surface has no obvious texture and the point cloud/depth map quality is good |
| multi_temp | Whether to generate multi-directional templates | \ | Default is false, meaning only single-direction templates are generated by default |
| mask_kernel_size | Closing operation kernel size when projecting point cloud to Mask | 5 | Use the default value |
| edge_kernel_size | Kernel size for edge feature extraction | 3 | ![]() Larger values result in feature maps closer to the interior |
| angle | When using multi-templates, the rotation angle of sub-templates relative to the main template | \ | Angle input format; refer to the example command |
| down_sample | Point cloud downsampling multiplier; downsampling can reduce cycle time but may affect registration accuracy; verify before use | 0.001 | Represents the voxel downsampling process for point clouds; parameter for the interval size between points |
2.2.2 Matching Confidence Threshold (mm)

- Function
The Confidence score for feature point matching. A higher score indicates higher quality feature points, but the number of matched feature points may decrease.
- Applicable Scenarios
Planar Workpiece Ordered Loading/Unloading (Parallelized) scene
- Parameter Description
Default value: 0.01
Value Range: [0.001, 1.0]
2.2.3 Feature Augmentation Count

- Function
Artificially augments the number of feature points based on original feature point detection, preventing matching anomalies caused by too few feature points.
- Applicable Scenarios
Planar Workpiece Ordered Loading/Unloading (Parallelized) scene
- Parameter Description
Default value: 3
Value Range: [0, 99]
2.2.4 Feature Augmentation Range

- Function
The neighborhood range for selecting feature point augmentation
- Applicable Scenarios
Planar Workpiece Ordered Loading/Unloading (Parallelized) scene
- Parameter Description
Default value: 3
Value Range: [0, 99]
- Tuning:
When the scene point cloud quality is good, this parameter can be increased appropriately; conversely, when the scene point cloud quality is poor, decrease this parameter
2.2.7 Maximum Iteration Count

- Function
Limits the maximum iteration count of the algorithm during the coarse matching phase to avoid wasting computational resources due to infinite loops or slow convergence
- Applicable Scenarios
Planar Workpiece Ordered Loading/Unloading (Parallelized) scene
- Parameter Description
Default value: 100
Not recommended to modify; this parameter will be hidden in a future version
2.2.8 Bounding Box Size Coefficient

- Function
Dynamically adjusts the size of the bounding box, controlling the length-width scaling ratio of the detected bounding box.
- Applicable Scenarios
Planar Workpiece Ordered Loading/Unloading (Parallelized) scene
- Parameter Description
Default value: 1.0
- Tuning
To reduce background interference or false merging of adjacent targets, narrow the detection range; set coefficient < 1.0;
To prevent partial targets (e.g., occluded) from being cropped, enlarge the detection range; set coefficient > 1.0;
Adjust this coefficient based on the bounding box results in the 2D recognition output
2.2.9 Enable Depth Features

- Function
Uses features extracted by the SuperGlue model to replace traditional point cloud features, addressing matching anomalies in complex scenes.
- Applicable Scenarios
Planar Workpiece Ordered Loading/Unloading (Parallelized) scene
- Tuning
Suitable for workpieces with smooth surfaces, repetitive textures, or large lighting variations; also suitable for mixed loading/unloading of multiple workpiece types
2.2.10 Enable Edge Features

- Function
When enabled, feature points are extracted and matched only in the edge regions of objects
- Applicable Scenarios
Planar Workpiece Ordered Loading/Unloading (Parallelized) scene
- Parameter Description
Object edge regions (such as contours and sharp transition areas) are used, while flat or uniformly textured regions are ignored.
The use_edge option must also be enabled during template creation to ensure feature consistency.
2.2.11 Coarse Matching Evaluation Threshold (mm)

- Function
During feature point coarse matching, only matching points with a Confidence score above this value are retained.
- Applicable Scenarios
Planar Workpiece Ordered Loading/Unloading (Parallelized) scene
- Parameter Description
Default value: 10
Value Range: [0.1, 1000]
- Tuning
Low threshold (e.g., 5): reduces false matches, but may lose valid points.
High threshold (e.g., 20): retains more matches, but noise increases.
2.2.12 Object Pose Correction

Fine Matching Search Radius (mm)

- Function
During the fine matching process, the template point cloud is matched against the instance point cloud, and each point in the template point cloud searches for its nearest neighbor in the instance point cloud. The fine matching search radius represents both the search radius in the instance point cloud and the distance threshold between each point in the template point cloud and its nearest neighbor in the instance point cloud. If the distance between a point and its nearest neighbor is less than the fine matching search radius, the two points are considered a match; otherwise, they are not.
- Applicable Scenarios
Planar workpiece ordered loading/unloading, planar workpiece unordered grasping, planar workpiece positioning and assembly scenes
- Parameter Description
Default value: 10
Value Range: [1, 500]
Unit: mm
- Tuning
Typically not changed
Fine Matching Search Mode

- Function
The method by which the template point cloud searches for nearest neighbor points in the instance point cloud during fine matching
- Applicable Scenarios
When the fine matching result between the template point cloud and instance point cloud is poor, this parameter should be adjusted
- Parameter Description
| Parameter | Description |
|---|---|
| Point-to-Point | Each point in the template point cloud searches for its nearest neighbor in the instance point cloud (the point with the shortest straight-line distance within the search radius). Applicable to all workpieces |
| Point-to-Plane | Each point in the template point cloud searches for its nearest neighbor in the instance point cloud along its normal vector. Applicable to workpieces with prominent geometric features |
| Combination of Point-to-Point and Point-to-Plane | First optimizes the workpiece pose in the instance point cloud using point-to-point mode, then further optimizes using point-to-plane mode. Applicable to workpieces with prominent geometric features
|
Use Contour Mode

- Function
Extracts the contour point cloud from both the template and instance point clouds for coarse matching
- Applicable Scenarios
Planar workpiece ordered loading/unloading, planar workpiece unordered grasping, planar workpiece positioning and assembly scenes,若使用关键点进行粗匹配的结果不佳,应当勾选该函数使用轮廓点云再次进行粗匹配
- Tuning
The coarse matching result affects the fine matching result. If the fine matching result is poor, you can check Use Contour Mode
Contour Search Range (mm)

- Function
The search radius for extracting the contour point cloud from both the template and instance point clouds
- Applicable Scenarios
General workpiece ordered loading/unloading, general workpiece unordered grasping, and general workpiece positioning and assembly scenes
- Parameter Description
Default value: 5
Value Range: [0.1, 500]
Unit: mm
- Tuning
A smaller value means a smaller search radius for the contour point cloud, suitable for extracting detailed workpiece contours, but the extracted contour may contain outlier noise;
A larger value means a larger search radius, suitable for extracting wider workpiece contours, but the extracted contour may ignore some detailed features.
Save Pose Estimation [Fine Matching] Data

- Function
When checked, the fine matching data is saved
- Applicable Scenarios
Planar workpiece ordered loading/unloading, planar workpiece unordered grasping, planar workpiece positioning and assembly, planar workpiece positioning and assembly (matching only)
- Example
The fine matching data is saved in the project save path\project folder\data\PickLight\historical data timestamp\Builder\pose\output folder.

2.3 Empty ROI Detection

- Function
Determines whether any workpieces (point cloud) remain in the 3D ROI. If the number of 3D points in the 3D ROI is less than this value, it indicates that no workpiece point cloud remains, and no point cloud is returned
- Parameter Description
Default value: 1000
Value Range: [0, 100000]
- Usage Workflow
Set the minimum point count threshold for 3D ROI. When the point count falls below this threshold, it indicates insufficient workpiece point cloud in the 3D ROI, and the system determines that no workpiece is present in the 3D ROI;
In the robot configuration, add a vision status code to facilitate subsequent robot signal processing.
3. Pick Point
This section mainly explains functions related to Pick Point filtering and adjustment, along with Parameter tuning recommendations.
3.1 Pick Point Adjustment
3.1.1 Bound Euler Angle
- Function description
When the Picking Pose is outside the configured angle range, it is rotated counterclockwise by a certain angle around a fixed axis. If it is still outside the configured angle range after rotation, a warning is issued.
- Usage scenario
This function is only applicable to depalletizing scenarios. It can keep the robot's approach direction stable during picking and prevent the end effector from repeatedly rotating during the picking process. In 180° cases, it can prevent exceptions such as cable twisting.
- Parameter description
| Parameter | Description | Default | Range | Unit |
|---|---|---|---|---|
| Fixed axis | An axis of the Picking Pose. The pose is rotated counterclockwise around this fixed axis | Z-axis | X/Y/Z-axis | / |
| Rotation angle | The angle by which the pose is rotated counterclockwise around the fixed axis. Adjust this angle so the Picking Pose satisfies the angle range | 0 | [-360,360] | degree |
| Angle range | The angle range of the Picking Pose. Set the angle range according to factors such as material placement, end effector type, and cycle time | [0,180] | [-180,180] | degree |
| Use current robot Euler Angles | By default, pose calculation uses Euler Angles "XYZ". When selected, the Euler Angles configured for the current robot are used so the pose remains consistent with the robot teach pendant. | Unchecked | / | / |
| Custom coordinate system | The coordinate system of the Picking Pose | Robot arm coordinate system | Default coordinate system; camera coordinate system; ROI coordinate system; robot arm coordinate system | / |
- Example
Without using this function, the generated Pick Points are shown below.

When this function is used with the default values, the RZ angles of the Picking Poses for instances 0, 1, and 2 are all within the angle range [0,180], so no processing is performed. The RZ angle of the Picking Pose for instance 4 is -90°, which is outside the angle range [0,180], so the Picking Pose of instance 4 is rotated by 0° around the fixed Z-axis.
If you want to adjust the RZ angle of the Picking Pose for instance 4 into the angle range, you can change the rotation angle to 180 and rotate the Picking Pose of instance 4 by 180° around the fixed Z-axis.

3.1.2 Rotate the pose to align the rotation axis direction with the target axis direction
- Function description
Rotate the Picking Pose once around the fixed axis so that the direction of the rotation axis (determined by the right-hand rule) matches the positive or negative direction of the target axis in the target coordinate system.
- Usage scenario
Avoid collisions between the robot end effector and the bin.
- Parameter description
| Parameter | Description | Default | Range |
|---|---|---|---|
| Rotation axis | An axis of the Picking Pose. Determined by the right-hand rule, the Picking Pose is rotated counterclockwise once around the fixed axis so that the direction of the rotation axis matches the positive or negative direction of the target axis in the target coordinate system | X-axis | X/Y/Z-axis |
| Fixed axis | The Picking Pose is rotated counterclockwise once around the fixed axis so that the direction of the rotation axis matches the positive or negative direction of the target axis in the target coordinate system | Z-axis | X/Y/Z-axis |
| Target axis | An axis of the target coordinate system. The Picking Pose is rotated counterclockwise once around the fixed axis so that the direction of the rotation axis matches the positive or negative direction of the target axis in the target coordinate system | X-axis | X/Y/Z-axis |
| Negative target axis direction | If selected, the direction of the rotation axis is aligned with the negative direction of the target axis in the target coordinate system; otherwise, it is aligned with the positive direction of the target axis in the target coordinate system | Unchecked | / |
| Custom coordinate system | The coordinate system of the Picking Pose | Default coordinate system | Default coordinate system; camera coordinate system; ROI coordinate system; robot arm coordinate system |
3.1.3 Rotate Pose to Align Axis
- Function description
Rotate the Picking Pose around the fixed axis by 0, 90, 180, and 270 degrees respectively, calculate the angle between the rotated rotation axis and the positive or negative direction of the target axis in the camera coordinate system, and finally output the Picking Pose with the smallest angle after rotation.
- Usage scenario
Avoid collisions between the robot end effector and the bin.
- Parameter description
| Parameter | Description | Default | Range |
|---|---|---|---|
| Fixed axis | An axis of the Picking Pose. Rotate the pose counterclockwise around this fixed axis | Z-axis | X/Y/Z-axis |
| Rotation axis | An axis of the Picking Pose. When rotating the pose, calculate the angle between this rotation axis and the positive or negative direction of the target axis | X-axis | X/Y/Z-axis |
| Target axis | An axis of the camera coordinate system. When rotating the pose, calculate the angle between the rotation axis and the positive or negative direction of this target axis | X-axis | X/Y/Z-axis |
| Negative target axis direction | If selected, calculate the angle between the rotation axis and the negative direction of the target axis; otherwise, calculate the angle between the rotation axis and the positive direction of the target axis | Selected | / |
| Custom coordinate system | The coordinate system of the Picking Pose | Default coordinate system | Default coordinate system; camera coordinate system; ROI coordinate system; robot arm coordinate system |
- Example


3.1.4 Flip Pose to Align Axis
- Function description
Rotate the Picking Pose once around the fixed axis so that the angle formed between the rotation axis and the positive or negative direction of the target axis in the ROI coordinate system is acute.
- Usage scenario
Avoid collisions between the robot end effector and the bin.
- Parameter description
| Parameter | Description | Default | Range |
|---|---|---|---|
| Fixed axis | An axis of the Picking Pose. Rotate the Picking Pose counterclockwise around this fixed axis | Z-axis | X/Y/Z-axis |
| Rotation axis | An axis of the Picking Pose. Rotate the Picking Pose so that the direction of this rotation axis matches the positive or negative direction of the target axis | X-axis | X/Y/Z-axis |
| Target axis | An axis in the ROI coordinate system. Rotate the Picking Pose so that the direction of the rotation axis matches the positive or negative direction of this target axis | X-axis | X/Y/Z-axis |
| Negative target axis direction | If selected, rotate the Picking Pose so that the direction of the rotation axis matches the negative direction of the target axis; otherwise, rotate the Picking Pose so that the direction of the rotation axis matches the positive direction of the target axis | Selected | / |
| Custom coordinate system | The coordinate system of the Picking Pose | Default coordinate system | Default coordinate system; camera coordinate system; ROI coordinate system; robot arm coordinate system |
- Example


3.1.5 Flip Axis to ROI Center
- Function
Rotate the Picking Pose around a fixed axis so that the pointing axis of the Picking Pose points to the ROI center.
- Usage scenario
Avoid collisions between the robot end effector and the bin.
- Parameter description
| Parameter | Description | Default | Range |
|---|---|---|---|
| Pointing axis | The axis in the Picking Pose that needs to be adjusted | X-axis | X/Y/Z-axis |
| Fixed axis | The axis that remains unchanged during rotation | Z-axis | X/Y/Z-axis |
| Reverse align | If selected, reverse-align the pointing axis to the ROI center; otherwise, align the pointing axis to the ROI center | Selected | / |
| Strict pointing | If selected, force the Picking Pose to rotate so the pointing axis points to the ROI center | Unchecked | / |
| Custom coordinate system | The coordinate system of the Picking Pose | Default coordinate system | Default coordinate system; camera coordinate system; ROI coordinate system; robot arm coordinate system |
- Example


3.1.6 Transform to Target Pose

- Function description
Rotate the Picking Pose so that its Z-axis direction matches the Z-axis of the target coordinate system.
- Usage scenario
Usually this is used by default only in depalletizing scenarios and cannot be deleted. It is used to make the Z-axis of the Picking Pose perpendicular to the Z-axis of the ROI coordinate system (4-axis) or consistent with the direction of the Target Object surface (6-axis).
- Parameter description
| Parameter | Description | Default | Range |
|---|---|---|---|
| Robot configuration | Set according to the on-site robot configuration. You can choose 4-axis or 6-axis. If a 6-axis robot is actually used as a 4-axis robot, it should be set to 4-axis | 4-axis | 4-axis/6-axis |
| Use ROI Z-axis as target direction | When the robot configuration is set to 4-axis, if selected, the pose is rotated around the X-axis so that the Z-axis direction of the rotated pose matches the positive direction of the ROI Z-axis ; if not selected, the pose is rotated around the X-axis so that the Z-axis direction of the rotated pose matches the positive direction of the Z-axis of the camera coordinate system . When the robot configuration is set to 6-axis, regardless of whether it is selected, the pose is rotated around the X-axis so that the Z-axis direction of the rotated pose matches the positive direction of the Z-axis of the object's own coordinate system | Unchecked | / |
| Custom coordinate system | The coordinate system of the Picking Pose | Camera coordinate system | Default coordinate system; camera coordinate system; ROI coordinate system; robot arm coordinate system |
- Example
3.1.7 Rotate Pose with Angle
- Function description
Rotate the Picking Pose by a certain angle around a fixed axis.
- Usage scenario
Avoid collisions between the robot end effector and the bin.
- Parameter description
| Parameter | Description | Default | Range | Unit |
|---|---|---|---|---|
| Rotation angle | The angle by which the pose is rotated counterclockwise around the fixed axis | 90 | [-360, 360] | degree° |
| Fixed axis | An axis of the Picking Pose. Rotate the pose counterclockwise around this fixed axis | Z-axis | X/Y/Z-axis | / |
| Custom coordinate system | The coordinate system of the Picking Pose | Default coordinate system | Default coordinate system; camera coordinate system; ROI coordinate system; robot arm coordinate system | / |
- Example


3.1.8 Move Pose With Offset
- Function description
Move the Picking Pose by a certain distance along the translation axis.
- Usage scenario
Avoid collisions between the robot end effector and the bin.
- Parameter description
| Parameter | Description | Default | Range | Unit |
|---|---|---|---|---|
| Translation amount (mm) | The distance the Picking Pose moves along the translation axis. A positive translation amount means translating in the positive direction of the translation axis, and a negative translation amount means translating in the negative direction of the translation axis | 0 | [-1000, 1000] | mm |
| Translation axis | The direction in which the Picking Pose moves | X-axis | X/Y/Z-axis | / |
| Custom coordinate system | The coordinate system of the Picking Pose | Robot arm coordinate system | Default coordinate system; camera coordinate system; ROI coordinate system; robot arm coordinate system | / |
- Example


3.1.9 Transform Pose by Difference
- Function description
Record the Pick Point coordinates generated by the software and the Pick Point coordinates taught under the current operating condition, then output the transformed Picking Pose based on the offset between the two.
- Usage scenario
When the Pick Points generated by the vision system have an obvious systematic offset and the robot TCP coordinate accuracy is limited or difficult to calibrate, this method can be used to directly map the same offset pattern to subsequent Pick Points, thereby avoiding robot TCP calibration.
- Parameter description
| Parameter | Description | Default | Range |
|---|---|---|---|
| Vision Pose | Pick coordinates of the detection result | ||
| X(mm) | X coordinate of the Vision Pose | 0.00 | ±10000000, meaning no limit. |
| Y(mm) | Y coordinate of the Vision Pose | 0.00 | ±10000000, meaning no limit. |
| Z(mm) | Z coordinate of the Vision Pose | 0.00 | ±10000000, meaning no limit. |
| RX(°) | X-axis rotation amount of the Vision Pose | 0.00 | ±180 |
| RY(°) | Y-axis rotation amount of the Vision Pose | 0.00 | ±180 |
| RZ(°) | Z-axis rotation amount of the Vision Pose | 0.00 | ±180 |
| Picking Pose | Manually taught Pick Point | ||
| X(mm) | X coordinate of the Picking Pose | 0.00 | ±10000000, meaning no limit. |
| Y(mm) | Y coordinate of the Picking Pose | 0.00 | ±10000000, meaning no limit. |
| Z(mm) | Z coordinate of the Picking Pose | 0.00 | ±10000000, meaning no limit. |
| RX(°) | X-axis rotation amount of the Picking Pose | 0.00 | ±180 |
| RY(°) | Y-axis rotation amount of the Picking Pose | 0.00 | ±180 |
| RZ(°) | Z-axis rotation amount of the Picking Pose | 0.00 | ±180 |
3.1.10 Refine Pose by Plane Normal
- Function description
Correct the Object Pose by fitting the plane Normal so that the Z-axis direction of the Object Pose remains consistent with the direction of the plane Normal of the Target Object.
- Usage scenario
When the Target Object contains a plane and there is a tilt deviation in the plane when the template Point Cloud is matched with the actual Point Cloud, use this function to fine-tune the Target Object plane and improve picking accuracy.
Not applicable to depalletizing scenarios
- Parameter description
| Parameter | Description | Default | Range | Unit |
|---|---|---|---|---|
| Distance Threshold | Distance Threshold for fitting a plane from the Point Cloud | 10 | [-1000, 1000] | mm |
| Save visualization data | If selected, the visualization data will be saved under the historical data timestamp | Selected | / | / |
| Custom coordinate system | The coordinate system of the Picking Pose | Camera coordinate system | Default coordinate system; camera coordinate system; ROI coordinate system; robot arm coordinate system | / |
- Example
3.1.11 Sort Pick Points by Angle with ROI Axis
- Function
Sort Pick Points according to the angle between an axis of the Picking Pose and the target axis of the ROI.
- Parameter description
| Parameter | Description | Default | Range |
|---|---|---|---|
| Axis selection | An axis of the Picking Pose | Z-axis | X/Y/Z-axis |
| Target axis selection | An axis of the ROI coordinate system | Z-axis | X/Y/Z-axis |
| Select reverse direction | If selected, calculate the angle with the negative direction of the target axis; otherwise, calculate the angle with the positive direction of the target axis | Unchecked | / |
| Select descending order | If selected, sort Pick Points from small to large by angle; otherwise, sort Pick Points from large to small by angle | Unchecked | / |
3.1.12 [Advanced] Rotate Pose, Automatically Compensate for Grasp Orientation Angles with Excessive Deviation from the Specified Axis
- Function description
Determine whether the angle formed between the specified axis of the Picking Pose and the target axis is within the specified range. If not, adjust the Picking Pose into the specified range.
- Usage scenario
Avoid collisions between the robot end effector and the bin.
- Parameter description
| Parameter | Description | Default | Range | Unit |
|---|---|---|---|---|
| Angle range | Adjust the Picking Pose into the angle range | 30 | [0, 180] | degree° |
| Specified axis | An axis of the Picking Pose. Adjust this axis so that it falls within the angle range relative to the target axis of the ROI coordinate system | Z-axis | X/Y/Z-axis | / |
| Target axis | An axis of the ROI coordinate system. Compare the angle range with the specified axis of the Picking Pose | Z-axis | X/Y/Z-axis | / |
| Compare with the negative half-axis of the ROI | If not selected, compare the angle range with the positive direction of the target axis of the ROI coordinate system; if selected, compare the angle range with the negative direction of the target axis of the ROI coordinate system | Unchecked | / | / |
| Custom coordinate system | The coordinate system of the Picking Pose | Default coordinate system | Default coordinate system; camera coordinate system; ROI coordinate system; robot arm coordinate system | / |
3.1.13 [Advanced] Symmetry Center Pose Optimization
- Function
Search for the symmetry center of the Target Object based on the instance Mask, then combine it with the plane of the instance or the pose of the ROI 3D center point to calculate the optimal Picking Pose.
Before using this function, first make sure the instance Mask is symmetrical
- Usage scenario
Applicable when the instance Mask of a symmetrical Target Object is also symmetrical, but the Picking Pose is not near the expected center; at the same time, the Target Object has a plane that can be used as a reference, for example, there is a plane on the top of the object, or ROI 3D can be used as a reference for the projected pose.
Applicable project scenarios include brake discs (general circles), refractory bricks (depalletizing), symmetrical irregular parts, fuel fillers, and so on.
- Parameter description
| Parameter | Description | Default | Range | Tuning recommendation |
|---|---|---|---|---|
| Target Object Symmetry type | Target Object Symmetry type of the instance Mask | Rotational symmetry | Rotational symmetry: after the Target Object rotates by a certain angle around the center point, its shape completely overlaps with the original position; mirror symmetry: the Target Object uses a certain axis / plane as the mirror, and the left-right or upper-lower sides are completely symmetrical. | Circles and rectangles are both rotationally symmetrical and mirror-symmetrical, so rotational symmetry is preferred; for trapezoids and other shapes that are symmetrical only along a certain axis or plane, choose mirror symmetry. |
| Gaussian blur level | Tolerance for determining whether the actual Point Cloud overlaps after rotation | 3 | \[1,99\] |
|
| Rotation angle setting | When the symmetry mode is rotational symmetry, it indicates the rotation angle interval, that is, the angle difference between two adjacent rotations. When the symmetry mode is mirror symmetry, it indicates the rotation range, that is, the angle interval within which the Point Cloud can rotate around the symmetry axis. | 180 | \[1,360\] |
|
| Image scaling ratio | Adjusts the size of the Point Cloud image. The larger this ratio, the smaller the Point Cloud image size and the lower the GPU memory usage, but image detail loss increases, resulting in reduced calculation accuracy . | 2 | \[1,10000000\] | |
| Search range | Based on the initially determined center of the Target Object, this defines the range expanded outward to search for Point Cloud features. The actual range is (search range*2*image scaling ratio) | 10 | \[1,10000000\] | For example, for a square Target Object, the initially determined center position of the Target Object is point O. If the search range is set to 10 and the image scaling ratio is 1, then the actual search range is a square region centered at point O with a side length of 10×2×1=20. Point Cloud features are searched within this region to further determine the symmetry center of the Target Object and the optimal Picking Pose. As another example, for a circular Target Object, if the search range is set to 8 and the image scaling ratio is 2, then the actual search range is a circular region centered at the initially determined center of the Target Object with a diameter of 8×2×2=32. Point Cloud features are searched within this region to further determine the Object Pose of the Target Object and the optimal Picking Pose. |
| Use ROI3D as the reference projection plane | If selected, ROI3D is used as the reference projection plane | Unchecked | / | Select this when the Point Cloud has no obvious plane and the projection plane is difficult to determine; leave it unchecked when the Point Cloud has a clear plane. |
| Save symmetry center process data | If selected, the debug data generated during the symmetry center process is saved. You can view it in the `\ProjectName`\data`\PickLight`\HistoricalDataTimestamp`\find`\_symmetry_center folder | Unchecked | / | Select this when you need to inspect the detailed process images |
| Symmetry axis prior type | Effective in ``{=html}mirror symmetry``{=html} mode. Specifies the known Target Object Symmetry type and fixes the asymmetric orientation | Automatic search | Automatic searchSymmetric along the long axisSymmetric along the short axis | If the symmetry axis of the Target Object is the long axis, choose "Symmetric along the long axis". If the symmetry axis of the Target Object is the short axis, choose "Symmetric along the short axis". If uncertain, choose "Automatic search" |
| Pose adjustment type | Whether to inherit pose-related information from the input pose | Default pose | Default poseInherit rotationInherit translation | / |
| Symmetry score Threshold | Symmetry results with a symmetry score lower than this Threshold are abnormal results. When set to 0, no filtering is performed | 0.0 | \[0.0, 1.0\] | / |
- Example
3.2 Pick Point Filtering
3.2.1 Filter by Fine Matching Score
- Function description
Filter Pick Points based on the pose fine matching score.
- Parameter description
| Parameter | Description | Default | Range |
|---|---|---|---|
| Score Threshold | Retain Pick Points whose fine matching score is greater than this Threshold | 0.5 | [0, 1] |
3.2.2 Filter grasp points of occluded artifacts
- Function description
Determine whether there are too many occluding object Point Clouds in the target detection area along the specified ROI axis or the Picking Pose axis at the Pick Point of the grasped Target Object. If so, the Target Object is considered occluded and the Pick Point is filtered out.
- Usage scenario
Applicable to depalletizing and ordered scenarios in which Target Objects are picked layer by layer, but the model recognizes lower-layer Target Objects. When picking a lower-layer Target Object, the gripper may collide with the upper-layer Target Object.
- Parameter description
| Parameter | Description | Default | Range | Unit |
|---|---|---|---|---|
| Cuboid length in X direction | Set the cuboid length in the X direction of the Picking Pose | 1500 | [1, 10000] | mm |
| Cuboid length in Y direction | Set the cuboid length in the Y direction of the Picking Pose | 1500 | [1, 10000] | mm |
| Cuboid length in Z direction | Set the cuboid length in the Z direction of the Picking Pose | 800 | [1, 10000] | mm |
| Distance Threshold between detection area and Pick Point origin | Along the ROI axis, the nearby cuboid surface area farther than this distance Threshold from the Pick Point origin is regarded as the target detection area | 50 | [1, 1000] | mm |
| Point Cloud count Threshold in detection area | If the number of occluding object Point Clouds in the target detection area exceeds this Threshold, the Pick Point is considered occluded | 1000 | [0, 100000] | / |
| Specified axis direction | Based on the pose reference specified axis direction, set the specific location of the target detection area within the cuboid space (for example, near the front/back/left/right/top/bottom surface of the cuboid) | [0,0,-1] | [1,0,0]: positive X-axis[-1,0,0]: negative X-axis[0,1,0]: positive Y-axis[0,-1,0]: negative Y-axis[0,0,1]: positive Z-axis[0,0,-1]: negative Z-axis | / |
| Use ROI 3D pose reference | If selected, adjust the collision detection area according to the ROI 3D pose reference | Unchecked | / | / |
| Save visualization data | If selected, the visualization data is stored according to the saved data path to help observe whether the generated cuboid is reasonable; if not selected, it is not saved | Unchecked | / | / |
3.2.3 Filter Pose by Cone
- Function description
Determine whether the angle of the Picking Pose is within the constrained angle range, and filter out all Pick Points that do not meet the condition.
- Usage scenario
Prevent collisions caused by abnormal robot arm Picking Pose angles.
- Parameter description
| Parameter | Description | Default | Range | Unit |
|---|---|---|---|---|
| Angle filtering Threshold | Calculate the maximum angle between the specified axis of the ROI and the specified axis of the Picking Pose. Pick Points whose angle is greater than the current Threshold will be filtered out | 30 | [-360, 360] | degree° |
| Invert ROI specified axis direction | If selected, use the negative direction of the specified ROI axis for angle calculation; otherwise, use the positive direction of the specified ROI axis for angle calculation | Selected | / | / |
| Specified Picking Pose axis | Specify an axis of the Picking Pose for angle calculation | Z-axis | X/Y/Z-axis | / |
| Specified ROI axis | Specify an axis of the ROI coordinate system for angle calculation | Z-axis | X/Y/Z-axis | / |
3.2.4 Filter Pick Points outside the ROI 3D type area

- Function description
Determine whether the Pick Point is within the ROI 3D range, and remove Pick Points that are outside the ROI 3D area.
- Usage scenario
Prevent picking outside the ROI area, which may cause collisions between the robot arm and the target object.
- Parameter description
| Parameter | Description | Default |
|---|---|---|
| ROI3D type region | Usually "workspace"; "pick area" is a smaller ROI region than "workspace", which can restrict Pick Points to an ROI region smaller than the "workspace" to avoid some collision cases. | Workspace |
- Example
As shown in the figure below, when the ROI3D area and ROI2D are area a, the corresponding Pick Point is in the upper-right corner.



When the ROI3D area and ROI2D are changed to area b, the original Pick Point is outside the ROI area, so that Pick Point is removed and a new Pick Point is generated within area b.



3.2.5 [New] Gripper Collision Detection
[New] Gripper Collision Detection
- Function description
Collision detection between the gripper and the Point Cloud near the Pick Point. If the number of Point Clouds in contact with the gripper exceeds the pick collision Threshold, the Pick Point of the Target Object is considered to have a collision risk.
- Usage scenario
Used when collision detection is required between the gripper and the Point Cloud near the Target Object being picked.
- Parameter description
| Parameter | Description | Default | Range |
|---|---|---|---|
| Collision Threshold | Collision distance Threshold. If the distance between the scene and the gripper surface is smaller than this Threshold, it is considered a collision. The larger the Threshold, the stricter it is. Unit: mm | 7 | 1-1000 |
| Collision Point Cloud sampling | Sampling size for collision Point Clouds. The larger the value, the faster the cycle time; the smaller the value, the slower the cycle time. Effective only in "Target Object scene Point Cloud only" and "bin + Target Object scene Point Cloud" modes. Unit: mm | 5 | 1 - 1000 |
| Save visualization data for gripper collision detection | Save visualization data for collision detection between the gripper and the picked Target Object | Unchecked | Selected/Unchecked |
Gripper Collision Detection
- Function description
Collision detection between the gripper and the Point Cloud near the Pick Point. If the number of Point Clouds in contact with the gripper exceeds the pick collision Threshold, the Pick Point of the Target Object is considered to have a collision risk.
- Usage scenario
Used when collision detection is required between the gripper and the Point Cloud near the Target Object being picked.
- Parameter description
| Parameter | Description | Default | Range |
|---|---|---|---|
| Pick collision Threshold | The maximum number of Point Clouds the gripper may contain near the Pick Point. For example, 20 means that if the number of scene Point Clouds contained by the gripper exceeds 20, it is considered a collision | 20 | 0-10000 |
| Collision Point Cloud sampling (m) | Downsampling size of the Point Cloud in the collision area. The larger the value, the faster the detection speed, but the lower the accuracy. Applicable scenario: scenarios requiring high cycle rates | 0.002 | 0.0001 - 0.5000 |
| Save visualization data for gripper collision detection | Save visualization data for collision detection between the gripper and the picked Target Object | Unchecked | Selected/Unchecked |
| Import gripper model | Select and import the gripper model used for collision detection from a folder | / | / |
**The gripper should be simplified to fewer than 500 faces**
3.2.6 [Master] Retain the one Pick Point with the largest/smallest pose value among instance Pick Points and filter the remaining Pick Points
- Function description
Convert the pose to the specified coordinate system, sort poses according to the value of the specified sorting axis, and retain the pose with the maximum or minimum value. This is suitable for cylindrical Target Objects when keeping the top or bottom Pick Point.
- Parameter description
| Parameter | Description | Default | Range |
|---|---|---|---|
| Specified coordinate system | Select which coordinate system the pose should be converted to for processing | ROI coordinate system | ROI coordinate system/camera coordinate system |
| Specified sorting axis | Select which axis value of the pose to sort by | Z-axis | X/Y/Z-axis |
| Take minimum value | If selected, retain the pose with the minimum value on the sorting axis; otherwise, retain the pose with the maximum value on the sorting axis | Unchecked | / |
3.2.7 [Master] Filter grasp points similar to the previous N grasp points
- Function description
If the variation between the current Pick Point and any Pick Point in the cache is within the Threshold range, the Pick Point will be filtered out.
- Parameter description
| Parameter | Description | Default | Range | Unit |
|---|---|---|---|---|
| Upper limit of Pick Point change (+) | ||||
| X(mm) | Upper limit of X coordinate | 2 | [0, 10000000] | mm |
| Y(mm) | Upper limit of Y coordinate | 2 | [0, 10000000] | mm |
| Z(mm) | Upper limit of Z coordinate | 2 | [0, 10000000] | mm |
| RX(°) | Upper limit of RX rotation amount | 1 | [0, 180] | degree° |
| RY(°) | Lower limit of RY rotation amount | 1 | [0, 180] | degree° |
| RZ(°) | Lower limit of RZ rotation amount | 1 | [0, 180] | degree° |
| Lower limit of Pick Point change (-) | ||||
| X(mm) | Lower limit of X coordinate | 2 | [0, 10000000] | mm |
| Y(mm) | Lower limit of Y coordinate | 2 | [0, 10000000] | mm |
| Z(mm) | Lower limit of Z coordinate | 2 | [0, 10000000] | mm |
| RX(°) | Lower limit of RX rotation amount | 1 | [0, 180] | degree° |
| RY(°) | Lower limit of RY rotation amount | 1 | [0, 180] | degree° |
| RZ(°) | Lower limit of RZ rotation amount | 1 | [0, 180] | degree° |
| Pick Point cache count | Number of Pick Points cached. After the current Pick Point comparison is completed, it will be added to the cache in real time | 5 | [1, 100] | / |
3.2.8 [Master] Filter artifact poses similar to the previous N artifact poses
- Function description
If the variation between the current Object Pose and any Object Pose in the cache is within the Threshold range, the Object Pose will be filtered out. When an Object Pose is determined to be similar, all Pick Points on that Target Object will be filtered out.
- Parameter description
| Parameter | Description | Default | Range | Unit |
|---|---|---|---|---|
| Upper limit of Object Pose change (+) | ||||
| X(mm) | Upper limit of X coordinate | 2 | [0, 10000000] | mm |
| Y(mm) | Upper limit of Y coordinate | 2 | [0, 10000000] | mm |
| Z(mm) | Upper limit of Z coordinate | 2 | [0, 10000000] | mm |
| RX(°) | Upper limit of RX rotation amount | 1 | [0, 180] | degree° |
| RY(°) | Lower limit of RY rotation amount | 1 | [0, 180] | degree° |
| RZ(°) | Lower limit of RZ rotation amount | 1 | [0, 180] | degree° |
| Lower limit of Object Pose change (-) | ||||
| X(mm) | Lower limit of X coordinate | 2 | [0, 10000000] | mm |
| Y(mm) | Lower limit of Y coordinate | 2 | [0, 10000000] | mm |
| Z(mm) | Lower limit of Z coordinate | 2 | [0, 10000000] | mm |
| RX(°) | Lower limit of RX rotation amount | 1 | [0, 180] | degree° |
| RY(°) | Lower limit of RY rotation amount | 1 | [0, 180] | degree° |
| RZ(°) | Lower limit of RZ rotation amount | 1 | [0, 180] | degree° |
| Object Pose cache count | Number of vision Object Poses cached. After the comparison of the current Object Pose is completed, it will be added to the cache in real time | 5 | [1, 100] | / |
3.2.9 [Master] Filter grasp points outside the upper and lower limits of the grasp coordinates
- Function description
Retain other Pick Points within the specified range of a reference Pick Point and filter out abnormal Pick Points.
- Usage scenario
Prevent incorrect robot picking and ensure picking accuracy.
This function is not applicable to depalletizing scenarios
- Parameter description
| Parameter | Description | Default | Unit |
|---|---|---|---|
| Reference Pick coordinates | |||
| X(mm) | X coordinate of the reference Pick Point | 0 | mm |
| Y(mm) | Y coordinate of the reference Pick Point | 0 | mm |
| Z(mm) | Z coordinate of the reference Pick Point | 0 | mm |
| RX(°) | RX rotation amount of the reference Pick Point | 0 | degree |
| RY(°) | RY rotation amount of the reference Pick Point | 0 | degree |
| RZ(°) | RZ rotation amount of the reference Pick Point | 0 | degree |
| Upper limit of Pick coordinates (+) | |||
| X(mm) | Upper limit of the X coordinate. For example, if the X coordinate of the reference Pick Point is 100 and the upper limit is set to 10, the allowed range is: [100-lower limit, 110] | 10000000, meaning no limit. | mm |
| Y(mm) | Upper limit of the Y coordinate. For example, if the Y coordinate of the reference Pick Point is 100 and the upper limit is set to 10, the allowed range is: [100-lower limit, 110] | 10000000 | mm |
| Z(mm) | Upper limit of the Z coordinate. For example, if the Z coordinate of the reference Pick Point is 100 and the upper limit is set to 10, the allowed range is: [100-lower limit, 110] | 10000000 | mm |
| RX(°) | Upper limit of the RX rotation amount. For example, if the RX rotation amount of the reference Pick Point is 180 and the upper limit is set to 10, the allowed range is (default angle wraparound applies): [[-180, -170], [180-lower limit, 180]] | 180, meaning no limit. | degree° |
| RY(°) | Upper limit of the RY rotation amount. For example, if the RY rotation amount of the reference Pick Point is 180 and the upper limit is set to 10, the allowed range is (default angle wraparound applies): [[-180, -170], [180-lower limit, 180]] | 180 | degree° |
| RZ(°) | Upper limit of the RZ rotation amount. For example, if the RZ rotation amount of the reference Pick Point is 180 and the upper limit is set to 10, the allowed range is (default angle wraparound applies): [[-180, -170], [180-lower limit, 180]] | 180 | degree° |
| Lower limit of Pick coordinates (-) | |||
| X(mm) | Lower limit of the X coordinate. For example, if the X coordinate of the reference Pick Point is 100 and the lower limit is set to 10, the allowed range is: [100-lower limit value, 110] | 10000000 | mm |
| Y(mm) | Lower limit of the Y coordinate. For example, if the Y coordinate of the reference Pick Point is 100 and the lower limit is set to 10, the allowed range is: [100-lower limit, 110] | 10000000 | mm |
| Z(mm) | Lower limit of the Z coordinate. For example, if the Z coordinate of the reference Pick Point is 100 and the lower limit is set to 10, the allowed range is: [100-lower limit, 110] | 10000000 | mm |
| RX(°) | Lower limit of the RX rotation amount. For example, if the RX rotation amount of the reference Pick Point is 180 and the lower limit is set to 10, the allowed range is (default angle wraparound applies): [[-180, -180+upper limit], [170, 180]] | 180, meaning no limit. | degree° |
| RY(°) | Lower limit of the RY rotation amount. For example, if the RY rotation amount of the reference Pick Point is 180 and the lower limit is set to 10, the allowed range is (default angle wraparound applies): [[-180, -180+upper limit], [170, 180]] | 180 | degree° |
| RZ(°) | Lower limit of the RZ rotation amount. For example, if the RZ rotation amount of the reference Pick Point is 180 and the lower limit is set to 10, the allowed range is (default angle wraparound applies): [[-180, -180+upper limit], [170, 180]] | 180 | degree° |
3.3 Pick Point Sorting
3.3.1 Base Coords

- Function description
Set a unified coordinate system for all instances to group and sort instances.
- Usage scenario
Common to depalletizing scenarios, random picking scenarios, and ordered loading/unloading scenarios
Strategies related to coordinates should first set the reference coordinate system
- Parameter description
| Parameter | Description | Illustration |
|---|---|---|
| Camera coordinate system | The coordinate system origin is above the object, and the positive Z-axis direction points downward; the XYZ values are the values of the center point of the object in this coordinate system | ![]() |
| ROI coordinate system | The coordinate system origin is approximately at the center of the pallet stack, and the positive Z-axis direction points upward; the XYZ values are the values of the center point of the object in this coordinate system | ![]() |
| Robot arm coordinate system | The coordinate system origin is on the robot arm itself, and the positive Z-axis direction generally points upward; the XYZ values are the values of the center point of the object in this coordinate system | ![]() |
| Pixel coordinate system | The coordinate system origin is at the top-left vertex of the RGB image and is a 2D planar coordinate system; the X and Y values are the x value of the bbox detection box and the y value of the bbox detection box, and Z is 0 | ![]() |
3.3.2 Picking strategy
- Parameter description
| Parameter | Description |
|---|---|
| Strategy | Select which value is used for grouping and sorting and how to sort it, including Pick Point center X/Y/Z coordinate values from large to small/from small to large (mm), from the middle to the sides / from the sides to the middle along the Pick Point XY coordinate axis (mm). Multiple items can be superimposed and executed in order. |
| Grouping step size | According to the selected strategy, divide Pick Points into several groups based on the step size. The grouping step size is the interval between two groups of Pick Points |
| Number of leading groups to keep | After grouping and sorting, how many groups of instances need to be retained |
| Strategy name* | Description | Grouping step size | Number of leading groups to keep | |
|---|---|---|---|---|
| Default | Range | Default | ||
| Pick Point center X/Y/Z coordinate values from large to small / from small to large (mm) | Use the X/Y/Z coordinate values of the Pick Point center for grouping and sorting | 200.000 | [0, 10000000] | 10000 |
| From the middle to the sides / from the sides to the middle along the Pick Point XY coordinate axis (mm) | Use the X/Y coordinate values of the Pick Point center and perform grouping and sorting in the direction of "middle to sides" or "sides to middle" | 200.000 | [0, 10000000] | 10000 |
3.3.3 Carton combination strategy
To solve the problems of low efficiency and limited applicable scenarios in traditional single-pick depalletizing, PickWiz adds a carton combination strategy in depalletizing scenarios to support picking multiple Target Objects in a single operation. It supports the core scenarios of "cartons with consistent dimensions" and "rectangular suction cups", covering more real project scenarios.

3.3.3.1 Multi-pick runtime configuration
(1)In sack single depalletizing or carton single depalletizing scenarios, enable Vision computation configuration - Vision computation acceleration.
(2)Under the Pick Point sorting module, select the carton combination strategy;

(3)Strategy selection: available options are the default combination strategy or combination along a specified carton pose axis. These are two methods for finding the largest number of cartons that can be combined.

Default combination strategy: Find the largest number of cartons that can be combined along the X-axis and Y-axis directions of a carton Picking Pose.
Combine along a certain carton pose axis: Find the largest number of cartons that can be combined along the X-axis or Y-axis direction of a carton Picking Pose. This is suitable for scenarios where cartons are arranged in a straight line. When using this strategy, you need to choose the carton combination direction, namely the Picking Pose X-axis or the Picking Pose Y-axis.

Note:
The combination direction is related to the positive and negative axis directions. Cartons can be combined only when they are on the same axis and in the same direction, and after combination, the orientation of the whole stack of cartons remains consistent with the orientation of a single carton before combination. Therefore, before combining cartons, make sure all cartons to be combined are placed in the same orientation, and unify the coordinates of the cartons to be combined to the same axis direction.
(4)Combination conditions: determine which cartons can be combined and how many can be combined at most.
Maximum cartons per row: the maximum number of cartons that can be combined in one row, default is 2.
Maximum number of combination rows: the maximum number of carton rows that can be combined, default is 1.
Maximum spacing (mm): cartons to be combined cannot be too far apart in the "combination direction", In the combination direction (axis direction), when the spacing between two adjacent cartons or cartons in different rows is less than this value, they can be combined into one group. The default is 10.
- Example: when searching for the maximum number of cartons along the Picking Pose X-axis, if the spacing between two adjacent cartons in the Picking Pose X-axis direction is 8 mm (≤10), they can be combined; if the spacing is 12 mm (>10), they cannot be combined.
Maximum misalignment distance (mm): cartons to be combined cannot be too far apart in the direction "perpendicular to the combination direction" . In the direction perpendicular to the combination direction (axis direction), when the misalignment distance between two adjacent cartons or cartons in different rows is less than this value, they can be combined into one group. The default is 10.
- Example: when searching for the maximum number of cartons along the Picking Pose X-axis, if two adjacent cartons are offset in the Picking Pose Y-axis direction by 8 mm (≤10), they can be combined; if they are offset by 15 mm (>10), they are no longer aligned and cannot be picked together.
Maximum angular deviation (°): cartons to be combined should face almost the same direction. In the combination direction (axis direction), when the rotational deviation angle of the cartons is less than this value, they can be combined into one group. The default is 10.
- Example: if a carton is rotated by 5° relative to the combination direction, as long as it does not exceed 10°, it can be combined; if it is rotated by 15° (>10), the orientation differs too much, the robot will be skewed when picking, and it cannot be combined.

3.3.3.2 Robot configuration
(1)On the robot configuration page, add new placeholders in Vision computation communication message - Robot to PickWiz commands - Vision detection send command: maximum cartons per row and maximum number of combination rows, as shown below;

(2)In Vision computation communication message - PickWiz to robot commands - Pick-related information - Returned information when picking Target Objects, add Object Dimensions length, Object Dimensions width, and Target Object orientation.

After the robot configuration is completed, click the Run button.
3.3.3.3 View multi-pick runtime results
(1)In the 3D Matching window, hover the mouse over an instance to view the combined picking information of a single instance after carton combination, including 2D recognition results, Picking Pose, and instance combination information.

In the visualization window, click the Settings button in the upper right corner to set how the combined instance information is displayed.

Right-click an instance to view the combined picking information and Target Object information of the single instance.

(2)In the 2D recognition window, you can use the relevant combination buttons in the menu bar to view the combined ID, combined Mask, and combined bounding box.














