Add Depthai SuperPoint#1058

Merged
matlabbe merged 4 commits into introlab:master from borongyuan:depthai-superpoint
Jun 21, 2023

Conversation

@borongyuan
Contributor

@borongyuan borongyuan commented Jun 17, 2023

SuperPoint MyriadX Blob:
https://github.com/borongyuan/depthai-superpoint/raw/main/blobs/superpoint_200x320_10shave.blob

  • add on-device superpoint network
  • move nms to util2d
  • update PreferencesDialog
  • keypoint decoder
  • descriptor decoder
  • update to depthai-core v2.22.0

@borongyuan borongyuan marked this pull request as ready for review June 19, 2023 06:34
@borongyuan borongyuan force-pushed the depthai-superpoint branch from 8d0aa21 to 720d50f Compare June 19, 2023 07:15
@borongyuan
Contributor Author

borongyuan commented Jun 19, 2023

Finally, it works. At first I was worried that there would be too few keypoints at 320×200 resolution, but the results show that with prob_threshold at 0.01 there are about 400 keypoints in most scenes after NMS. The descriptors also perform well and can detect loop closures correctly.

The image rate is limited to a maximum of 15 FPS when on-device SuperPoint is enabled. The network currently runs with 10 SHAVEs and 1 thread (2 NCEs/thread) for low latency, which should be fine for 15 FPS. But I noticed that when other features are configured on the OAK (such as some depth post-processing filters), neural network inference performance can also be affected, even though their resources are allocated separately. I don't know exactly what factors are involved. The current OAK resource configuration in rtabmap should be reasonable. If a long-running test later shows instability such as frame drops, we can consider running the network with 5 SHAVEs and 2 threads (1 NCE/thread) to ensure 15 FPS. To do this, change L370 and L371 to

                superPointNetwork->setNumInferenceThreads(2);
                superPointNetwork->setNumNCEPerInferenceThread(1);

Information on low latency and max FPS can be found here.

The ONNX conversion is in my fork of superpoint_infer_engine. If there is nothing to change, just use the converted blobs from depthai-superpoint. I only made some changes to the outputs of the model. For the detector head, softmax is done on device and the heatmap is reshaped. For the descriptor head, the L2 norm is removed to avoid an FP16 overflow issue; we perform the L2 norm on the host instead. The descriptor output is converted to NHWC, so it can easily be converted to a cv::Mat via L632. Here is the entire network visualized using Netron.
superpoint_200x320-sim onnx
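Since the L2 norm was moved off the device, the host has to normalize each descriptor after reading the raw NN output in FP32. A minimal stdlib-only sketch of that step (function name and flat row-major layout are illustrative assumptions, not the actual rtabmap code):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// L2-normalize each `dim`-sized descriptor row in place. The device emits
// raw (un-normalized) descriptors to avoid FP16 overflow; normalization is
// done here on the host in FP32.
void l2NormalizeRows(std::vector<float>& desc, int dim)
{
    const float eps = 1e-12f; // guard against all-zero border descriptors
    for (std::size_t i = 0; i + dim <= desc.size(); i += dim) {
        float sq = 0.0f;
        for (int d = 0; d < dim; ++d)
            sq += desc[i + d] * desc[i + d];
        float inv = 1.0f / std::sqrt(sq + eps);
        for (int d = 0; d < dim; ++d)
            desc[i + d] *= inv;
    }
}
```

In the real pipeline this would run over the 256-dim rows of the cv::Mat built from the NHWC output.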

@matlabbe
Member

matlabbe commented Jun 20, 2023

@borongyuan great work! I tried it; it kind of works, but I am not sure I am getting the right results. It seems to like extracting many keypoints from the same locations:
Screenshot from 2023-06-19 18-50-04

While odometry seems to work, loop closure detection doesn't work very well, in contrast to the original SuperPoint I integrated. I'll try to set up my computer with PyTorch this week to see the difference.

@borongyuan
Contributor Author

The heatmap is resized to the original image size using bicubic interpolation for better keypoint localization, even though the original heatmap has a lower resolution. This is a trick used by some other networks with heatmaps. After masking by prob_threshold, multiple keypoints can then be extracted from the same location with different prob/response values, and NMS should select the most significant ones. So when using high-resolution input, such as 800P, the NMS distance should be increased accordingly. I also noticed that with the default NMS distance (4), some scenes look better at 400P than at 800P.
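The selection step described above can be sketched as a greedy radius-based NMS: sort candidates by response and keep a point only if no already-kept point lies within the NMS distance. This is only an illustration of the idea (struct and function names are mine, not the util2d implementation):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Kp { float x, y, response; };

// Greedy radius NMS: highest-response keypoints win; any candidate within
// `dist` pixels of an already-kept keypoint is suppressed. This keeps only
// the most significant of the duplicates produced by bicubic upsampling.
std::vector<Kp> nmsKeypoints(std::vector<Kp> kps, float dist)
{
    std::sort(kps.begin(), kps.end(),
              [](const Kp& a, const Kp& b) { return a.response > b.response; });
    std::vector<Kp> kept;
    for (const Kp& k : kps) {
        bool suppressed = false;
        for (const Kp& q : kept) {
            float dx = k.x - q.x, dy = k.y - q.y;
            if (dx * dx + dy * dy <= dist * dist) { suppressed = true; break; }
        }
        if (!suppressed) kept.push_back(k);
    }
    return kept;
}
```

With a larger `dist` at 800P, nearby duplicates from the upsampled heatmap collapse into a single keypoint.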
But what confuses me about your screenshot: is it cropped? Keypoints should not appear close to the image borders. I didn't add code to filter them; I just noticed that the raw network output doesn't respond near image borders. Also, I use cv::remap() to get the descriptor vectors, which is equivalent to torch::grid_sampler(). The coordinates are calculated using the method from this post. After the calculation, you can see that the border pixels in the original image correspond to the "outliers" in coarse_desc. So even if keypoints appear close to the image borders, their descriptors will be set to zeros by default.
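The per-channel lookup that cv::remap() performs here is just bilinear interpolation into the coarse grid, with out-of-range samples returning zero. A stdlib-only sketch of one channel, assuming row-major storage and zero padding like torch::grid_sampler (the real code samples all 256 channels at once):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Bilinearly sample one descriptor channel of the coarse (W/8 x H/8) map at
// a fractional location. Out-of-range taps contribute 0, analogous to
// cv::remap with BORDER_CONSTANT / grid_sampler's zero padding, which is why
// keypoints mapped outside the coarse grid end up with zero descriptors.
float sampleBilinear(const std::vector<float>& map, int w, int h,
                     float x, float y)
{
    int x0 = static_cast<int>(std::floor(x));
    int y0 = static_cast<int>(std::floor(y));
    float fx = x - x0, fy = y - y0;
    auto at = [&](int xi, int yi) -> float {
        if (xi < 0 || yi < 0 || xi >= w || yi >= h) return 0.0f; // zero border
        return map[yi * w + xi];
    };
    return at(x0,     y0)     * (1 - fx) * (1 - fy)
         + at(x0 + 1, y0)     * fx       * (1 - fy)
         + at(x0,     y0 + 1) * (1 - fx) * fy
         + at(x0 + 1, y0 + 1) * fx       * fy;
}
```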
Also, please make sure you downloaded the latest blob file, because I made an update before submitting this PR. If it works correctly, you should see a successful loop closure like the image below. In fact, I still don't fully understand how the UI displays these keypoint types. Is it just based on the length and type of the descriptors?
2023-06-20 10-25-24 screenshot

@borongyuan
Contributor Author

borongyuan commented Jun 20, 2023

I think that, limited by the computing power of the OAK Series 2, the current speed and accuracy are just about enough. So I am also looking forward to the OAK Series 3, hoping to get better results then. I will try other features later when I have time, for example KeyNet, AffNet, and HardNet, which are included in kornia.feature. According to the DepthAI documentation, some algorithms in kornia may be convertible to blobs. But SuperPoint is good enough for now. On-device GFTT + RootSIFT is also a good choice. I will study other options later.

@matlabbe
Member

I did some tests. I cannot make VO work with SuperPoint generated on the OAK-D at 800P (it gets lost as soon as I move). It works well at 400P though. I tried increasing the NMS distance at 800P, but it didn't really help. My image was indeed cropped in my post above. Here are some results and a comparison with the PyTorch version:

  • 800p on device superpoint (nms=4):

    picture

  • 800p on device superpoint (nms=8):

    picture5

  • 800p torch superpoint (nms=4):

    picture2

  • 400p on device superpoint (nms=4):

    picture3

  • 400p torch superpoint (nms=4):

    picture4

The color code can be seen by hovering the mouse over the loop closure ID or the odometry view:
Screenshot from 2023-06-20 21-16-06

@matlabbe matlabbe merged commit d901fb2 into introlab:master Jun 21, 2023
@matlabbe
Member

Results at 400p are very promising, thanks again for the PR!

@borongyuan
Contributor Author

The fundamental limitation is that it runs at 320×200 resolution on the device; otherwise the frame rate would not be sufficient for VO. Even if we can locate keypoints on the original image by resizing, the resolution of coarse_desc is only 40×25, so the performance is definitely not comparable to inference on raw images. I'm currently treating this as an experimental feature, as it's really interesting to implement, but it's also possible to improve it later. Besides waiting for the next generation of hardware, we can also try to simplify the model. A common rule of thumb is that for VGG-like models, MobileNetV1 can be used instead, and for ResNet-like models, MobileNetV2. I saw someone tried V2 before and it didn't work well; maybe they should try V1. Another possibility is that we only use SuperPoint for loop closure detection. Then it can run on device with higher resolution input, since it doesn't need to process every frame. I'm not sure if this is convenient to implement.
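The 40×25 figure follows from SuperPoint's VGG-style encoder, which downsamples by a factor of 8 (three 2×2 pools), so each coarse_desc cell covers an 8×8 pixel patch of the input:

```cpp
#include <cassert>

// SuperPoint's encoder has stride 8, so the coarse descriptor grid is the
// input resolution divided by 8 in each dimension:
// 320x200 input -> 40x25 coarse_desc.
constexpr int kSuperPointStride = 8;
constexpr int coarseCells(int pixels) { return pixels / kSuperPointStride; }
```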

For NMS(), I modified it slightly when I moved it to util2d. It seems that once the keypoints' responses are set, there is no need to pass the conf Mat as input. I don't know if that is related to the last few issues. Correct me if I am wrong.

@borongyuan borongyuan deleted the depthai-superpoint branch June 21, 2023 06:35
@matlabbe
Member

Another possibility is that we only use SuperPoint for loop closure detection. Then it can run on device with higher resolution input, since it doesn’t need to process every frame. I'm not sure if this is convenient to implement.

For standalone use, it may not be useful unless we combine it with an external VO/VIO like the T265, as rtabmap's VO needs a high framerate. I see it being more useful on ROS, on a robot using wheel/lidar odometry; then low-framerate OAK images with SuperPoint could be used for loop closure detection.

For NMS, I checked and the issue was there before your PR, so I just modified it after merging your PR. The confidence matrix is used to get the confidence of other keypoints within a fixed radius of the current keypoint being checked. This was code taken from this project; it could indeed be optimized with a different approach (like a nearest-neighbor index).

@borongyuan
Contributor Author

For standalone use, it may not be useful unless we combine it with an external VO/VIO like the T265, as rtabmap's VO needs a high framerate. I see it being more useful on ROS, on a robot using wheel/lidar odometry; then low-framerate OAK images with SuperPoint could be used for loop closure detection.

I mean using other features for VO, with SuperPoint only used for loop closure. It should still be possible, right? You said before that odometry won't re-detect keypoints when setFeatures() is used, but now we want it to re-detect.

For NMS, I checked and the issue was there before your PR, so I just modified it after merging your PR. The confidence matrix is used to get the confidence of other keypoints within a fixed radius of the current keypoint being checked. This was code taken from this project; it could indeed be optimized with a different approach (like a nearest-neighbor index).

There is a clever simple_nms() implementation in SuperGlue, which acts directly on the heatmap. This issue discusses both implementations. But I can't use that implementation because OpenCV doesn't expose max pooling as a standalone function.
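For reference, the idea behind SuperGlue's simple_nms() is that a score survives only if it equals the maximum of its neighborhood; in PyTorch that is a single max_pool2d call. A hedged stdlib-only sketch of one suppression pass, written out by hand since OpenCV has no standalone max pooling (function name and layout are mine):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Heatmap-level NMS in the spirit of simple_nms(): keep a score only if it
// is the maximum within its (2r+1)x(2r+1) window, zero it otherwise.
// The map is row-major, w columns by h rows.
std::vector<float> simpleNms(const std::vector<float>& heat,
                             int w, int h, int r)
{
    std::vector<float> out(heat.size(), 0.0f);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float v = heat[y * w + x];
            float m = v;
            for (int dy = -r; dy <= r; ++dy)
                for (int dx = -r; dx <= r; ++dx) {
                    int xi = x + dx, yi = y + dy;
                    if (xi >= 0 && yi >= 0 && xi < w && yi < h)
                        m = std::max(m, heat[yi * w + xi]);
                }
            if (v == m)
                out[y * w + x] = v; // local maximum survives
        }
    }
    return out;
}
```

The PyTorch version also iterates this a couple of times to resolve plateaus of tied scores; this sketch shows a single pass.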

@matlabbe
Member

I expected that the image framerate would be lower when SuperPoint features are extracted, but if the image frame rate is 30 Hz while SuperPoint runs at 5 Hz, we could send images at 30 Hz to the VO thread (extracting other features) and send SuperPoint features to rtabmap at 5 Hz. The issue is synchronizing odometry with SuperPoint features that arrive with latency on the other thread. At that point, I would use rtabmap_ros to redirect and synchronize the data.

For SuperGlue's NMS, I guess the PyTorch version could be converted to C++ Torch (libtorch).
