Skip to content

[WIP] Implementation of SuperGlue#25697

Closed
sbucaille wants to merge 2 commits intohuggingface:mainfrom
sbucaille:add_superglue
Closed

[WIP] Implementation of SuperGlue#25697
sbucaille wants to merge 2 commits intohuggingface:mainfrom
sbucaille:add_superglue

Conversation

@sbucaille
Copy link
Contributor

@sbucaille sbucaille commented Aug 23, 2023

What does this PR do?

This PR implements the SuperGlue model.

#25489

Who can review?

@amyeroberts

Todo's

  • Adapt the template code to match SuperGlue
  • Write a conversion script
  • Adding model tests
  • Add docstring
  • Upload models to hub

@ArthurZucker
Copy link
Collaborator

Hey! Thanks a lot for contributing! 🚀 as a first step I would suggest you to upload the model on the hub following this tutorial, as it will make the process a lot easier, and won't need to go through our strict CIs! 🤗

@sbucaille
Copy link
Contributor Author

sbucaille commented Aug 24, 2023

Hi !

Ok, I have a first question regarding this specific model.
So SuperGlue is a keypoint matching model which, given keypoints, returns a matching of each of them.
But the keypoints this model relies on are given by another model, SuperPoint, which I introduced in the original issue.
SuperGlue as a model does not really make sense without SuperPoint, but also SuperPoint can be interpreted as a complete different model which given an image, returns the list of keypoints detected.
Moreover, SuperPoint is used in other models of the SoTA (like LightGlue, which is an evolution of SuperGlue, and which I also plan on implementing in transformers).

The question now : should I add the implementation of SuperPoint in the SuperGlue code and consider this combo SuperPoint + SuperGlue as the SuperGlue model in transformers (and also calling the class SuperGlueSuperPoint to fit the naming convention), or should I add SuperPoint as another model, standalone, part of transformers ?

I myself can't really tell what would be best since :

  • SuperGlue without SuperPoint (so considered standalone) is not really "usable", in practice it would require the user to have the keypoints itself but also from a test point of view, how can I verify the matching of SuperGlue without knowing what are the original images the keypoints are from ?
  • SuperPoint as a standalone model is useful since it provides the "image to keypoints" pipeline and is reused in other models like the aforementioned LightGlue. Also, I noticed as "First Good Issue" mentions of pipeline (like this one) and it gave me ideas of an image matching pipeline implementation such as "SuperPoint + SuperGlue" or "SuperPoint + LightGlue" or "DISK + LightGlue" (LightGlue was also tested with DISK which is another keypoint localizer) where 2 images are given and we obtain a matching.

Hope the question is clear and also since I'm new to all this "collaborating" thing on GitHub, let me know if this kind of questions should belong here or somewhere else.

Thanks again for considering my contribution !

EDIT : Also what is the difference between a model on the hub and a model added in the transformers library, I got confused by the existence of both this page and this page

@ArthurZucker
Copy link
Collaborator

cc @amyeroberts I need your take on this 😄

- Added the SuperGlueConfig
- Added the SuperGlueModel and its implementation
- Added basic weight conversion script
- Added new ImageMatchingOutput dataclass
@sbucaille
Copy link
Contributor Author

Hi,

In the meantime I added the basics for the implementation of SuperGlue by following the example of the tutorial you provided me earlier. I also looked around other models on how conversion scripts were implemented and mimicked it for the SuperGlue case.
Regardless of what is decided for the SuperPoint part, this code is the necessary minimum. It yet needs to be tested but without knowing what we should do with the SuperPoint part I preferred to stick what that.

@amyeroberts
Copy link
Contributor

@sbucaille Thanks for the detailed explanation about the two models! The model PR is a good place to ask questions about implementation :)

As both models, SuperPoint and SuperGlue offer new capabilities and are very popular, I would consider them good additions directly into transformers. However, as Arthur mentions, this would involve going through the PR review process which will be slower and more restrictive than adding directly onto the hub. If you want to go straight to the hub - you can decide how you would like to add the model!

If adding to transformers, what I would suggest is implementing SuperPoint as its own model and PR with task models e.g.SuperPointForInterestPointDescription (we can settle on a name later). I wouldn't add a separate MagicPoint model. In that PR, we can also add a mapping AutoModelForInterestPointDescription, which we define as taking two images and returning interest keypoints and their descriptions.

Then we can implement SuperGlue. Similar to e.g. MusicGen we can have SuperGlue load in any keypoint detection model using AutoModelForPointCorrespondence.

Then, if in the future you wanted to add DISK, SuperGlue could load either DISK or SuperPoint. Likewise, if you wanted to add LightGlue, it can then load DISK or SuperPoint using the same AutoModelForPointCorrespondence structure.

The important thing is that all the models being loaded using AutoModelForXxx have the same input / output structure.

@sbucaille
Copy link
Contributor Author

I created the PR for the SuperPoint implementation.
The main reason I'm doing this is to learn, so of course I am willing to go through the PR review process ! 😄

@huggingface huggingface deleted a comment from github-actions bot Sep 26, 2023
@github-actions
Copy link
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this Oct 30, 2023
@amyeroberts amyeroberts reopened this Oct 30, 2023
@github-actions github-actions bot closed this Nov 8, 2023
@amyeroberts amyeroberts reopened this Nov 8, 2023
@github-actions github-actions bot closed this Nov 17, 2023
@amyeroberts amyeroberts reopened this Nov 17, 2023
@github-actions github-actions bot closed this Nov 26, 2023
@amyeroberts
Copy link
Contributor

@sbucaille I'll leave this closed for now so we don't have to keep re-opening every week. I know you're off for a few weeks - just ping when you're back and I can reopen again then

@amyeroberts
Copy link
Contributor

@sbucaille Unfortunately I can't reopen the PR, as it says the branch has been force pushed or recreated.

@sbucaille
Copy link
Contributor Author

@amyeroberts That's correct I rebased the branch on the latest changes (with SuperPoint) and added the first changes a couple days ago, does that mean I should open a new PR ?

@amyeroberts
Copy link
Contributor

@sbucaille Yes please!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants