[WIP] Implementation of SuperGlue by sbucaille · Pull Request #25697 · huggingface/transformers

sbucaille · 2023-08-23T19:55:44Z

What does this PR do?

This PR implements the SuperGlue model.

Who can review?

Todo's

ArthurZucker · 2023-08-24T05:52:48Z

Hey! Thanks a lot for contributing! 🚀 As a first step, I would suggest uploading the model to the hub following this tutorial, as it will make the process a lot easier, and you won't need to go through our strict CIs! 🤗

sbucaille · 2023-08-24T21:18:54Z

Hi !

Ok, I have a first question regarding this specific model.
So SuperGlue is a keypoint matching model which, given keypoints, returns a matching of each of them.
But the keypoints this model relies on are given by another model, SuperPoint, which I introduced in the original issue.
SuperGlue as a model does not really make sense without SuperPoint, but SuperPoint can also be seen as a completely different model which, given an image, returns the list of detected keypoints.
Moreover, SuperPoint is used in other models of the SoTA (like LightGlue, which is an evolution of SuperGlue, and which I also plan on implementing in transformers).

The question now: should I add the implementation of SuperPoint to the SuperGlue code and treat the SuperPoint + SuperGlue combo as the SuperGlue model in transformers (calling the class SuperGlueSuperPoint to fit the naming convention), or should I add SuperPoint as a separate, standalone model in transformers?

I can't really tell what would be best myself, since:

- SuperGlue without SuperPoint (so considered standalone) is not really "usable": in practice the user would have to supply the keypoints themselves. From a testing point of view, how can I verify SuperGlue's matching without knowing which images the keypoints came from?
- SuperPoint as a standalone model is useful since it provides the "image to keypoints" pipeline and is reused in other models like the aforementioned LightGlue. Also, I noticed "Good First Issue" mentions of pipelines (like this one), which gave me the idea of an image matching pipeline such as "SuperPoint + SuperGlue", "SuperPoint + LightGlue" or "DISK + LightGlue" (LightGlue was also tested with DISK, which is another keypoint localizer), where two images are given and we obtain a matching.

Hope the question is clear. Since I'm new to all this "collaborating" thing on GitHub, let me know if this kind of question belongs here or somewhere else.

Thanks again for considering my contribution !

EDIT: Also, what is the difference between a model on the hub and a model added to the transformers library? I got confused by the existence of both this page and this page.

ArthurZucker · 2023-08-25T06:58:34Z

cc @amyeroberts I need your take on this 😄

sbucaille · 2023-08-25T15:05:09Z

Hi,

In the meantime, I added the basics of the SuperGlue implementation by following the tutorial you pointed me to earlier. I also looked at how conversion scripts were implemented for other models and mimicked them for the SuperGlue case.
Regardless of what is decided for the SuperPoint part, this code is the necessary minimum. It still needs to be tested, but without knowing what we should do with the SuperPoint part I preferred to stick with that.

amyeroberts · 2023-08-25T16:27:48Z

@sbucaille Thanks for the detailed explanation about the two models! The model PR is a good place to ask questions about implementation :)

As both models, SuperPoint and SuperGlue, offer new capabilities and are very popular, I would consider them good additions directly into transformers. However, as Arthur mentions, this would involve going through the PR review process, which will be slower and more restrictive than adding directly onto the hub. If you want to go straight to the hub, you can decide how you would like to add the model!

If adding to transformers, what I would suggest is implementing SuperPoint as its own model and PR, with task models e.g. SuperPointForInterestPointDescription (we can settle on a name later). I wouldn't add a separate MagicPoint model. In that PR, we can also add a mapping AutoModelForInterestPointDescription, which we define as taking two images and returning interest keypoints and their descriptions.

Then we can implement SuperGlue. Similar to e.g. MusicGen we can have SuperGlue load in any keypoint detection model using AutoModelForPointCorrespondence.

Then, if in the future you wanted to add DISK, SuperGlue could load either DISK or SuperPoint. Likewise, if you wanted to add LightGlue, it can then load DISK or SuperPoint using the same AutoModelForPointCorrespondence structure.

The important thing is that all the models being loaded using AutoModelForXxx have the same input / output structure.
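To make that last point concrete, here is a minimal sketch in plain Python. It does not use the transformers API: every name in it (`KeypointOutput`, `KeypointDetector`, `BrightPixelDetector`, `NearestDescriptorMatcher`) is hypothetical, and the matching rule is a toy nearest-descriptor lookup standing in for SuperGlue. The only thing it is meant to illustrate is that a matcher can wrap any detector as long as every detector returns the same output structure.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple

# Toy stand-in for an image: a grayscale grid of floats (rows of pixels).
Image = List[List[float]]


@dataclass
class KeypointOutput:
    """Hypothetical shared output structure for every keypoint detector."""

    keypoints: List[Tuple[int, int]]  # (x, y) pixel coordinates
    descriptors: List[List[float]]  # one descriptor vector per keypoint


class KeypointDetector(Protocol):
    """Anything callable with this signature can be plugged into a matcher."""

    def __call__(self, image: Image) -> KeypointOutput: ...


class BrightPixelDetector:
    """Toy detector (NOT SuperPoint): bright pixels become keypoints."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold

    def __call__(self, image: Image) -> KeypointOutput:
        keypoints, descriptors = [], []
        for y, row in enumerate(image):
            for x, value in enumerate(row):
                if value > self.threshold:
                    keypoints.append((x, y))
                    descriptors.append([value])  # 1-D descriptor: intensity
        return KeypointOutput(keypoints, descriptors)


def squared_distance(a: List[float], b: List[float]) -> float:
    return sum((u - v) ** 2 for u, v in zip(a, b))


class NearestDescriptorMatcher:
    """Toy matcher (NOT SuperGlue): wraps any detector with the shared output."""

    def __init__(self, detector: KeypointDetector) -> None:
        self.detector = detector

    def __call__(self, image_a: Image, image_b: Image) -> List[Tuple[int, int]]:
        out_a = self.detector(image_a)
        out_b = self.detector(image_b)
        if not out_b.descriptors:
            return []
        matches = []
        for i, desc_a in enumerate(out_a.descriptors):
            # Index of the keypoint in image B whose descriptor is closest.
            j = min(
                range(len(out_b.descriptors)),
                key=lambda k: squared_distance(desc_a, out_b.descriptors[k]),
            )
            matches.append((i, j))
        return matches
```

Swapping in another detector (for example a DISK-like one) would only require it to return a `KeypointOutput`; the matcher code would stay unchanged, which is the property the shared input / output structure is meant to guarantee.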

sbucaille · 2023-08-27T20:12:18Z

I created the PR for the SuperPoint implementation.
The main reason I'm doing this is to learn, so of course I am willing to go through the PR review process ! 😄

github-actions · 2023-10-21T08:05:22Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

amyeroberts · 2023-11-27T09:49:14Z

@sbucaille I'll leave this closed for now so we don't have to keep re-opening every week. I know you're off for a few weeks - just ping when you're back and I can reopen again then

amyeroberts · 2024-03-21T10:03:32Z

@sbucaille Unfortunately I can't reopen the PR, as it says the branch has been force pushed or recreated.

sbucaille · 2024-03-21T10:07:42Z

@amyeroberts That's correct, I rebased the branch on the latest changes (with SuperPoint) and added the first changes a couple of days ago. Does that mean I should open a new PR?

amyeroberts · 2024-03-21T11:00:42Z

@sbucaille Yes please!

Initial commit with template code generated by transformers-cli

5db1538

ArthurZucker added the Model on the Hub label Aug 24, 2023

Multiple additions to SuperGlue implementation:

460dc3b

- Added the SuperGlueConfig
- Added the SuperGlueModel and its implementation
- Added basic weight conversion script
- Added new ImageMatchingOutput dataclass

sbucaille mentioned this pull request Aug 27, 2023

Implementation of SuperPoint and AutoModelForInterestPointDescription #25786

Closed

5 tasks

huggingface deleted a comment from github-actions bot Sep 26, 2023

github-actions bot closed this Oct 30, 2023

amyeroberts reopened this Oct 30, 2023

github-actions bot closed this Nov 8, 2023

amyeroberts reopened this Nov 8, 2023

github-actions bot closed this Nov 17, 2023

amyeroberts reopened this Nov 17, 2023

github-actions bot closed this Nov 26, 2023

sbucaille mentioned this pull request Feb 11, 2024

Implementation of SuperPoint and AutoModelForKeypointDetection #28966

Merged

5 tasks

sbucaille mentioned this pull request Mar 21, 2024

SuperPointModel -> SuperPointForKeypointDetection #29757

Merged

5 tasks

sbucaille wants to merge 2 commits into huggingface:main from sbucaille:add_superglue
