add CUB200 prototype datasets#5154

Merged
pmeier merged 3 commits into pytorch:main from pmeier:datasets/cub-200-2011 on Jan 17, 2022
Conversation

@pmeier pmeier commented Jan 3, 2022

This is the only popular vision dataset (according to this list) that we are currently missing.

cc @pmeier @bjuncek

facebook-github-bot commented Jan 3, 2022

💊 CI failures summary and remediations

As of commit a3a7f12 (more details on the Dr. CI page):


None of the CI failures appear to be your fault 💚



🚧 1 ongoing upstream failure:

These were probably caused by upstream breakages that are not fixed yet.


This comment was automatically generated by Dr. CI.

@vadimkantorov
This will also fix #1654.

@vadimkantorov
Another frequent dataset for metric learning is Stanford Online Products...

@NicolasHug NicolasHug left a comment

Thanks @pmeier, some minor comments / questions, but LGTM.

Comment thread torchvision/prototype/datasets/_builtin/cub.py Outdated
Comment thread torchvision/prototype/datasets/_builtin/cub.py Outdated
bounding_box=BoundingBox(
    [int(content["bbox"][coord]) for coord in ("left", "bottom", "right", "top")], format="xyxy"
),
segmentation=Feature(content["seg"]),
Member

Could you help me understand why we use the decoder for the Feature(...) in _2011_decode_ann, but not in this function?

Contributor Author

The decoder is used to turn an open file handle into the pixel values as a tensor. In this case, the raw pixel values are already present as a numpy.ndarray, so there is nothing left to decode.
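To illustrate the distinction, here is a minimal sketch in plain Python. It is not the torchvision prototype API: `toy_decoder`, `load_pixels`, and the two-byte header format are hypothetical names and assumptions made up for this example, and nested lists stand in for tensors and `numpy.ndarray`.

```python
import io
from typing import BinaryIO, Callable, List, Optional

# Stand-in for a tensor / numpy.ndarray of pixel values (assumption for this sketch).
Pixels = List[List[int]]


def toy_decoder(buffer: BinaryIO) -> Pixels:
    """Hypothetical decoder for a made-up format: 1 byte width, 1 byte height, then pixels."""
    width, height = buffer.read(2)  # unpacking two bytes yields two ints
    data = buffer.read(width * height)
    return [list(data[row * width:(row + 1) * width]) for row in range(height)]


def load_pixels(source, decoder: Optional[Callable[[BinaryIO], Pixels]] = None) -> Pixels:
    """Return pixel values, decoding only if `source` is still an open file handle."""
    if hasattr(source, "read"):
        if decoder is None:
            raise ValueError("an encoded stream needs a decoder")
        return decoder(source)
    # Already raw pixel values, so there is nothing left to decode.
    return source


# An encoded image arrives as a file handle and has to go through the decoder.
encoded_image = io.BytesIO(bytes([2, 2, 0, 64, 128, 255]))
image = load_pixels(encoded_image, decoder=toy_decoder)

# A segmentation mask that is already an array of raw values is passed through as-is.
raw_segmentation = [[0, 1], [1, 1]]
segmentation = load_pixels(raw_segmentation)
```

The image path needs the decoder because only bytes are available, while the segmentation path skips it because the values are already in their final form.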

@pmeier pmeier linked an issue Jan 17, 2022 that may be closed by this pull request
pmeier commented Jan 17, 2022

Test failure is unrelated.

@pmeier pmeier merged commit 28f72f1 into pytorch:main Jan 17, 2022
@pmeier pmeier deleted the datasets/cub-200-2011 branch January 17, 2022 08:02
vadimkantorov commented Jan 17, 2022

I guess it would be nice to include a decent metric learning tutorial / reference impl using CUB / Stanford Online Products. This could be a useful example of new-style dataset usage. (I have a slightly simplified but working impl of https://arxiv.org/abs/1706.07567 in https://github.com/vadimkantorov/metriclearningbench.)

facebook-github-bot pushed a commit that referenced this pull request Jan 17, 2022
Summary:
* add CUB200 prototype datasets

* address review comments

Reviewed By: NicolasHug

Differential Revision: D33618169

fbshipit-source-id: d2212728e21578eacdad9a186a5638750fbe3f79
@pmeier pmeier mentioned this pull request May 2, 2022
17 tasks
Development

Successfully merging this pull request may close these issues.

[proposal] CUB 200-2011 dataset