Add --backend and --use-v2 support to detection refs #7732

NicolasHug merged 7 commits into pytorch:main
Conversation
     use_v2=False,
 ):
-    module = get_module(use_v2)
+    T = get_module(use_v2)
I just did s/module/T/ in the file to make it consistent with the detection one
     target_size: Tuple[int, int],
     scale_range: Tuple[float, float] = (0.1, 2.0),
     interpolation: InterpolationMode = InterpolationMode.BILINEAR,
+    antialias=True,
Had to add antialias support because it would otherwise default to False for tensor inputs. There are no BC requirements, so we could just hard-code antialias=True below in the calls to resize() instead of adding a parameter here, but it doesn't change much. LMK what you prefer.
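To illustrate the trade-off being discussed, here is a minimal sketch (not the torchvision implementation; `resize` below is a stand-in stub) of threading an `antialias` parameter through a transform so tensor inputs don't silently fall back to `antialias=False`:

```python
def resize(image, size, antialias=False):
    # Stand-in for a resize function that, like the historical tensor path,
    # defaults to antialias=False.
    return {"image": image, "size": size, "antialias": antialias}

class ScaleJitter:
    def __init__(self, target_size, antialias=True):
        self.target_size = target_size
        # Defaulting to True here means callers get antialiasing unless
        # they explicitly opt out; the alternative is hard-coding
        # antialias=True in the resize() call below.
        self.antialias = antialias

    def __call__(self, image):
        return resize(image, self.target_size, antialias=self.antialias)

out = ScaleJitter(target_size=(1024, 1024))("img")
```

Either way, the observable behavior for default callers is the same; the parameter only matters if someone wants to opt out.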
IDC. Unless there is some other opinion, let's keep it the way it is.
     t.append(transforms)
     transforms = T.Compose(t)

     dataset = CocoDetection(img_folder, ann_file, transforms=transforms)
I wonder if we could get rid of this custom CocoDetection dataset here. Ideally we would always call wrap_dataset_for_transforms_v2 and just "unwrap" the datapoints classes into pure-tensors etc...? But we can't use it without silencing the V2 warning first :/
Not sure what to do to clean that up.
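A toy sketch of the "wrap then unwrap" idea being floated here (the class and function names are illustrative stand-ins, not the actual torchvision API): a wrapper dataset yields datapoint-like objects, and an unwrap step converts them back into plain values.

```python
class FakeDatapoint:
    # Stand-in for a datapoints class (e.g. a wrapped bounding box tensor).
    def __init__(self, data):
        self.data = data

def unwrap(sample):
    # Recursively convert datapoint-like wrappers back into their plain
    # payloads, leaving everything else untouched.
    if isinstance(sample, FakeDatapoint):
        return sample.data
    if isinstance(sample, dict):
        return {k: unwrap(v) for k, v in sample.items()}
    if isinstance(sample, (list, tuple)):
        return type(sample)(unwrap(v) for v in sample)
    return sample

target = {"boxes": FakeDatapoint([[0, 0, 10, 10]]), "image_id": 3}
plain = unwrap(target)
```

The sticking point noted above remains: in the real codebase this would still require silencing the V2 warning before calling wrap_dataset_for_transforms_v2.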
         return True
     return False

     if not isinstance(dataset, torchvision.datasets.CocoDetection):
Instead of removing this (seemingly useless check) I could just add the same workaround as elsewhere, i.e. add

    or isinstance(
        getattr(dataset, "_dataset", None), torchvision.datasets.CocoDetection
    ):
We still have #7239. Maybe we should go at it again?
     model_time = time.time() - model_time

-    res = {target["image_id"].item(): output for target, output in zip(targets, outputs)}
+    res = {target["image_id"]: output for target, output in zip(targets, outputs)}
This is for consistency with the V2 wrapper which leaves image_id as an int. In our references we used to manually wrap it into a tensor (why, IDK), and I removed that as well below in coco_utils
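A minimal illustration of this change (the toy `IntTensor` class below stands in for a 0-dim `torch.Tensor`): with the v2 wrapper, `image_id` is already a plain int, so the `.item()` call is no longer needed and both styles produce the same dict.

```python
class IntTensor:
    # Stand-in for a 0-dim torch.Tensor holding an integer.
    def __init__(self, value):
        self.value = value

    def item(self):
        return self.value

outputs = ["prediction_a", "prediction_b"]

# Old style: image_id was manually wrapped into a tensor, so .item() was needed.
targets_v1 = [{"image_id": IntTensor(1)}, {"image_id": IntTensor(2)}]
res_v1 = {t["image_id"].item(): o for t, o in zip(targets_v1, outputs)}

# New style: image_id stays a plain int and can key the dict directly.
targets_v2 = [{"image_id": 1}, {"image_id": 2}]
res_v2 = {t["image_id"]: o for t, o in zip(targets_v2, outputs)}
```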
pmeier left a comment

LGTM, thanks Nicolas! I'm OK with not testing it on all configurations right now, but just to make sure: you have tested it on at least one and it works, correct?
     def get_coco_api_from_dataset(dataset):
         # FIXME: This is... awful?
Yeah. Happy for you to address it here, but not required.
I would if I knew what to do lol. (I'm gonna leave this out for now I think)
     for images, targets in metric_logger.log_every(data_loader, print_freq, header):
         images = list(image.to(device) for image in images)
-        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
+        targets = [{k: v.to(device) if isinstance(v, torch.Tensor) else v for k, v in t.items()} for t in targets]
This is for the image ID, right?
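Yes: once `image_id` is a plain int, calling `.to(device)` on every value would raise `AttributeError`. A dependency-free sketch of why the isinstance guard is needed (`FakeTensor` stands in for `torch.Tensor`):

```python
class FakeTensor:
    # Stand-in for torch.Tensor, just enough to demonstrate the guard.
    def __init__(self, data):
        self.data = data

    def to(self, device):
        # Pretend device move; a real torch.Tensor returns a new tensor.
        return FakeTensor(self.data)

targets = [{"boxes": FakeTensor([[0, 0, 5, 5]]), "image_id": 42}]

# Only move values that actually are tensors; pass ints (etc.) through.
moved = [
    {k: v.to("cuda") if isinstance(v, FakeTensor) else v for k, v in t.items()}
    for t in targets
]
```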
     # TODO: FixedSizeCrop below doesn't work on tensors!
     reference_transforms.FixedSizeCrop(size=(1024, 1024), fill=mean),
In v2 we have RandomCrop, which does what FixedSizeCrop does minus the clamping and sanitizing of bounding boxes.
     if use_v2:
         transforms += [
             T.ConvertBoundingBoxFormat(datapoints.BoundingBoxFormat.XYXY),
             T.SanitizeBoundingBox(),
Do we also need ClampBoundingBox here?
I don't think so since we established that all transforms should clamp already (those that need to, at least)?
references/detection/train.py (Outdated)

-dataset, num_classes = get_dataset(args.dataset, "train", get_transform(True, args), args.data_path)
 dataset_test, _ = get_dataset(args.dataset, "val", get_transform(False, args), args.data_path)
+dataset, num_classes = get_dataset(args.dataset, "train", get_transform(True, args), args.data_path, args.use_v2)
Not required here, but can we maybe use keyword args here? The call is really hard to parse.
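For illustration, here is what the suggestion amounts to; the `get_dataset` signature below is guessed from the call site, and the return values are placeholders:

```python
# Hypothetical signature matching the call above (not the actual helper).
def get_dataset(name, image_set, transform, data_path, use_v2=False):
    num_classes = 91 if name == "coco" else 21  # placeholder values
    return f"{name}/{image_set}", num_classes

# Positional call: the reader has to remember what each argument means.
dataset, num_classes = get_dataset("coco", "train", None, "/data/coco", True)

# Keyword form: self-documenting at the call site.
dataset, num_classes = get_dataset(
    "coco",
    image_set="train",
    transform=None,
    data_path="/data/coco",
    use_v2=True,
)
```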
Test failures are real. Maybe related to …

Thanks for the review! Which failures are relevant? We have no tests for the …

We do. We have v2 consistency tests that check the transforms we have added to our package against the stuff that we have in our references:

I've sent a fix in 72da655.

Yeah, I tested it on a few combinations, but I wouldn't be surprised if there's a few edge-cases I missed. We'll find out soon enough. Thanks for the review!
Hey @NicolasHug! You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py |
Summary:
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Reviewed By: matteobettini
Differential Revision: D48642258
fbshipit-source-id: 7d99fb2ea5effde79ee59d259f902fcf145ae64c
This probably won't be an easy review as there's a bunch of renaming, sorry.
There are a few things I've intentionally left out of this PR:
It's also fairly likely that I broke some stuff in the process, or that not everything works 100% correctly straight away. It's very hard to test everything considering we have no unit tests, and the cross-product of all combinations is large. It shouldn't be too critical though, as we'll address any outstanding issues in the near future as we run more training jobs.