Our current implementation of query_chw requires an image somewhere in the sample, since that is the only way to extract the number of channels.
    if isinstance(item, (features.Image, PIL.Image.Image)) or is_simple_tensor(item)
However, there are quite a few transformations that only require the image size, i.e. height and width:
    _, height, width = query_chw(sample)
Although these transforms should technically also work on samples that contain only BoundingBox or SegmentationMask features, they currently fail because query_chw finds no image to query.
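To make the failure mode concrete, here is a minimal sketch of the behavior described above. It does not use the real torchvision code: Image and BoundingBox are hypothetical stand-ins for the prototype features, the sample is a flat list, and the error messages are assumptions.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Image:
    # Hypothetical stand-in for features.Image.
    num_channels: int
    height: int
    width: int


@dataclass(frozen=True)
class BoundingBox:
    # Hypothetical stand-in for features.BoundingBox.
    image_size: Tuple[int, int]  # (height, width) of the image the box refers to


def query_chw(sample: List[Any]) -> Tuple[int, int, int]:
    # Only images are considered, mirroring the isinstance filter quoted above.
    chws = {
        (item.num_channels, item.height, item.width)
        for item in sample
        if isinstance(item, Image)
    }
    if not chws:
        raise TypeError("No image was found in the sample")
    if len(chws) > 1:
        raise TypeError("Found images with different sizes in the sample")
    return chws.pop()


# Works: the sample contains an image.
_, height, width = query_chw([Image(3, 32, 48), BoundingBox((32, 48))])

# Fails: the box knows the image size, but query_chw never looks at it.
try:
    query_chw([BoundingBox((32, 48))])
except TypeError as error:
    print(f"TypeError: {error}")
```

A transform that only needs height and width hits the second case even though the bounding box carries everything it needs.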
I see two possible solutions to this:
1. Split query_chw into query_c and query_hw. The former will only work with images, while the latter works with images as well as BoundingBox and SegmentationMask features. This was already implemented in a PR of mine, but I can't find it now. If someone does, feel free to link.
2. Option 1 requires us to go through the sample twice whenever we need both the number of channels and the image size. If we find that we need to reduce the number of passes, we could instead allow query_chw to return None for the number of channels. In that case, I would introduce another flag needs_c: bool = False that, if set, raises an error when no number of channels is found. That would avoid a lot of duplicated error checking in the transforms, like

       c, h, w = query_chw(sample)
       if c is None:
           raise TypeError("I need number of channels, but found no image!")

   in favor of

       c, h, w = query_chw(sample, needs_c=True)
       assert c is not None
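Both options could look roughly like the sketch below. This is an illustration under assumptions, not the implementation from the PR mentioned above: Image, BoundingBox, and SegmentationMask are hypothetical stand-ins for the prototype features, the sample is a flat list, and the helper _hw and all error messages are made up for this example.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass(frozen=True)
class Image:  # hypothetical stand-in for features.Image
    num_channels: int
    height: int
    width: int


@dataclass(frozen=True)
class BoundingBox:  # hypothetical stand-in for features.BoundingBox
    image_size: Tuple[int, int]  # (height, width)


@dataclass(frozen=True)
class SegmentationMask:  # hypothetical stand-in for features.SegmentationMask
    height: int
    width: int


def _hw(item: Any) -> Optional[Tuple[int, int]]:
    # Hypothetical helper: spatial size of any item that carries one.
    if isinstance(item, (Image, SegmentationMask)):
        return item.height, item.width
    if isinstance(item, BoundingBox):
        return item.image_size
    return None


def _unique_hw(sizes: set) -> Tuple[int, int]:
    if not sizes:
        raise TypeError("No image, bounding box, or mask was found in the sample")
    if len(sizes) > 1:
        raise TypeError(f"Found multiple sizes in the sample: {sorted(sizes)}")
    return next(iter(sizes))


def query_hw(sample: List[Any]) -> Tuple[int, int]:
    """Option 1: works with images, bounding boxes, and masks."""
    return _unique_hw({hw for hw in map(_hw, sample) if hw is not None})


def query_c(sample: List[Any]) -> int:
    """Option 1: only images can provide the number of channels."""
    channels = {item.num_channels for item in sample if isinstance(item, Image)}
    if len(channels) != 1:
        raise TypeError(f"Expected exactly one channel count, found {sorted(channels)}")
    return channels.pop()


def query_chw(
    sample: List[Any], needs_c: bool = False
) -> Tuple[Optional[int], int, int]:
    """Option 2: a single pass; c is None if the sample contains no image."""
    channels, sizes = set(), set()
    for item in sample:
        hw = _hw(item)
        if hw is not None:
            sizes.add(hw)
        if isinstance(item, Image):
            channels.add(item.num_channels)
    if len(channels) > 1:
        raise TypeError(f"Found multiple channel counts: {sorted(channels)}")
    if needs_c and not channels:
        raise TypeError("I need number of channels, but found no image!")
    h, w = _unique_hw(sizes)
    return (channels.pop() if channels else None), h, w


boxes_and_masks = [BoundingBox((32, 48)), SegmentationMask(32, 48)]
print(query_hw(boxes_and_masks))
print(query_chw(boxes_and_masks))
print(query_chw([Image(3, 32, 48)] + boxes_and_masks, needs_c=True))
```

With option 2, transforms that only need the size keep calling query_chw as before and ignore c, while transforms that need channels pass needs_c=True and get the error raised in one central place.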
Code referenced above:
- vision/torchvision/prototype/transforms/_utils.py, line 39 in f9966d2 (the isinstance check)
- vision/torchvision/prototype/transforms/_geometry.py, line 95 in f9966d2 (the query_chw call)

cc @vfdev-5 @datumbox @bjuncek