Fuyu processing update by amyeroberts · Pull Request #27133 · huggingface/transformers

amyeroberts · 2023-10-29T19:44:03Z

What does this PR do?

This PR builds upon #27007, ticking off some elements in the TODO list and bringing the processor and image processor more in line with the expected patterns in the library.

amyeroberts · 2023-10-29T19:46:45Z

@pcuenca Here's the draft PR for updating the image processor. In relation to your PR with the box coordinate transformations, you'll notice that I've removed the target_height and target_width attributes and have replaced them with the dictionary size. This is to reflect the pattern in other image processors.
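For readers following along, here is a minimal sketch of the `size` dictionary pattern mentioned above. It is not the code from this PR: the helper name `resolve_size` is hypothetical, and the default height of 1080 is an assumption (only the old `target_width` default of 1920 appears in the diff).

```python
from typing import Dict, Optional

# Assumed defaults for illustration: 1920 matches the old `target_width`
# default shown in the diff; the height of 1080 is an assumption.
DEFAULT_SIZE = {"height": 1080, "width": 1920}


def resolve_size(
    size: Optional[Dict[str, int]] = None,
    target_height: Optional[int] = None,
    target_width: Optional[int] = None,
) -> Dict[str, int]:
    """Hypothetical helper: map the old keyword pair onto a `size` dict."""
    if size is not None:
        missing = {"height", "width"} - set(size)
        if missing:
            raise ValueError(f"size is missing keys: {sorted(missing)}")
        return {"height": int(size["height"]), "width": int(size["width"])}
    if target_height is not None or target_width is not None:
        if target_height is None or target_width is None:
            raise ValueError("target_height and target_width must be given together")
        return {"height": int(target_height), "width": int(target_width)}
    # Nothing supplied: fall back to a copy of the defaults.
    return dict(DEFAULT_SIZE)
```

Code that previously read the two attributes separately would instead read `size["height"]` and `size["width"]`.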

amyeroberts · 2023-10-29T19:47:03Z

cc @molbap

HuggingFaceDocBuilderDev · 2023-10-29T20:06:02Z

The documentation is not available anymore as the PR was closed or merged.

pcuenca · 2023-10-29T23:03:41Z

@amyeroberts Nice! I'll update accordingly.

amyeroberts · 2023-10-30T10:58:35Z

src/transformers/models/fuyu/image_processing_fuyu.py

+        patches = patches.reshape(batch_size, -1, channels * patch_height * patch_width)
+        return patches
+
+    def preprocess_with_tokenizer_info(


This was renamed to preprocess_with_tokenizer_info to reflect the naming pattern used by other image processors: preprocess creates the model inputs, and post_process_xxx processes the model outputs for a specific downstream task.

Ok, good to know! thanks for the explanation

amyeroberts · 2023-10-30T10:59:50Z

src/transformers/models/fuyu/processing_fuyu.py

-# Copied from transformers.models.detr.image_processing_detr.max_across_indices
-def max_across_indices(values: Iterable[Any]) -> List[Any]:
-    """
-    Return the maximum value across all indices of an iterable of values.
-    """
-    return [max(values_i) for values_i in zip(*values)]
-
-
-# Copied from transformers.models.detr.image_processing_detr.get_max_height_width
-def get_max_height_width(
-    images: List[np.ndarray], input_data_format: Optional[Union[str, ChannelDimension]] = None
-) -> List[int]:
-    """
-    Get the maximum height and width across all images in a batch.
-    """
-    if input_data_format is None:
-        input_data_format = infer_channel_dimension_format(images[0])
-
-    if input_data_format == ChannelDimension.FIRST:
-        _, max_height, max_width = max_across_indices([img.shape for img in images])
-    elif input_data_format == ChannelDimension.LAST:
-        max_height, max_width, _ = max_across_indices([img.shape for img in images])
-    else:
-        raise ValueError(f"Invalid channel dimension format: {input_data_format}")
-    return (max_height, max_width)
-
-
-# Copied from transformers.models.detr.image_processing_detr.make_pixel_mask
-def make_pixel_mask(
-    image: np.ndarray, output_size: Tuple[int, int], input_data_format: Optional[Union[str, ChannelDimension]] = None
-) -> np.ndarray:
-    """
-    Make a pixel mask for the image, where 1 indicates a valid pixel and 0 indicates padding.
-
-    Args:
-        image (`np.ndarray`):
-            Image to make the pixel mask for.
-        output_size (`Tuple[int, int]`):
-            Output size of the mask.
-    """
-    input_height, input_width = get_image_size(image, channel_dim=input_data_format)
-    mask = np.zeros(output_size, dtype=np.int64)
-    mask[:input_height, :input_width] = 1
-    return mask


These were removed as they didn't appear to be used anywhere in the processing logic

amyeroberts · 2023-10-30T11:09:59Z

src/transformers/models/fuyu/processing_fuyu.py

-    return mask
-
-
-class FuyuBatchEncoding(BatchEncoding):


This was replaced with BatchFeature as the processor contains image_patches which are of float type

pcuenca

Some minor nits.

pcuenca · 2023-10-31T18:05:48Z

src/transformers/models/fuyu/image_processing_fuyu.py

-        target_width: int = 1920,
+        do_resize: bool = True,
+        size: Optional[Dict[str, int]] = None,
+        resample: PILImageResampling = PILImageResampling.BILINEAR,  # FIXME check default value


Suggested change

-        resample: PILImageResampling = PILImageResampling.BILINEAR,  # FIXME check default value
+        resample: PILImageResampling = PILImageResampling.BILINEAR,

This is how it was done in the original code: https://huggingface.co/adept-hf-collab/adept-mm/blob/736c6b570b2a9c0367a3266746fd1f53cfff0a2b/mm-inference-for-hf/multimodal/data/image_utils.py#L208

BILINEAR seems correct, as our resizing is always done on PIL images and antialias is True in that case.

pcuenca · 2023-10-31T18:26:19Z

src/transformers/models/fuyu/processing_fuyu.py

+    if is_vision_available():
+        from .image_processing_fuyu import FuyuImageProcessor


Suggested change

if is_vision_available():

from .image_processing_fuyu import FuyuImageProcessor

from .image_processing_fuyu import FuyuImageProcessor

Otherwise I think import FuyuProcessor would fail if torchvision is not installed.

This helped uncover a bug! The image processor was being reset, overwriting the user's input here. If we get rid of that, then we don't need this import at all

Ohhhh, right!
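The exact line that reset the image processor is not shown in this thread, so the following is only a sketch of the bug pattern being described, using hypothetical stand-in classes (`ToyImageProcessor`, `BuggyProcessor`, `FixedProcessor`) rather than the real Fuyu classes.

```python
class ToyImageProcessor:
    """Stand-in for an image processor; it only stores a `size` dict."""

    def __init__(self, size=None):
        # Assumed default size, for illustration only.
        self.size = size if size is not None else {"height": 1080, "width": 1920}


class BuggyProcessor:
    def __init__(self, image_processor, tokenizer):
        self.image_processor = image_processor
        self.tokenizer = tokenizer
        # Bug pattern: re-creating the image processor here silently
        # discards whatever configuration the caller passed in.
        self.image_processor = ToyImageProcessor()


class FixedProcessor:
    def __init__(self, image_processor, tokenizer):
        # Keep the caller's instance; no import of the concrete class is needed.
        self.image_processor = image_processor
        self.tokenizer = tokenizer


custom = ToyImageProcessor(size={"height": 160, "width": 320})
buggy = BuggyProcessor(custom, tokenizer=None)
fixed = FixedProcessor(custom, tokenizer=None)
```

In this sketch `buggy.image_processor.size` falls back to the default while `fixed.image_processor` is the caller's own object, which is why removing the reset also removes the need for the import.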

pcuenca · 2023-10-31T18:30:45Z

tests/models/fuyu/test_processing_fuyu.py

+        # Batch of two images - different sizes
+        images = [self.bus_image_pil, self.bus_image_pil.resize((64, 300))]
+        processor_outputs = self.processor(text=[self.text_prompt, self.text_prompt], images=images)
+        # FIXME - test outputs


To be completed, this succeeds now.

I've added a test which checks the processing of an individual resized image, and then checks the padding for two differently sized images in a batch.
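To make the batch-padding check concrete, here is a small pure-Python sketch of padding two differently sized images to a common shape. It is not the test added in this PR: `pad_batch` is a hypothetical helper, images are modelled as nested lists, and both the bottom/right padding side and the pad value of 1.0 are assumptions.

```python
from typing import List, Tuple

Image = List[List[float]]  # single-channel image as rows of pixel values


def pad_batch(
    images: List[Image], pad_value: float = 1.0
) -> Tuple[List[Image], List[Tuple[int, int]]]:
    """Pad each image on the bottom and right to the batch's max height and width."""
    max_h = max(len(img) for img in images)
    max_w = max(len(img[0]) for img in images)
    padded, unpadded_sizes = [], []
    for img in images:
        h, w = len(img), len(img[0])
        # Extend every row to the right, then append full rows at the bottom.
        rows = [row + [pad_value] * (max_w - w) for row in img]
        rows += [[pad_value] * max_w for _ in range(max_h - h)]
        padded.append(rows)
        # Remember the size before padding, e.g. for later coordinate handling.
        unpadded_sizes.append((h, w))
    return padded, unpadded_sizes


wide = [[0.0] * 3 for _ in range(2)]  # 2 rows x 3 columns
tall = [[0.0] * 2 for _ in range(4)]  # 4 rows x 2 columns
padded, sizes = pad_batch([wide, tall])
```

A test along these lines would assert that every padded image shares one shape and that the recorded unpadded sizes match the inputs.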

molbap · 2023-11-01T15:01:42Z

LGTM, I'll add some tests related to model in my PR! Ok to merge to #27007 when amyeroberts#113 is merged, and I'll add a model tester there

Fuyu processing: handle coordinates

* Fix Fuyu image scaling bug
  It could produce negative padding and hence inference errors for certain image sizes.
* initial rework commit
* add batching capabilities, refactor image processing
* add functional batching for a list of images and texts
* make args explicit
* Fuyu processing update (#27133)
* Add file headers
* Add file headers
* First pass - preprocess method with standard args
* First pass image processor rework
* Small tweaks
* More args and docstrings
* Tidying iterating over batch
* Tidying up
* Modify to have quick tests (for now)
* Fix up
* BatchFeature
* Passing tests
* Add tests for processor
* Sense check when patchifying
* Add some tests
* FuyuBatchFeature
* Post-process box coordinates
* Update to `size` in processor
* Remove unused and duplicate constants
* Store unpadded dims after resize
* Fix up
* Return FuyuBatchFeature
* Get unpadded sizes after resize
* Update exception
* Fix return
* Convert input `<box>` coordinates to model format.
* Post-process point coords, support multiple boxes/points in a single sequence
* Replace constants
* Update src/transformers/models/fuyu/image_processing_fuyu.py
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Preprocess List[List[image]]
* Update src/transformers/models/fuyu/image_processing_fuyu.py
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update to Amy's latest state.
* post-processing returns a list of tensors
* Fix error when target_sizes is None
  Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com>
* Update src/transformers/models/fuyu/image_processing_fuyu.py
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update src/transformers/models/fuyu/image_processing_fuyu.py
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update src/transformers/models/fuyu/image_processing_fuyu.py
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update src/transformers/models/fuyu/image_processing_fuyu.py
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Review comments
* Update src/transformers/models/fuyu/image_processing_fuyu.py
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Fix up
* Fix up

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-72-126.ec2.internal>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com>

* Fix conflicts in fuyu_follow_up_image_processing (#27228)
  fixing conflicts and updating on main
* Revert "Fix conflicts in fuyu_follow_up_image_processing" (#27232)
  Revert "Fix conflicts in fuyu_follow_up_image_processing (#27228)"
  This reverts commit acce10b.

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-72-126.ec2.internal>

* Fix Fuyu image scaling bug It could produce negative padding and hence inference errors for certain image sizes. * initial rework commit * add batching capabilities, refactor image processing * add functional batching for a list of images and texts * make args explicit * Fuyu processing update (huggingface#27133) * Add file headers * Add file headers * First pass - preprocess method with standard args * First pass image processor rework * Small tweaks * More args and docstrings * Tidying iterating over batch * Tidying up * Modify to have quick tests (for now) * Fix up * BatchFeature * Passing tests * Add tests for processor * Sense check when patchifying * Add some tests * FuyuBatchFeature * Post-process box coordinates * Update to `size` in processor * Remove unused and duplicate constants * Store unpadded dims after resize * Fix up * Return FuyuBatchFeature * Get unpadded sizes after resize * Update exception * Fix return * Convert input `<box>` coordinates to model format. * Post-process point coords, support multiple boxes/points in a single sequence * Replace constants * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Preprocess List[List[image]] * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update to Amy's latest state. 
* post-processing returns a list of tensors * Fix error when target_sizes is None Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com> * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Review comments * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Fix up * Fix up --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-72-126.ec2.internal> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com> * Fix conflicts in fuyu_follow_up_image_processing (huggingface#27228) fixing conflicts and updating on main * Revert "Fix conflicts in fuyu_follow_up_image_processing" (huggingface#27232) Revert "Fix conflicts in fuyu_follow_up_image_processing (huggingface#27228)" This reverts commit acce10b. --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Ubuntu <ubuntu@ip-172-31-72-126.ec2.internal>
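The commit message above mentions a scaling bug that could produce negative padding. The actual fix is not shown on this page, so the following is only a sketch of one way to keep padding non-negative: scale by the smaller of the two ratios so that neither dimension can overshoot the target. The function name `fit_within` and its return layout are hypothetical.

```python
from typing import Tuple


def fit_within(
    height: int, width: int, target_height: int, target_width: int
) -> Tuple[int, int, int, int]:
    """Return (new_height, new_width, pad_bottom, pad_right) for an aspect-preserving fit."""
    if height <= target_height and width <= target_width:
        # Already fits: no resize, only padding.
        new_h, new_w = height, width
    else:
        # Using the smaller ratio guarantees both dimensions fit the target;
        # scaling by only one ratio could leave the other dimension too large.
        scale = min(target_height / height, target_width / width)
        new_h = min(target_height, max(1, int(height * scale)))
        new_w = min(target_width, max(1, int(width * scale)))
    return new_h, new_w, target_height - new_h, target_width - new_w


# A small image (such as the 64x300 resize used in the test) is only padded.
small = fit_within(300, 64, 1080, 1920)
# An image taller than the target is scaled down by the height ratio.
large = fit_within(2160, 1920, 1080, 1920)
```

Both padding values stay at or above zero because the resized dimensions are clamped to the target.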

amyeroberts and others added 16 commits October 25, 2023 16:27

Add file headers

7bbec19

Add file headers

869e38f

First pass - preprocess method with standard args

90d766d

First pass image processor rework

aa26da6

Small tweaks

63511e9

More args and docstrings

408419c

Tidying iterating over batch

34af3c3

Tidying up

606e3b1

Modify to have quick tests (for now)

5ed4b49

Fix up

8f6b7dd

BatchFeature

854cfee

Passing tests

6e60e2e

Add tests for processor

6230b7e

Sense check when patchifying

930f74f

Add some tests

cf8f878

FuyuBatchFeature

df0fe2f

pcuenca added 3 commits October 30, 2023 09:03

Post-process box coordinates

9fad46a

Update to size in processor

7fa2484

Remove unused and duplicate constants

5cb8639

amyeroberts commented Oct 30, 2023

pcuenca and others added 4 commits October 30, 2023 13:42

Store unpadded dims after resize

8799e21

Fix up

08a3179

Return FuyuBatchFeature

2cd0222

Get unpadded sizes after resize

978093e

pcuenca reviewed Oct 30, 2023

amyeroberts mentioned this pull request Oct 31, 2023

_clamp_coord in FuyuProcessor was not defined #27168

Closed

4 tasks

amyeroberts and others added 2 commits October 31, 2023 12:11

Update src/transformers/models/fuyu/image_processing_fuyu.py

a2e8090

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Preprocess List[List[image]]

e2b723f

pcuenca reviewed Oct 31, 2023

pcuenca and others added 6 commits October 31, 2023 15:57

Merge remote-tracking branch 'amy/fuyu-processing-update' into fuyu-processing-update-coordinates

bfa2797

Update src/transformers/models/fuyu/image_processing_fuyu.py

6335921

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Merge remote-tracking branch 'amy/fuyu-processing-update' into fuyu-processing-update-coordinates

44f65a8

Update to Amy's latest state.

1f3b133

post-processing returns a list of tensors

f87d45d

Fix error when target_sizes is None

2fb0ef8

Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com>

pcuenca reviewed Oct 31, 2023

amyeroberts and others added 7 commits November 1, 2023 10:39

Update src/transformers/models/fuyu/image_processing_fuyu.py

f661909

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Update src/transformers/models/fuyu/image_processing_fuyu.py

1b48d5e

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Update src/transformers/models/fuyu/image_processing_fuyu.py

da3f5c7

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Update src/transformers/models/fuyu/image_processing_fuyu.py

b637da8

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Review comments

36a3fde

Update src/transformers/models/fuyu/image_processing_fuyu.py

b4f501d

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Fix up

61d0ed4

amyeroberts changed the title ~~WIP: Fuyu processing update~~ Fuyu processing update Nov 1, 2023

molbap mentioned this pull request Nov 1, 2023

Fuyu: improve image processing #27007

Merged

9 tasks

amyeroberts mentioned this pull request Nov 1, 2023

[Fuyu] Add tests #27001

Merged

1 task

amyeroberts added 2 commits November 1, 2023 18:00

Merge pull request #113 from pcuenca/fuyu-processing-update-coordinates

c523215

Fuyu processing: handle coordinates

Fix up

5bf76cd

amyeroberts merged commit 584f792 into huggingface:fuyu_follow_up_image_processing Nov 1, 2023

amyeroberts deleted the fuyu-processing-update branch November 1, 2023 18:53
