
Remove unnecessary views of position_ids#26059

Merged
ArthurZucker merged 2 commits into huggingface:main from ramiro050:llama-remove-view on Oct 6, 2023

Conversation

@ramiro050
Contributor

@ramiro050 ramiro050 commented Sep 8, 2023

When position_ids is None, its value is generated using torch.arange, which creates a tensor of size (seq_length + past_key_values_length) - past_key_values_length = seq_length. The tensor is then unsqueezed, resulting in a tensor of shape (1, seq_length). This means that the final view to a tensor of shape (-1, seq_length) is a no-op.

Similarly, when position_ids is not None, the documentation for the models specifies that it should already be of shape (batch_size, seq_length), making the view in this case unnecessary as well.

This commit removes the unnecessary views.
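The no-op claim above can be checked with a small sketch (variable names are illustrative, following the commit message, where `past_key_values_length` plays the role of the initial value):

```python
import torch

seq_length, past_key_values_length = 5, 3  # illustrative values

# Mirrors the code path described above:
position_ids = torch.arange(
    past_key_values_length,
    seq_length + past_key_values_length,
    dtype=torch.long,
)
position_ids = position_ids.unsqueeze(0)    # shape (1, seq_length)
viewed = position_ids.view(-1, seq_length)  # the view this PR removes

# Same shape before and after the view:
assert viewed.shape == position_ids.shape == (1, seq_length)
# `view` also shares storage with its input, so no copy happens either:
assert viewed.data_ptr() == position_ids.data_ptr()
```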

@gante @ArthurZucker

@ramiro050
Contributor Author

For context, I work in the Torch-MLIR project, which converts PyTorch models into MLIR. Handling the view op in its most general form (especially the use of -1s) has been a big challenge for us for a while now (there's some work we need to do to improve the shape information propagation and dynamic shape support in our project). While the goal is to support this as is in the future, in the short term, removing this view from the computation would unblock us to run LLaMA in our compiler.

Collaborator

@ArthurZucker ArthurZucker left a comment


Fine with me, could you confirm that the slow tests pass using RUN_SLOW=1 pytest tests/models/llama?

@ArthurZucker
Collaborator

pinging @gante for a second look

@ArthurZucker ArthurZucker requested a review from gante September 8, 2023 20:02
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@ramiro050
Contributor Author

Running RUN_SLOW=1 pytest tests/models/llama, I get 172 passed, 47 skipped, 35 warnings. The warnings are either due to deprecation or from using PyTorch's tracer; they don't seem related to the changes in this PR. The skipped tests are these:

test_modeling_llama.py::LlamaModelTest::test_cpu_offload SKIPPED (test requires CUDA)
test_modeling_llama.py::LlamaModelTest::test_disk_offload SKIPPED (test requires CUDA)
test_modeling_llama.py::LlamaModelTest::test_equivalence_flax_to_pt SKIPPED (test is PT+FLAX test)
test_modeling_llama.py::LlamaModelTest::test_equivalence_pt_to_flax SKIPPED (test is PT+FLAX test)
test_modeling_llama.py::LlamaModelTest::test_model_parallel_beam_search SKIPPED (test requires multiple GPUs)
test_modeling_llama.py::LlamaModelTest::test_model_parallel_equal_results SKIPPED (test requires multiple GPUs)
test_modeling_llama.py::LlamaModelTest::test_model_parallelism SKIPPED (test requires multiple GPUs)
test_modeling_llama.py::LlamaModelTest::test_model_parallelization SKIPPED (test requires multiple GPUs)
test_modeling_llama.py::LlamaModelTest::test_multi_gpu_data_parallel_forward SKIPPED (test requires multiple GPUs)
test_modeling_llama.py::LlamaModelTest::test_pipeline_audio_classification SKIPPED (LlamaModelTest::test_pipeline_audio_classification is skipped: `audio-classification` is not in `self.pipeline_model_mapping` for `...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_automatic_speech_recognition SKIPPED (LlamaModelTest::test_pipeline_automatic_speech_recognition is skipped: `automatic-speech-recognition` is not in `self.pipel...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_conversational SKIPPED (LlamaModelTest::test_pipeline_conversational is skipped: `conversational` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.)
test_modeling_llama.py::LlamaModelTest::test_pipeline_depth_estimation SKIPPED (LlamaModelTest::test_pipeline_depth_estimation is skipped: `depth-estimation` is not in `self.pipeline_model_mapping` for `LlamaModelTe...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_document_question_answering SKIPPED (LlamaModelTest::test_pipeline_document_question_answering is skipped: `document-question-answering` is not in `self.pipeline...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_fill_mask SKIPPED (LlamaModelTest::test_pipeline_fill_mask is skipped: `fill-mask` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.)
test_modeling_llama.py::LlamaModelTest::test_pipeline_image_classification SKIPPED (LlamaModelTest::test_pipeline_image_classification is skipped: `image-classification` is not in `self.pipeline_model_mapping` for `...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_image_segmentation SKIPPED (LlamaModelTest::test_pipeline_image_segmentation is skipped: `image-segmentation` is not in `self.pipeline_model_mapping` for `LlamaM...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_image_to_text SKIPPED (LlamaModelTest::test_pipeline_image_to_text is skipped: `image-to-text` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.)
test_modeling_llama.py::LlamaModelTest::test_pipeline_mask_generation SKIPPED (`run_pipeline_test` is currently not implemented.)
test_modeling_llama.py::LlamaModelTest::test_pipeline_object_detection SKIPPED (LlamaModelTest::test_pipeline_object_detection is skipped: `object-detection` is not in `self.pipeline_model_mapping` for `LlamaModelTe...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_question_answering SKIPPED (LlamaModelTest::test_pipeline_question_answering is skipped: `question-answering` is not in `self.pipeline_model_mapping` for `LlamaM...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_summarization SKIPPED (LlamaModelTest::test_pipeline_summarization is skipped: `summarization` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.)
test_modeling_llama.py::LlamaModelTest::test_pipeline_table_question_answering SKIPPED (LlamaModelTest::test_pipeline_table_question_answering is skipped: `table-question-answering` is not in `self.pipeline_model_ma...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_text2text_generation SKIPPED (LlamaModelTest::test_pipeline_text2text_generation is skipped: `text2text-generation` is not in `self.pipeline_model_mapping` for `...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_text_to_audio SKIPPED (LlamaModelTest::test_pipeline_text_to_audio is skipped: `text-to-audio` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.)
test_modeling_llama.py::LlamaModelTest::test_pipeline_token_classification SKIPPED (LlamaModelTest::test_pipeline_token_classification is skipped: `token-classification` is not in `self.pipeline_model_mapping` for `...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_translation SKIPPED (LlamaModelTest::test_pipeline_translation is skipped: `translation` is not in `self.pipeline_model_mapping` for `LlamaModelTest`.)
test_modeling_llama.py::LlamaModelTest::test_pipeline_video_classification SKIPPED (LlamaModelTest::test_pipeline_video_classification is skipped: `video-classification` is not in `self.pipeline_model_mapping` for `...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_visual_question_answering SKIPPED (LlamaModelTest::test_pipeline_visual_question_answering is skipped: `visual-question-answering` is not in `self.pipeline_model...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_zero_shot_audio_classification SKIPPED (LlamaModelTest::test_pipeline_zero_shot_audio_classification is skipped: `zero-shot-audio-classification` is not in `self...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_zero_shot_image_classification SKIPPED (LlamaModelTest::test_pipeline_zero_shot_image_classification is skipped: `zero-shot-image-classification` is not in `self...)
test_modeling_llama.py::LlamaModelTest::test_pipeline_zero_shot_object_detection SKIPPED (LlamaModelTest::test_pipeline_zero_shot_object_detection is skipped: `zero-shot-object-detection` is not in `self.pipeline_mo...)
test_modeling_llama.py::LlamaModelTest::test_pt_tf_model_equivalence SKIPPED (test is PT+TF test)
test_modeling_llama.py::LlamaModelTest::test_save_load_fast_init_from_base SKIPPED (LLaMA buffers include complex numbers, which breaks this test)
test_modeling_llama.py::LlamaIntegrationTest::test_model_13b_greedy_generation SKIPPED (Model is curently gated)
test_modeling_llama.py::LlamaIntegrationTest::test_model_13b_logits SKIPPED (Logits are not exactly the same, once we fix the instabalities somehow, will update!)
test_modeling_llama.py::LlamaIntegrationTest::test_model_13bf_logits SKIPPED (Logits are not exactly the same, once we fix the instabalities somehow, will update!)
test_modeling_llama.py::LlamaIntegrationTest::test_model_70b_logits SKIPPED (Logits are not exactly the same, once we fix the instabalities somehow, will update! Also it is gonna be a `too_slow` test)
test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits SKIPPED (Logits are not exactly the same, once we fix the instabalities somehow, will update!)
test_modeling_llama.py::CodeLlamaIntegrationTest::test_model_7b_logits SKIPPED (test requires CUDA)
test_tokenization_llama.py::LlamaTokenizationTest::test_batch_encode_plus_tensors SKIPPED (test is PT+TF test)
test_tokenization_llama.py::LlamaTokenizationTest::test_pickle_subword_regularization_tokenizer SKIPPED (worker 'gw4' crashed on CI, passing locally.)
test_tokenization_llama.py::LlamaTokenizationTest::test_save_pretrained SKIPPED (Let's wait for the fast tokenizer!)
test_tokenization_llama.py::LlamaTokenizationTest::test_save_slow_from_fast_and_reload_fast SKIPPED (Unfortunately way too slow to build a BPE with SentencePiece.)
test_tokenization_llama.py::LlamaTokenizationTest::test_subword_regularization_tokenizer SKIPPED (worker 'gw4' crashed on CI, passing locally.)
test_tokenization_llama.py::LlamaTokenizationTest::test_tf_encode_plus_sent_to_model SKIPPED (test requires TensorFlow)
test_tokenization_llama.py::LlamaIntegrationTest::test_integration_test_xnli SKIPPED (RUN_TOKENIZER_INTEGRATION=1 to run tokenizer integration tests)

@ramiro050
Contributor Author

@gante, bumping this :)

@ArthurZucker
Copy link
Collaborator

He is off for a while so he won't be able to review this! Let me ping @fxmarty for a second look! 👁️

Contributor

@fxmarty fxmarty left a comment


LGTM, saw the same thing. I believe other architectures have the same issue, e.g.

position_ids = position_ids.unsqueeze(0).view(-1, input_shape[-1])

It could be nice to fix it in this PR as well. Some that may be impacted: codegen, gpt2, gpt_neo, gpt_neox, gptj, gpt_bigcode, imagegpt, llama.

@ramiro050 Beware that the if position_ids is None: code path should not be trusted for batched generation, as it assumes that all generated sequences have the same length.
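To illustrate the caveat: for a left-padded batch, correct position ids depend on each sequence's own padding, so they cannot come from a single torch.arange. A sketch of a common cumsum-over-attention-mask pattern (similar to what several transformers models do in prepare_inputs_for_generation; exact code varies by model):

```python
import torch

# Left-padded batch: two sequences with different real lengths.
attention_mask = torch.tensor([[0, 0, 1, 1, 1],
                               [1, 1, 1, 1, 1]])

# Derive per-sequence position ids from the mask, so padding does not
# shift the positions of the real tokens:
position_ids = attention_mask.long().cumsum(-1) - 1
# Padded slots get a dummy (but valid) position id:
position_ids.masked_fill_(attention_mask == 0, 1)

print(position_ids)
# tensor([[1, 1, 0, 1, 2],
#         [0, 1, 2, 3, 4]])
```

Note how the first real token of each row starts at position 0, regardless of how much padding precedes it.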

@ramiro050
Contributor Author

It could be nice to fix it in this PR as well. Some that may be impacted: codegen, gpt2, gpt_neo, gpt_neox, gptj, gpt_bigcode, imagegpt, llama.

Done! PTAL

@ramiro050 ramiro050 changed the title from "Remove unnecessary view of position_ids in modeling_llama" to "Remove unnecessary views of position_ids" on Sep 20, 2023
When `position_ids` is `None`, its value is generated using
`torch.arange`, which creates a tensor of size `(seq_length +
past_key_values_length) - past_key_values_length = seq_length`. The
tensor is then unsqueezed, resulting in a tensor of shape `(1,
seq_length)`. This means that the last `view` to a tensor of shape
`(-1, seq_length)` is a no-op.

This commit removes the unnecessary view.
@ramiro050
Contributor Author

@ArthurZucker @fxmarty, bumping this

Collaborator

@ArthurZucker ArthurZucker left a comment


Hey! Just wondering why we removed the casting to long, which I think is required for backward compatibility!

Comment on lines -478 to -480
if position_ids is not None:
position_ids = position_ids.view(-1, input_shape[-1]).long()

Collaborator


hey! Not sure why this was removed as well? Casting to long is probably required / might be breaking if removed!

Contributor Author

@ramiro050 ramiro050 Sep 28, 2023


The docstring mentions that position_ids should be a tensor of type torch.LongTensor, so there should not be a need for the cast. (position_ids is also marked as having type Optional[torch.LongTensor] in the signature of forward)
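A quick check of the internally generated path supports this (a minimal sketch): torch.arange with integer bounds already yields torch.long, so the removed cast was redundant there, and user-supplied tensors are documented as torch.LongTensor.

```python
import torch

# torch.arange with integer arguments defaults to int64 (torch.long):
position_ids = torch.arange(0, 5).unsqueeze(0)
assert position_ids.dtype == torch.long

# Casting again is therefore a no-op on this path:
assert position_ids.long().dtype == position_ids.dtype
```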

@ramiro050
Contributor Author

@ArthurZucker, bumping this

Contributor

@fxmarty fxmarty left a comment


LGTM

Collaborator

@ArthurZucker ArthurZucker left a comment


Thanks for the fix

@ArthurZucker ArthurZucker merged commit 8878eb1 into huggingface:main Oct 6, 2023
@ramiro050 ramiro050 deleted the llama-remove-view branch October 9, 2023 18:09