CI fails with dev dependencies: TypeError: 'NoneType' object is not subscriptable

CI fails with dev dependencies: https://github.com/huggingface/trl/actions/runs/18493152127/job/52691262212
> TypeError: 'NoneType' object is not subscriptable
```python
FAILED tests/test_modeling_geometric_mixture_wrapper.py::TestGeometricMixtureWrapper::test_prepare_inputs_for_generation - TypeError: 'NoneType' object is not subscriptable
```

Stacktrace:
```python
>       inputs = self.wrapper.prepare_inputs_for_generation(input_ids, attention_mask=attention_mask, use_cache=True)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/test_modeling_geometric_mixture_wrapper.py:65: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
trl/models/modeling_base.py:717: in prepare_inputs_for_generation
    model_inputs = self.model.prepare_inputs_for_generation(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.12/site-packages/transformers/generation/utils.py:613: in prepare_inputs_for_generation
    inputs_embeds, input_ids = self._cache_dependant_input_preparation(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Qwen2ForCausalLM(
  (model): Qwen2Model(
    (embed_tokens): Embedding(151665, 8)
    (layers): ModuleList(
      (0-1...-06)
    (rotary_emb): Qwen2RotaryEmbedding()
  )
  (lm_head): Linear(in_features=8, out_features=151665, bias=False)
)
input_ids = tensor([[1, 2, 3, 4, 5]], device='cuda:0'), inputs_embeds = None
cache_position = None

    def _cache_dependant_input_preparation(
        self,
        input_ids: torch.LongTensor,
        inputs_embeds: Optional[torch.FloatTensor],
        cache_position: Optional[torch.LongTensor],
    ) -> tuple[torch.FloatTensor, torch.LongTensor]:
        """
        Generic cache-dependent input preparation
        The code is put in a separate function to allow granular unit testing
        as it needs a different implementation to be exportable.
    
        If we have cache: let's slice `input_ids` through `cache_position`, to keep only the unprocessed tokens
        - Exception 1: when passing input_embeds, input_ids may be missing entries
        - Exception 2: some generation methods do special slicing of input_ids, so we don't need to do it here
        - Exception 3: with synced GPUs cache_position may go out of bounds, but we only want dummy token in that case.
        - Exception 4: If input_embeds are passed then slice it through `cache_position`, to keep only the unprocessed tokens and
          generate the first token for each sequence. Later use the generated Input ids for continuation.
    
        The current implementation does not rely on ``self`` and could be
        a class method. It is left as a standard method to be easily rewritten.
        """
        if is_torchdynamo_exporting():
            return self._cache_dependant_input_preparation_exporting(input_ids, inputs_embeds, cache_position)
        if inputs_embeds is not None and input_ids.shape[1] == 0:  # Exception 4
            inputs_embeds = inputs_embeds[:, -cache_position.shape[0] :]
        elif (
            inputs_embeds is not None  # Exception 1
>           or (cache_position[-1] >= input_ids.shape[1])  # Exception 3
                ^^^^^^^^^^^^^^^^^^
        ):
E       TypeError: 'NoneType' object is not subscriptable

.venv/lib/python3.12/site-packages/transformers/generation/utils.py:509: TypeError
```
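The trace shows `cache_position = None` reaching the `cache_position[-1] >= input_ids.shape[1]` check, which is what raises the `TypeError`. The sketch below reproduces that failure mode in plain Python and shows one possible guard: supplying an explicit `cache_position`. The names `exceeds_input_length` and `default_cache_position` are hypothetical helpers written for illustration; they are not part of `transformers` or `trl`, and this is not necessarily how the issue was resolved upstream.

```python
from typing import Optional, Sequence


def exceeds_input_length(cache_position: Optional[Sequence[int]], seq_len: int) -> bool:
    """Hypothetical stand-in for the failing check:
    `cache_position[-1] >= input_ids.shape[1]`."""
    return cache_position[-1] >= seq_len


def default_cache_position(seq_len: int, past_len: int = 0) -> list[int]:
    """Hypothetical helper: positions of the tokens not yet in the cache."""
    return list(range(past_len, seq_len))


input_ids = [[1, 2, 3, 4, 5]]  # same token ids as in the failing test
seq_len = len(input_ids[0])

# Reproduce the failure mode: cache_position is None, so indexing it raises.
try:
    exceeds_input_length(None, seq_len)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Possible guard: build an explicit cache_position before the check runs.
cache_position = default_cache_position(seq_len)
result = exceeds_input_length(cache_position, seq_len)

print(error_message)
print(cache_position, result)
```

In the test itself, the analogous change would presumably be to pass a tensor such as `torch.arange(input_ids.shape[1], device=input_ids.device)` as `cache_position` to `prepare_inputs_for_generation`; that is an assumption and has not been verified against the dev version of `transformers`.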

CI fails with dev dependencies: TypeError: 'NoneType' object is not subscriptable #4272
