Train on responses only does not work with TinyLlama-chat #1015

@akhlakm

The following error occurs while using train_on_responses_only on the unsloth/tinyllama-chat-bnb-4bit model.

/usr/local/lib/python3.10/dist-packages/unsloth/chat_templates.py in <listcomp>(.0)
   1714     substring = _longest_common_substring([str(x + [0]) for x in all_input_ids])
   1715     substring = substring.split(", ")[:-1]
-> 1716     substring = [int(x) for x in substring]
   1717 
   1718     # Also get rest of tokenized string

ValueError: invalid literal for int() with base 10: ''
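
One way the two parsing lines in this traceback can produce `int('')` is when the common substring starts at a `", "` separator, so the first piece after `split` is empty. The sketch below reproduces only that string handling in plain Python. The helper names and token IDs are made up for illustration, and `parse_ids_skip_empty` is a hypothetical workaround, not Unsloth's actual fix.

```python
def parse_ids_strict(substring: str) -> list[int]:
    # Mirrors the split and int() conversion shown in the traceback.
    parts = substring.split(", ")[:-1]
    return [int(x) for x in parts]


def parse_ids_skip_empty(substring: str) -> list[int]:
    # Hypothetical variant: ignore empty pieces left by a leading separator.
    parts = substring.split(", ")[:-1]
    return [int(x) for x in parts if x]


# Arbitrary example: a common substring that begins with ", ".
common = ", 529, 29989, "

try:
    parse_ids_strict(common)
except ValueError as err:
    print("strict parse failed:", err)

print(parse_ids_skip_empty(common))  # [529, 29989]
```

Skipping empty pieces only hides this one symptom. If the common substring cuts through the middle of a token ID, the recovered IDs could still be wrong.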

Link to the test notebook: https://colab.research.google.com/gist/akhlakm/c7c40b0c29d112f2544168be42d3410b/llama-3-1-8b-conversational-unsloth-2x-faster-finetuning.ipynb

Separately, when I use the chat template defined in the model's tokenizer_config.json, training with train_on_responses_only fails with a different error:

trainer_stats = trainer.train()
                    ^^^^^^^^^^^^^^^
  File "<string>", line 145, in train
  File "<string>", line 320, in _fast_inner_training_loop
  File "/home/user/unsloth_env/lib/python3.11/site-packages/accelerate/data_loader.py", line 550, in __iter__
    current_batch = next(dataloader_iter)
                    ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/unsloth_env/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "/home/user/unsloth_env/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 673, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/unsloth_env/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
    return self.collate_fn(data)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/unsloth_env/lib/python3.11/site-packages/transformers/data/data_collator.py", line 45, in __call__
    return self.torch_call(features)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/unsloth_env/lib/python3.11/site-packages/transformers/data/data_collator.py", line 806, in torch_call
    batch = pad_without_fast_tokenizer_warning(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/unsloth_env/lib/python3.11/site-packages/transformers/data/data_collator.py", line 66, in pad_without_fast_tokenizer_warning
    padded = tokenizer.pad(*pad_args, **pad_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/unsloth_env/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 3560, in pad
    return BatchEncoding(batch_outputs, tensor_type=return_tensors)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/unsloth_env/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 227, in __init__
    self.convert_to_tensors(tensor_type=tensor_type, prepend_batch_axis=prepend_batch_axis)
  File "/home/user/unsloth_env/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 778, in convert_to_tensors
    raise ValueError(
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`labels` in this case) have excessive nesting (inputs type `list` where type `int` is expected).
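
Since the message points at `labels`, a quick check of a few rows before training may help narrow this down. The helper below is a hypothetical diagnostic, not part of Unsloth or transformers. It assumes each row is a plain dict holding `input_ids` and `labels` as Python lists, and it flags rows where `labels` is nested or differs in length from `input_ids`.

```python
from typing import Any, Iterable


def find_bad_labels(rows: Iterable[dict[str, Any]]) -> list[tuple[int, str]]:
    """Return (row index, problem) pairs for rows with suspicious labels."""
    problems = []
    for i, row in enumerate(rows):
        labels = row.get("labels")
        input_ids = row.get("input_ids")
        if not isinstance(labels, list):
            problems.append((i, "labels is not a list"))
        elif any(isinstance(x, (list, tuple)) for x in labels):
            # Matches the "excessive nesting" hint in the error message.
            problems.append((i, "labels is nested"))
        elif isinstance(input_ids, list) and len(labels) != len(input_ids):
            problems.append((i, "labels and input_ids differ in length"))
    return problems


# Made-up rows for illustration; -100 marks positions ignored by the loss.
rows = [
    {"input_ids": [1, 2, 3], "labels": [-100, 2, 3]},
    {"input_ids": [1, 2], "labels": [[-100, 2]]},
    {"input_ids": [1, 2], "labels": [-100]},
]
for index, problem in find_bad_labels(rows):
    print(index, problem)
```

In practice the rows would come from the first few items of the processed training dataset rather than a hand-written list.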
