Description
System Info
The system is google colab, transformers related packages are installed from git.
- `transformers` version: 4.32.0.dev0
- Platform: Linux-5.15.109+-x86_64-with-glibc2.35
- Python version: 3.10.6
- Huggingface_hub version: 0.16.4
- Safetensors version: 0.3.1
- Accelerate version: 0.22.0.dev0
- Accelerate config: not found
- PyTorch version (GPU?): 2.0.1+cu118 (True)
- Tensorflow version (GPU?): 2.12.0 (True)
- Flax version (CPU?/GPU?/TPU?): 0.7.0 (gpu)
- Jax version: 0.4.13
- JaxLib version: 0.4.13
- Using GPU in script?: yes
- Using distributed or parallel set-up in script?: using one GPU
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
!pip install -q datasets
!pip install git+https://github.com/microsoft/LoRA
!pip install git+https://github.com/huggingface/accelerate.git
!pip install -q git+https://github.com/huggingface/peft.git
!pip install -q git+https://github.com/huggingface/transformers.git
!pip install -i https://test.pypi.org/simple/ bitsandbytes
!pip install -q sentencepiece
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
import torch.nn as nn
import bitsandbytes as bnb
from transformers import AutoTokenizer, AutoConfig, AutoModelForCausalLM
from peft import AutoPeftModelForCausalLM
MODEL_NAME = <some lora llama2 checkpoint>
model = AutoPeftModelForCausalLM.from_pretrained(
    MODEL_NAME,
    device_map='auto',
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    is_trainable=True,
)
class CastOutputToFloat(nn.Sequential):
    def forward(self, x):
        return super().forward(x).to(torch.float32)
model.lm_head = CastOutputToFloat(model.lm_head)
for param in model.parameters():
    if param.ndim == 1:
        # cast the small parameters (e.g. layernorm) to fp32 for stability
        param.data = param.data.to(torch.float32)
model.gradient_checkpointing_enable()
model.enable_input_require_grads()
from datasets import load_dataset
qa_dataset = load_dataset("squad_v2")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)  # the map call below uses this tokenizer
def create_prompt(context, question, answer):
    if len(answer["text"]) < 1:
        answer = "Cannot Find Answer"
    else:
        answer = answer["text"][0]
    prompt_template = f"### CONTEXT\n{context}\n\n### QUESTION\n{question}\n\n### ANSWER\n{answer}</s>"
    return prompt_template
mapped_qa_dataset = qa_dataset.map(lambda samples: tokenizer(create_prompt(samples['context'], samples['question'], samples['answers'])))
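For reference, here is the prompt builder reproduced standalone (with made-up context/question strings rather than real SQuAD v2 records), showing both the answered branch and the unanswerable branch, where SQuAD v2 stores an empty "text" list:

```python
def create_prompt(context, question, answer):
    # SQuAD v2 marks unanswerable questions with an empty "text" list
    if len(answer["text"]) < 1:
        answer = "Cannot Find Answer"
    else:
        answer = answer["text"][0]
    return f"### CONTEXT\n{context}\n\n### QUESTION\n{question}\n\n### ANSWER\n{answer}</s>"

answered = create_prompt("Paris is in France.", "Where is Paris?", {"text": ["France"]})
unanswerable = create_prompt("Paris is in France.", "Where is Berlin?", {"text": []})
```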
import transformers
train_args = transformers.TrainingArguments(
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    warmup_steps=100,
    max_steps=100,
    learning_rate=1e-3,
    fp16=True,
    logging_steps=1,
    output_dir='outputs',
)
trainer = transformers.Trainer(
    model=model,
    train_dataset=mapped_qa_dataset["train"],
    args=train_args,
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

Trainer init crashes here:
IndexError Traceback (most recent call last)
[<ipython-input-114-29de745c4455>](https://localhost:8080/#) in <cell line: 14>()
12 )
13
---> 14 trainer = transformers.Trainer(
15 model=model,
16 train_dataset=mapped_qa_dataset["train"],
[/usr/local/lib/python3.10/dist-packages/transformers/trainer.py](https://localhost:8080/#) in __init__(self, model, args, data_collator, train_dataset, eval_dataset, tokenizer, model_init, compute_metrics, callbacks, optimizers, preprocess_logits_for_metrics)
380 self.is_model_parallel = True
381 else:
--> 382 self.is_model_parallel = self.args.device != torch.device(devices[0])
383
384 # warn users
IndexError: list index out of range
Expected behavior
Trainer object should be constructed correctly.
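As a hedged sketch of where the IndexError seems to come from: just above the quoted trainer.py line 382, that version of Trainer builds a `devices` list from `model.hf_device_map`, keeping only entries that are not "cpu" or "disk". If accelerate offloaded every module (which `device_map='auto'` can do when GPU memory is scarce), the list is empty and `devices[0]` fails. The hypothetical device map below is illustrative, not taken from the actual model:

```python
# Hypothetical device map in which accelerate offloaded every module to
# CPU/disk; the real map depends on available GPU memory at load time.
hf_device_map = {"model.embed_tokens": "cpu", "model.layers.0": "disk"}

# Sketch of the check quoted in the traceback: only non-CPU/disk placements
# are kept, so a fully offloaded model leaves an empty list.
devices = [d for d in set(hf_device_map.values()) if d not in ("cpu", "disk")]

try:
    devices[0]  # this subscript is what raises inside Trainer.__init__
    crashed = False
except IndexError:
    crashed = True
```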