System Info
- transformers version: 4.44.2
- Platform: Linux-6.1.85+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.24.7
- Safetensors version: 0.4.5
- Accelerate version: 0.34.2
- Accelerate config: not found
- PyTorch version (GPU?): 2.4.1+cu121 (False)
- Tensorflow version (GPU?): 2.17.0 (False)
- Flax version (CPU?/GPU?/TPU?): 0.8.5 (cpu)
- Jax version: 0.4.33
- JaxLib version: 0.4.33
- Using distributed or parallel set-up in script?: No
Who can help?
@zucchini-nlp @gante
Information
Tasks
Reproduction
Hey!
I noticed that top_p was silently failing, so I tested the rest of the generation parameters and found that no_repeat_ngram_size also fails silently for the same reason: the condition checks inside the _get_logits_processor() method prevent the respective wrapper classes from being instantiated, and those classes are where the ValueErrors are raised.
For instance, the error raise ValueError(f"`ngram_size` has to be a strictly positive integer, but is {ngram_size}") is never reached when we set no_repeat_ngram_size <= 0.
Here is a simple example with invalid values where generation proceeds without notifying the user. Ideally, these values should raise errors or warnings.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-14m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "hey there!"
inputs = tokenizer(prompt, return_tensors="pt")

# Both values below are invalid: top_p=5 and no_repeat_ngram_size=-1.
generation_config = dict(do_sample=True, top_p=5, no_repeat_ngram_size=-1)

# Generation completes without raising an error or emitting a warning.
outputs = model.generate(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"], **generation_config)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
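To make the mechanism described above easier to see in isolation, here is a minimal sketch that does not use transformers at all. The names NoRepeatNGramCheck and build_processors are hypothetical stand-ins, and the shape of the guard is my assumption about the pattern, not a copy of the library source. It only illustrates how a guard placed in front of a validating constructor keeps the ValueError from ever being raised.

```python
class NoRepeatNGramCheck:
    """Hypothetical stand-in for a processor class that validates its argument."""

    def __init__(self, ngram_size):
        if not isinstance(ngram_size, int) or ngram_size <= 0:
            raise ValueError(
                f"`ngram_size` has to be a strictly positive integer, but is {ngram_size}"
            )
        self.ngram_size = ngram_size


def build_processors(no_repeat_ngram_size=None):
    """Hypothetical builder: the guard skips the constructor for values <= 0."""
    processors = []
    # Assumed guard shape: the processor is only created for positive values,
    # so an invalid value never reaches the validation in __init__.
    if no_repeat_ngram_size is not None and no_repeat_ngram_size > 0:
        processors.append(NoRepeatNGramCheck(no_repeat_ngram_size))
    return processors


# The invalid value is dropped silently: no processor, no error.
skipped = build_processors(no_repeat_ngram_size=-1)
print(len(skipped))

# Calling the constructor directly does raise.
try:
    NoRepeatNGramCheck(-1)
except ValueError as err:
    print(err)
```

In this sketch the validation inside the constructor is effectively dead code for every value the guard filters out, which matches the behaviour reported above.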
Expected behavior
- Do not let invalid values fail silently and proceed with a default value; instead, raise a ValueError or issue a warning to the user.
- It would be great if the generate method could fail early when invalid values are passed, maybe by checking for them upfront in _get_logits_processor before applying the generation parameters one by one and going through the entire process. This would help avoid wasting compute resources.
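As a rough illustration of the fail-early idea, something along these lines could run before any processor is built. This is only a sketch: validate_sampling_args is a hypothetical helper, not an existing transformers function, and the accepted ranges (top_p in (0, 1], no_repeat_ngram_size a positive integer) are my assumptions. A real fix would also need to decide how to treat default or sentinel values such as a 0 that means "disabled".

```python
def validate_sampling_args(top_p=None, no_repeat_ngram_size=None):
    """Hypothetical upfront check that reports every invalid value at once.

    None is treated as "not set" and is always accepted.
    """
    problems = []

    # Assumed valid range for top_p: 0 < top_p <= 1.
    if top_p is not None and not (0.0 < top_p <= 1.0):
        problems.append(f"`top_p` has to be in (0, 1], but is {top_p}")

    # Assumed rule: when set, no_repeat_ngram_size must be a positive integer.
    if no_repeat_ngram_size is not None and (
        not isinstance(no_repeat_ngram_size, int) or no_repeat_ngram_size <= 0
    ):
        problems.append(
            "`no_repeat_ngram_size` has to be a strictly positive integer, "
            f"but is {no_repeat_ngram_size}"
        )

    if problems:
        raise ValueError("Invalid generation arguments: " + "; ".join(problems))


# Same invalid values as in the reproduction above.
try:
    validate_sampling_args(top_p=5, no_repeat_ngram_size=-1)
except ValueError as err:
    message = str(err)
    print(message)
```

Collecting all problems before raising means the user sees every invalid argument in one error instead of fixing them one at a time.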
I would be happy to open a PR to help address this issue if that’s possible. Thank you for all your work!