Silent failure in generation parameters  #33690

@Manalelaidouni

System Info

  • transformers version: 4.44.2
  • Platform: Linux-6.1.85+-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.24.7
  • Safetensors version: 0.4.5
  • Accelerate version: 0.34.2
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.4.1+cu121 (False)
  • Tensorflow version (GPU?): 2.17.0 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.8.5 (cpu)
  • Jax version: 0.4.33
  • JaxLib version: 0.4.33
  • Using distributed or parallel set-up in script?: No

Who can help?

@zucchini-nlp @gante

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Hey!

I noticed that top_p was failing silently, so I tested the rest of the generation parameters and found that no_repeat_ngram_size also fails silently for the same reason: the condition checks inside the _get_logits_processor() method prevent the respective wrapper classes from being instantiated, and those classes are where the ValueErrors are raised.

For instance, the check raise ValueError(f"`ngram_size` has to be a strictly positive integer, but is {ngram_size}") is never reached when no_repeat_ngram_size <= 0.
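To make the mechanism concrete, here is a minimal, library-free sketch of the pattern described above. NGramProcessor and build_processors are hypothetical stand-ins that I made up for illustration, not the actual transformers code; the guard condition is my assumption of what the check roughly looks like.

```python
# Simplified stand-ins, NOT the transformers source. All names are hypothetical.


class NGramProcessor:
    """Stand-in for a wrapper class that validates its argument on construction."""

    def __init__(self, ngram_size):
        if not isinstance(ngram_size, int) or ngram_size <= 0:
            raise ValueError(
                f"`ngram_size` has to be a strictly positive integer, but is {ngram_size}"
            )
        self.ngram_size = ngram_size


def build_processors(no_repeat_ngram_size=None):
    """Stand-in for the method that assembles the processor list."""
    processors = []
    # Assumed guard: the wrapper is only built for values > 0, so an invalid
    # value such as -1 skips construction and the ValueError above never runs.
    if no_repeat_ngram_size is not None and no_repeat_ngram_size > 0:
        processors.append(NGramProcessor(no_repeat_ngram_size))
    return processors


# The invalid value is dropped without any error or warning.
silently_ignored = build_processors(no_repeat_ngram_size=-1)
print(silently_ignored)
```

The same value passed directly to NGramProcessor(-1) would raise, which is why the guard effectively hides the validation.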

Here is a simple example with invalid values, where generation proceeds without notifying the user. Ideally, these values should raise errors or warnings.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-14m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
prompt = "hey there!"
inputs = tokenizer(prompt, return_tensors="pt")
# Both values are invalid: top_p is above 1 and no_repeat_ngram_size is negative.
generation_config = dict(do_sample=True, top_p=5, no_repeat_ngram_size=-1)

# Generation runs to completion without raising an error or a warning.
outputs = model.generate(input_ids=inputs['input_ids'], attention_mask=inputs['attention_mask'], **generation_config)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

Expected behavior

  • Invalid values should not fail silently and let generation proceed with a default value; instead, a ValueError should be raised or a warning issued to the user.

  • It would be great if the generate method could fail early when invalid values are passed, for example by checking them upfront in _get_logits_processor before the generation parameters are applied one by one. This would avoid wasting compute on the rest of the process.
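As a rough sketch of what such an upfront check could look like: validate_generation_params below is a hypothetical helper, not an existing transformers function. The accepted ranges (top_p in [0, 1], no_repeat_ngram_size a strictly positive integer) and the choice to treat None as "not set" are my assumptions; the library's real defaults may need different handling.

```python
def validate_generation_params(top_p=None, no_repeat_ngram_size=None):
    """Hypothetical upfront check: collect every problem, then raise once.

    None means "parameter not set" in this sketch (an assumption).
    """
    problems = []
    # Assumed valid range for top_p: the closed interval [0, 1].
    if top_p is not None and not 0.0 <= top_p <= 1.0:
        problems.append(f"`top_p` must lie in [0, 1], but is {top_p}")
    # Assumed rule: when set, the n-gram size must be a strictly positive integer.
    if no_repeat_ngram_size is not None and (
        not isinstance(no_repeat_ngram_size, int) or no_repeat_ngram_size <= 0
    ):
        problems.append(
            "`no_repeat_ngram_size` must be a strictly positive integer, "
            f"but is {no_repeat_ngram_size}"
        )
    if problems:
        raise ValueError("Invalid generation parameters: " + "; ".join(problems))


# Same invalid values as in the reproduction; both are reported in one error.
message = None
try:
    validate_generation_params(top_p=5, no_repeat_ngram_size=-1)
except ValueError as err:
    message = str(err)
print(message)
```

Collecting all problems before raising means the user sees every invalid parameter at once, before any model forward pass happens.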

I would be happy to open a PR to help address this issue if that’s possible, thank you for all your work!
