Skip to content

Add generation config validation using Pydantic#35910

Closed
Manalelaidouni wants to merge 4 commits into
huggingface:mainfrom
Manalelaidouni:add-pydantic-config-validation
Closed

Add generation config validation using Pydantic#35910
Manalelaidouni wants to merge 4 commits into
huggingface:mainfrom
Manalelaidouni:add-pydantic-config-validation

Conversation

@Manalelaidouni

@Manalelaidouni Manalelaidouni commented Jan 27, 2025

Copy link
Copy Markdown
Contributor

What does this PR do?

Follow up to #34726

The goal of this PR is to add validation for generation config using Pydantic to fail early when invalid values are passed and not let things fail silently as suggested in issue #33690.

  • Adding extra="allow” in Pydantic ConfigDict replaces the code responsible for setting additional attributes without default values.
  • Moved individual validation logic from validate() into the constructor.
  • Sneaked in sequence_bias docstring correction because it’s still referencing the older dictionary format which is confusing for users.
  • Moved classes to avoid the Undefined name error by Ruff.
  • Added Pydantic Import Error to require having Pydantic in env while specifying min version Pydantic V2.0.

Who can review?

@ArthurZucker @zucchini-nlp

@Rocketknight1

Copy link
Copy Markdown
Member

cc @gante for generation config as well

@ArthurZucker

Copy link
Copy Markdown
Collaborator

cc @LysandreJik as well!

@gante gante left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a first glance, big yes from me. It becomes immediately clear which values are allowed in each attribute, except for combinations of values.

All my comments are due to being inexperienced with pydantic, otherwise LGTM

Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py
@Manalelaidouni

Copy link
Copy Markdown
Contributor Author

Most of the failing tests seem unrelated, will fix the unprotected import.

@zucchini-nlp zucchini-nlp left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice PR, cleaner validation of config is much needed. I just left a couple questions, just to be sure I understand it correctly

Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py
Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py
Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py
Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py Outdated
Comment thread src/transformers/generation/configuration_utils.py Outdated
@gante

gante commented Feb 20, 2025

Copy link
Copy Markdown
Contributor

@Manalelaidouni added a bunch of comments with more precise limits :)

LMK if you need help getting the CI green ✅

@Manalelaidouni

Manalelaidouni commented Feb 21, 2025

Copy link
Copy Markdown
Contributor Author

When working on the improtected import, I implemented the lazy code logic for Pydantic in the main init and the generation module’s init, however the dummy routing only works when importing from transformers top level and not the generation module’s level, because there are other non-pydantic dependent objects that directly import from configuration_utils.py (e.g, NEED_SETUP_CACHE_CLASSES_MAPPING, QUANT_BACKEND_CLASSES_MAPPING, GenerationMode, etc) and bypass the lazy loading so configuration_utils.py is always executed.

This means I either need to move GenerationConfig to a seperate module so it won’t force a direct import, or guard GenerationConfig inside if is_pydantic_available and make all GenerationConfig imports from the main init instead of relative imports :

if is_pydantic_available():
     from pydantic import BaseModel, ConfigDict, PrivateAttr, confloat, conint, model_validator
	  class GenerationConfig(BaseModel, PushToHubMixin):
	       ...

Although, this means the nice import error msg in requires_backend won’t be triggered because the dummy GenerationConfig won’t be actually instantiated.

Thank you guys and let me know what you think before I proceed :)

@gante

gante commented Feb 21, 2025

Copy link
Copy Markdown
Contributor

@Manalelaidouni since the GenerationConfig object will now depend entirely on pydantic, I think we can promote pydantic to a mandatory requirement and get rid of the import guards. Let me confirm internally.

@gante

gante commented Feb 21, 2025

Copy link
Copy Markdown
Contributor

@Manalelaidouni We (me, @LysandreJik, and @ArthurZucker) discussed what to do regarding install requirements internally. Adding a mandatory requirement to a popular library like transformers is not a decision we should make lightly; it may place unwanted constraints on many projects.

That being said, we want to add more validation to certain classes (GenerationConfig, PretrainedConfig, processors, ...), and pydantic seems like the right tool for it. If we don't make it a mandatory requirement, we'll have to add an import guard in most files in our library, which is also unwanted.

Here's what we're going to do: I'm going to open a GH issue (and share it on social media) to see what the community thinks and to try to anticipate problems. Then, in about a week, let's make an informed decision 🤗

@gante

gante commented Feb 28, 2025

Copy link
Copy Markdown
Contributor

Hey @Manalelaidouni 👋

Thank you for holding this discussion with us -- we've decided no NOT move forward with pydantic. See my comment here 🤗

@Wauplin

Wauplin commented Feb 28, 2025

Copy link
Copy Markdown
Contributor

I opened huggingface/huggingface_hub#2895 as a suggestion on how to move forward with data validation in the HF ecosystem. Open to feedback and subject to change based on community requirements 🤗

@Manalelaidouni

Copy link
Copy Markdown
Contributor Author

Closing this one since the new solution is being implemented.

@chenbinghui1

Copy link
Copy Markdown

hi
my transformers is 4.57.3. when i using

 model = Qwen2ForCausalLM.from_pretrained(
            llm_path,
            config=llm_config,
            torch_dtype=torch.bfloat16,
        )

I encountered the problem:

   `cannot import name 'NEED_SETUP_CACHE_CLASSES_MAPPING' from 'transformers.generation.configuration_utils'`

how can i solve this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants