GenerationConfig.from_pretrained raises ValueError after training; maybe raise it earlier?

### System Info

- `transformers` version: 4.38.2
- Platform: Linux-4.18.0-305.3.1.el8.x86_64-x86_64-with-glibc2.28
- Python version: 3.10.13
- Huggingface_hub version: 0.21.4
- Safetensors version: 0.4.2
- Accelerate version: 0.28.0
- Accelerate config:
    - compute_environment: LOCAL_MACHINE
    - distributed_type: MULTI_GPU
    - mixed_precision: no
    - use_cpu: False
    - debug: False
    - num_processes: 8
    - machine_rank: 0
    - num_machines: 1
    - gpu_ids: all
    - rdzv_backend: static
    - same_network: True
    - main_training_function: main
    - downcast_bf16: no
    - tpu_use_cluster: False
    - tpu_use_sudo: False
    - tpu_env: []
- PyTorch version (GPU?): 2.2.1 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>

### Who can help?

@gante   @pacman100   @muellerzr 

**Raise Errors as Early as Possible**: I noticed that `GenerationConfig.save_pretrained` in `transformers/generation/configuration_utils.py` raises a `ValueError` if the config fails validation. I think it would be better to raise the error earlier (e.g., after `self.validate` in `__init__`) instead of raising it in `Trainer._save`. As it stands, users can train for several hours and only then find out that the model checkpoint was not saved.
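To make the suggestion concrete, here is a minimal sketch of the fail-fast idea. `ToyGenerationConfig`, its `strict` flag, and the validation rule are all hypothetical and only for illustration; this is not the actual `transformers` implementation.

```python
class ToyGenerationConfig:
    """Hypothetical stand-in for a generation config (NOT the transformers class)."""

    def __init__(self, do_sample=False, temperature=1.0, strict=True):
        self.do_sample = do_sample
        self.temperature = temperature
        if strict:
            # Proposed behaviour: fail at construction time, before any training.
            self.validate()

    def validate(self):
        # Assumed rule for illustration: a sampling-only parameter is set
        # while sampling is disabled.
        if not self.do_sample and self.temperature != 1.0:
            raise ValueError(
                f"`temperature` is set to {self.temperature} but `do_sample` is False."
            )

    def save(self):
        # Behaviour being criticised: the error only surfaces at save time.
        self.validate()
        return {"do_sample": self.do_sample, "temperature": self.temperature}
```

With `strict=True` the inconsistent config is rejected as soon as it is created; with `strict=False` the same config is accepted and only fails later, when `save()` is called.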

For example, finetuning [LLaVA](https://github.com/haotian-liu/LLaVA) raises this error. [Issue#1252](https://github.com/haotian-liu/LLaVA/issues/1252#issue) and [issue#1144](https://github.com/haotian-liu/LLaVA/issues/1144#issue) report the same problem.


Please correct me if I am wrong. Thanks!

### Information

- [ ] The official example scripts
- [X] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [X] My own task or dataset (give details below)

### Reproduction

Maybe no reproduction is necessary.

1. Install [LLaVA](https://github.com/haotian-liu/LLaVA) by following the [guide](https://github.com/haotian-liu/LLaVA#install).
2. Train and finetune the model by following the [guide](https://github.com/haotian-liu/LLaVA#train). [Issue#1252](https://github.com/haotian-liu/LLaVA/issues/1252#issue) and [issue#1144](https://github.com/haotian-liu/LLaVA/issues/1144#issue) also include the training script.

### Expected behavior

Raise the `ValueError` before training starts if the `PretrainedConfig` fails validation.
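In the meantime, a user-side check along these lines might surface the error before training. This is only a sketch: `preflight_save` is a hypothetical helper name, and it assumes nothing beyond the config object exposing `save_pretrained(save_directory)`.

```python
import tempfile


def preflight_save(config):
    """Try saving `config` to a throwaway directory before training starts.

    Assumes only that `config` has a `save_pretrained(save_directory)` method.
    Any exception it raises (e.g. a ValueError from validation) propagates
    to the caller, so the failure shows up before hours of training.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        config.save_pretrained(tmp_dir)
```

Something like `preflight_save(model.generation_config)` could then be called right before `trainer.train()`. Whether this catches every failure that `Trainer._save` would hit is an assumption I have not verified.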

GenerationConfig.from_pretrained raise ValueError after training, maybe raise it earlier? #29665

System Info

Who can help?

Information

Tasks

Reproduction

Expected behavior

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

GenerationConfig.from_pretrained raise ValueError after training, maybe raise it earlier? #29665

Description

System Info

Who can help?

Information

Tasks

Reproduction

Expected behavior

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions