Parsing error with --lr-scheduler-kwargs: invalid dict value

### System Info

- `transformers` version: 4.57.0.dev0
- Platform: Linux-5.10.134-008.18.kangaroo.al8.x86_64-x86_64-with-glibc2.39
- Python version: 3.12.7+gc
- Huggingface_hub version: 1.0.0.rc2
- Safetensors version: 0.5.3
- Accelerate version: 1.7.0
- Accelerate config:    not found
- DeepSpeed version: 0.16.9+ali
- PyTorch version (accelerator?): 2.6.0+ali.7.post2.ppu1.5.2.cu126 (CUDA)
- Using distributed or parallel set-up in script?: <fill in>
- Using GPU in script?: <fill in>
- GPU type: PPU-ZW810E

### Who can help?

When using the `--lr-scheduler-kwargs` option with a dict argument, such as `'{"min_lr": 1e-06}'`, the parser reports the following error:

```text
test.py: error: argument --lr_scheduler_kwargs/--lr-scheduler-kwargs: invalid dict value: '{"min_lr": 1e-06}'
```

The error appears to have been introduced in https://github.com/huggingface/transformers/commit/a61fc6a0b985548c681df89c15f8053dafe322bf (title: "Fix typing of train_args"). Parsing works as expected after reverting the type of `lr_scheduler_kwargs` to `Optional[Union[dict[str, Any], str]]`.
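
The wording of the message suggests that argparse is calling the built-in `dict` on the raw string. The minimal sketch below uses plain `argparse` rather than `HfArgumentParser`, on the assumption (not verified against the commit) that the field's `dict` annotation ends up as the argparse `type=` converter. `parse_kwargs` is a hypothetical helper name used only for illustration.

```python
import argparse

RAW = '{"min_lr": 1e-06}'


def parse_kwargs(raw, converter=dict):
    """Parse `--lr-scheduler-kwargs raw` with `converter` as the argparse type."""
    # exit_on_error=False (Python 3.9+) raises ArgumentError instead of exiting.
    parser = argparse.ArgumentParser(prog="test.py", exit_on_error=False)
    parser.add_argument(
        "--lr_scheduler_kwargs", "--lr-scheduler-kwargs", type=converter
    )
    return parser.parse_args(["--lr-scheduler-kwargs", raw]).lr_scheduler_kwargs


try:
    parse_kwargs(RAW)
except argparse.ArgumentError as err:
    # dict(RAW) raises ValueError, which argparse reports as "invalid dict value".
    print(err)
```

If this assumption holds, any field typed as a bare `dict` would reject JSON strings on the command line in the same way.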

### Information

- [x] The official example scripts
- [x] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [x] My own task or dataset (give details below)

### Reproduction

You can simply reproduce it with a short script, such as:

```python
from transformers import TrainingArguments, HfArgumentParser

parser = HfArgumentParser((TrainingArguments,))
training_args, = parser.parse_args_into_dataclasses()
```

and then run it with `python test.py --lr-scheduler-kwargs '{"min_lr": 1e-06}'`

It will report the following error:

```text
usage: test.py [-h] [--output_dir OUTPUT_DIR] [--overwrite_output_dir [OVERWRITE_OUTPUT_DIR]] [--do_train [DO_TRAIN]] [--do_eval [DO_EVAL]]
            [--do_predict [DO_PREDICT]] [--eval_strategy {no,steps,epoch}] [--prediction_loss_only [PREDICTION_LOSS_ONLY]]
            [--per_device_train_batch_size PER_DEVICE_TRAIN_BATCH_SIZE] [--per_device_eval_batch_size PER_DEVICE_EVAL_BATCH_SIZE]
            [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS] [--eval_accumulation_steps EVAL_ACCUMULATION_STEPS] [--eval_delay EVAL_DELAY]
            [--torch_empty_cache_steps TORCH_EMPTY_CACHE_STEPS] [--learning_rate LEARNING_RATE] [--weight_decay WEIGHT_DECAY] [--adam_beta1 ADAM_BETA1]
            [--adam_beta2 ADAM_BETA2] [--adam_epsilon ADAM_EPSILON] [--max_grad_norm MAX_GRAD_NORM] [--num_train_epochs NUM_TRAIN_EPOCHS]
...
            [--eval_use_gather_object [EVAL_USE_GATHER_OBJECT]] [--average_tokens_across_devices [AVERAGE_TOKENS_ACROSS_DEVICES]]
            [--no_average_tokens_across_devices]
test.py: error: argument --lr_scheduler_kwargs/--lr-scheduler-kwargs: invalid dict value: '{"min_lr": 1e-06}'
```

### Expected behavior

The parser should convert the string `'{"min_lr": 1e-06}'` into a dictionary.
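
As a rough illustration of the expected result, the sketch below again uses plain `argparse`, with a hypothetical `json_dict` converter that decodes the argument as JSON. It is not the `transformers` implementation, only one way the string could be turned into a dictionary.

```python
import argparse
import json


def json_dict(raw):
    """Hypothetical argparse converter: JSON object string -> dict."""
    value = json.loads(raw)  # JSONDecodeError is a ValueError subclass
    if not isinstance(value, dict):
        raise ValueError(f"expected a JSON object, got {type(value).__name__}")
    return value


parser = argparse.ArgumentParser(prog="test.py")
parser.add_argument(
    "--lr_scheduler_kwargs", "--lr-scheduler-kwargs", type=json_dict
)
args = parser.parse_args(["--lr-scheduler-kwargs", '{"min_lr": 1e-06}'])
print(args.lr_scheduler_kwargs)
```

Raising `ValueError` for non-object input lets argparse report a normal usage error instead of a traceback.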

Parsing error with --lr-scheduler-kwargs: invalid dict value #41296

System Info

Who can help?

Information

Tasks

Reproduction

Expected behavior

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Parsing error with --lr-scheduler-kwargs: invalid dict value #41296

Description

System Info

Who can help?

Information

Tasks

Reproduction

Expected behavior

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions