`trainer.evaluate` throws an error when using multiple evaluation datasets #24832

### System Info

```
- `transformers` version: 4.31.0.dev0
- Platform: macOS-12.5.1-arm64-arm-64bit
- Python version: 3.8.16
- Huggingface_hub version: 0.14.1
- Safetensors version: 0.3.1
- PyTorch version (GPU?): 2.0.1 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
```

### Who can help?

@sgugger 

### Information

- [X] The official example scripts
- [ ] My own modified scripts

### Tasks

- [X] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [X] My own task or dataset (give details below)

### Reproduction

1. Use the transformers [example code](https://github.com/huggingface/transformers/tree/main/examples/pytorch/summarization) for summarization (or any other task).
2. Pass multiple evaluation datasets as follows when running the script. This should be supported, as [documented here](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.Trainer.eval_dataset).
```
python examples/pytorch/summarization/run_summarization.py \
    --model_name_or_path t5-small \
    --do_train \
    --do_eval \
    --train_file "a.txt" "b.txt" \
    --validation_file "a_valid.txt" "b_valid.txt"
```

### Expected behavior

The example code should not raise an error when `trainer.evaluate` is called, which happens either 1) intermittently during training or 2) [at the end of training](https://github.com/huggingface/transformers/blob/5bb4430edc7df9f9950d412d98bbe505cc4d328b/examples/pytorch/summarization/run_summarization.py#L695).

During training, the Trainer [checks whether the passed `eval_dataset` consists of multiple datasets](https://github.com/huggingface/transformers/blob/5bb4430edc7df9f9950d412d98bbe505cc4d328b/src/transformers/trainer.py#L2216). Since this check is missing in case 2), the evaluation at the end of training, an error is raised there. I'd be happy to make a PR on this :)
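
To make the proposal concrete, here is a minimal sketch of the kind of check that could be applied before the final evaluation. It assumes multiple evaluation datasets are passed as a dictionary mapping a name to a dataset, as the linked documentation describes. `evaluate_all` and `StubTrainer` are hypothetical names used only for illustration; the stub stands in for `transformers.Trainer` so the snippet runs without the library, and it is not the actual fix.

```python
from typing import Any, Dict, Optional


def evaluate_all(trainer: Any, eval_dataset: Optional[Any] = None) -> Dict[str, float]:
    """Hypothetical helper: evaluate a single dataset or a dict of named datasets."""
    if eval_dataset is None:
        eval_dataset = trainer.eval_dataset

    if isinstance(eval_dataset, dict):
        # Several datasets: evaluate each one and prefix its metrics with its name.
        metrics: Dict[str, float] = {}
        for name, dataset in eval_dataset.items():
            metrics.update(
                trainer.evaluate(eval_dataset=dataset, metric_key_prefix=f"eval_{name}")
            )
        return metrics

    # Single dataset: one plain evaluate call.
    return trainer.evaluate(eval_dataset=eval_dataset, metric_key_prefix="eval")


class StubTrainer:
    """Stand-in for transformers.Trainer (assumption: only the parts used here)."""

    def __init__(self, eval_dataset):
        self.eval_dataset = eval_dataset

    def evaluate(self, eval_dataset=None, metric_key_prefix="eval"):
        # Fake metric: report the number of examples as the "loss".
        return {f"{metric_key_prefix}_loss": float(len(eval_dataset))}


if __name__ == "__main__":
    trainer = StubTrainer({"a": [1, 2], "b": [1, 2, 3]})
    print(evaluate_all(trainer))
```

With a real `Trainer`, the same dispatch could presumably replace the direct `trainer.evaluate` call at the end of the example script, but that is untested here.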
