Describe the bug
streaming dataset with concatenating splits raises an error
Steps to reproduce the bug
from datasets import load_dataset
# no error
repo = "nateraw/ade20k-tiny"
dataset = load_dataset(repo, split="train+validation")
from datasets import load_dataset
# error
repo = "nateraw/ade20k-tiny"
dataset = load_dataset(repo, split="train+validation", streaming=True)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
[<ipython-input-4-a6ae02d63899>](https://localhost:8080/#) in <module>()
3 # error
4 repo = "nateraw/ade20k-tiny"
----> 5 dataset = load_dataset(repo, split="train+validation", streaming=True)
1 frames
[/usr/local/lib/python3.7/dist-packages/datasets/builder.py](https://localhost:8080/#) in as_streaming_dataset(self, split, base_path)
1030 splits_generator = splits_generators[split]
1031 else:
-> 1032 raise ValueError(f"Bad split: {split}. Available splits: {list(splits_generators)}")
1033
1034 # Create a dataset for each of the given splits
ValueError: Bad split: train+validation. Available splits: ['validation', 'train']
Colab
Expected results
load successfully or throws an error saying it is not supported.
Actual results
above
Environment info
datasets version: 2.4.0
- Platform: Windows-10-10.0.22000-SP0 (windows11 x64)
- Python version: 3.9.13
- PyArrow version: 8.0.0
- Pandas version: 1.4.3
Describe the bug
streaming dataset with concatenating splits raises an error
Steps to reproduce the bug
Colab
Expected results
load successfully or throws an error saying it is not supported.
Actual results
above
Environment info
datasetsversion: 2.4.0