default pin_memory_fn with string Sequences and batch_size 1 #1157

@Modexus

🐛 Describe the bug

Pinning a collated batch of size 1 containing a string yields the string's characters as a list instead of the batch containing the string itself, e.g. ['hello'] -> ['h', 'e', 'l', 'l', 'o'].

from torchdata.datapipes.iter import IterableWrapper

dp = IterableWrapper(["hello", "world"])
dp = dp.batch(1)
dp = dp.collate()
dp = dp.pin_memory()

print(next(iter(dp)))
['h', 'e', 'l', 'l', 'o']

This seems to happen because pin_memory_fn on line 30 calls type(data)(*pinned_data), which unpacks the single string and passes it to list(), turning it into its characters. For batch sizes > 1 the same call raises a TypeError, which is caught, so pinned_data is returned unchanged.
Should this line be type(data)(pinned_data) instead?
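A minimal sketch of the suspected behavior, without torchdata: here pinned_data stands in for the already-pinned batch elements (an assumption about what pin_memory_fn holds at that point), and the batch type is a plain list, as in the repro above.

```python
# Suspected buggy reconstruction: the batch is a list with one string.
pinned_data = ["hello"]

# type(data)(*pinned_data) unpacks the lone string into the list()
# constructor, i.e. list("hello"), which iterates over its characters.
broken = type(pinned_data)(*pinned_data)
print(broken)  # ['h', 'e', 'l', 'l', 'o']

# With more than one element the same call raises a TypeError
# (list() takes at most one argument), so a fallback that catches
# the exception would silently return pinned_data unchanged.
try:
    type(["hello", "world"])(*["hello", "world"])
except TypeError as exc:
    print(type(exc).__name__)  # TypeError

# The proposed fix passes the sequence as a single argument,
# i.e. list(["hello"]), which preserves the batch structure.
fixed = type(pinned_data)(pinned_data)
print(fixed)  # ['hello']
```

This matches the symptom: batch size 1 is the only case where the star-unpacking succeeds, and it succeeds with the wrong result.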

Versions

torchdata.version=='0.7.0a0+cacf355'
