The Concater datapipe takes multiple DPs as input. Is there a class that would take a single datapipe of iterables instead? Something like this:
    class ConcaterIterable(IterDataPipe):
        def __init__(self, source_datapipe):
            self.source_datapipe = source_datapipe

        def __iter__(self):
            for iterable in self.source_datapipe:
                yield from iterable
Basically:
itertools.chain == Concater
itertools.chain.from_iterable == ConcaterIterable
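To make the analogy concrete, here is a small plain-Python illustration. The lists are hypothetical stand-ins for datapipes, and nothing here touches the actual Concater implementation:

```python
from itertools import chain

# Hypothetical stand-ins for datapipes: any iterables will do here.
dp_a = [1, 2]
dp_b = [3, 4]

# Concater-style: several iterables passed as separate arguments.
concat_result = list(chain(dp_a, dp_b))

# ConcaterIterable-style: a single iterable whose items are themselves iterables.
dp_of_iterables = [[1, 2], [3, 4]]
flattened_result = list(chain.from_iterable(dp_of_iterables))
```

Both calls yield the same flat sequence of items; the only difference is whether the inputs arrive as separate arguments or as one iterable of iterables.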
Maybe a neat way of implementing this would be to keep a single Concater class, which would fall back to the ConcaterIterable behaviour if it's passed only one DP as input?
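A rough sketch of what that fallback could look like, written without torch so it runs on its own. `ConcaterSketch` is a hypothetical name used only for illustration, not the existing Concater class, and the dispatch rule is just the one proposed above:

```python
class ConcaterSketch:
    """Hypothetical, torch-free illustration of the single-class fallback idea."""

    def __init__(self, *sources):
        if not sources:
            raise ValueError("expected at least one source")
        self.sources = sources

    def __iter__(self):
        if len(self.sources) == 1:
            # Single input: assume each item is itself an iterable and flatten it
            # (the ConcaterIterable / chain.from_iterable behaviour).
            for iterable in self.sources[0]:
                yield from iterable
        else:
            # Several inputs: yield from each one in turn
            # (the Concater / chain behaviour).
            for source in self.sources:
                yield from source


several_inputs = list(ConcaterSketch([1, 2], [3, 4]))
single_input = list(ConcaterSketch([[1, 2], [3, 4]]))
```

One caveat with this design: it changes what a single-input call means, so any code that passes exactly one datapipe and expects its items to come through unchanged would behave differently. A separate class or an explicit flag would avoid that ambiguity.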
Details: I need this for my benchmarking on Manifold, where each file is a big pickle archive of multiple images, so each item coming out of the pickle loader is itself an iterable that needs to be flattened. My DP builder looks like this:
    def make_manifold_dp(root, dataset_size):
        handler = ManifoldPathHandler()
        dp = IoPathFileLister(root=root)
        dp.register_handler(handler)
        dp = dp.shuffle(buffer_size=dataset_size).sharding_filter()
        dp = IoPathFileOpener(dp, mode="rb")
        dp.register_handler(handler)
        dp = PickleLoaderDataPipe(dp)
        dp = ConcaterIterable(dp)  # <-- Needed here!
        return dp