Conversation
|
@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
| return new_batch | ||
|
|
||
|
|
||
| @functional_datapipe("async_map_batches") |
There was a problem hiding this comment.
There was a problem hiding this comment.
Yeah, any naming suggestion is welcome
| self, | ||
| source_datapipe: IterDataPipe, | ||
| async_fn: Callable, | ||
| input_col=None, |
There was a problem hiding this comment.
curious if it will becomes something like input_cols/output_cols or it always has to be single column?
There was a problem hiding this comment.
Current state:
input_colaccepts multiple elementsoutput_colis not supported
NivekT
left a comment
There was a problem hiding this comment.
LGTM!
Question:
- Will we add a
AsyncMapperor this will be all? It seems like it will be mostly be the same except maybe instead of asking forbatch_size, automatically set it tobatch_size = max-concurrency. - Providing some async options for IO DataPipes will be nice. Or somehow use this DataPipe in combination with those as an example.
I am not sure. Probably not until there is a solid request. Rather than setting
It's hard to combine with other DataPipes because this DataPipe takes async function directly rather than relying on if prior DataPipe has an async operation. |
|
@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Changes
BatchAsyncMapperto support async processingbatch().async_map().flatmap()input_col/output_colare added to align the behavior toMapper