[Task]: Spark runner flatMap output should not be required to fit in the memory

### What needs to happen?

Currently on Spark runner, if single processElement call produces multiple output elements, they all needs to fit in the memory [1]. This is problematic e.g. for `ParquetIO`, which instead of `Source<>` based reads uses `DoFn` and let reader from inside DoFn push all elements to the output. Similar happens with `JdbcIO` and was discussed here [2].

The goal is to overcome this constraint and allow to produce large output from DoFn on Spark runner. 


[1] https://github.com/apache/beam/blob/v2.39.0/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkProcessContext.java#L125

[2] https://www.mail-archive.com/dev@beam.apache.org/msg16806.html

### Issue Priority

Priority: 2

### Issue Component

Component: runner-spark

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Task]: Spark runner flatMap output should not be required to fit in the memory #23852

What needs to happen?

Issue Priority

Issue Component

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Task]: Spark runner flatMap output should not be required to fit in the memory #23852

Description

What needs to happen?

Issue Priority

Issue Component

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions