Conversation
|
Example run of real streaming jobs with this option - https://github.com/apache/beam/actions/runs/10371405701/job/28711629663 - I added it to the streaming test suite, ran the suite, then removed it after it was clear it worked. |
|
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment |
|
I think there's a secondary fix needed (separate PR) where we just minimize the unnecessary steps submitted in the Dataflow job request when using Runner v2 so the the request does not go over the Dataflow RPC limits (Dataflow with Runner v2 do not use such steps anyways but we just submit them for historical reasons). |
I don't think we need this - upload_graph has never worked with Runner v2 and users don't seem to be running into this problem, so I don't think the problem persists in the v2 case. There's also already some divergence here (e.g. ). There may be other places which need updating, but they at least don't seem problematic at this point |
|
I think these are the cases:
So there might be a corner case for a very large graph that does not choose v1 or v2. In this case, we will automatically turn on |
Note that this will require service changes since I believe currently we use the same Dataflow job request property for both. But agree that if we are not hitting request RPC errors with Runner v2, trimming steps in the the Dataflow job request for Runner v2 might not be a priority. |
|
If the user explicitly requests runner v2, we already clear the steps from the REST API request. |
|
(that was what caused the bug) |
* Fix upload_graph on v2 * compliation nits * compliation nits * remove streaming test change, update CHANGES * mutability fix * Test fix * Remove upload_graph from it
Currently, if
upload_graphis specified we error out when using runner v2. This is exacerbated because we auto-opt in large graphs to this option, so some pipelines can fail without a user even enabling this option.This fixes the problem and does the following:
upload_graphoption for runner v2 pipelines if it is explicitly set.Fixes #32159
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>instead.CHANGES.mdwith noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.