[BEAM-7806] Revert "[BEAM-7785] synchronize watermark update with bundle processing"#9151
Conversation
This reverts commit 84ef8aa. After this commit, org.apache.beam.sdk.extensions.sql.meta.provider.pubsub.PubsubJsonIT.testSelectsPayloadContent began failing. Rolling back to restore green/red signal while we deduce whether the problem is with the test or this change.
|
CC: @je-ik As according to https://beam.apache.org/contribute/postcommits-policies-details/index.html#rollback_first I am rolling back first. And then we can figure out what needs to be done. |
|
run sql postcommit |
|
Noting from the Jira:
|
|
+1 to rollback first. Though it's not obvious how this change would directly cause the test to fail. The failing run seems to be unable to get the result signal within a timeout, is there a chance that it's a flake? (I recognize it's not likely given the scans :) The test itself I think is also relatively flaky and complicated and is probably full of race conditions, so depending on what makes it pass/fail this change can be either fixing something that was wrong and allowed the test to accidentally pass, or the test accidentally exercises some edge case that this change breaks, unclear. However, there is also another test that is |
|
No, it is not a flake. It went perma-red after the commit for many runs, and then I could reproduce it locally every time. |
|
But we should definitely consider that it could be the test that is broken. However, I commented on the other PR that the way to get consistency is not through locks but by immutable snapshots. It could even be that it just created a major performance problem that took more than the timeout. TBH I am not sure. |
|
Given that post commit back to green now: https://builds.apache.org/job/beam_PostCommit_SQL/, I would think that watermark changes break the test, or at least has side effect on that. |
|
Yes, the PR accidentally caused deadlock in the test. Synchronizing the watermark update unintentionally limited parallelism of bundle processing to 1. This didn't cause any issues with existing tests or validates runner, but affected the integration test, because it performs blocking IO (and depends on another bundle to be unblocked). I have added test case to direct runner that exhibits this behavior and will fix the original issue without limiting the parallelism. |
This reverts commit 84ef8aa.
After this commit, org.apache.beam.sdk.extensions.sql.meta.provider.pubsub.PubsubJsonIT.testSelectsPayloadContent began failing. Rolling back to restore green/red signal while we deduce whether the problem is with the test or this change.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username).[BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replaceBEAM-XXXwith the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.Post-Commit Tests Status (on master branch)
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.