Skip to content

[#29760] Only respond to sampling request while data sampling is enabled#29761

Merged
lostluck merged 2 commits intoapache:masterfrom
zechenj18:only-respond-to-sampling-request-when-enabled
Dec 13, 2023
Merged

[#29760] Only respond to sampling request while data sampling is enabled#29761
lostluck merged 2 commits intoapache:masterfrom
zechenj18:only-respond-to-sampling-request-when-enabled

Conversation

@zechenj18
Copy link
Copy Markdown
Contributor

@zechenj18 zechenj18 commented Dec 13, 2023

Fix to panic caused by handling data sampling request when it's not enabled.

Fixes #29760


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.

@github-actions github-actions bot added the go label Dec 13, 2023
@lostluck
Copy link
Copy Markdown
Contributor

Run Go PostCommit

@codecov
Copy link
Copy Markdown

codecov bot commented Dec 13, 2023

Codecov Report

Attention: 11 lines in your changes are missing coverage. Please review.

Comparison is base (ef0ee76) 38.07% compared to head (7313413) 38.07%.

Files Patch % Lines
sdks/go/pkg/beam/core/runtime/harness/harness.go 0.00% 11 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #29761      +/-   ##
==========================================
- Coverage   38.07%   38.07%   -0.01%     
==========================================
  Files         696      696              
  Lines      101534   101534              
==========================================
- Hits        38656    38655       -1     
- Misses      61263    61264       +1     
  Partials     1615     1615              
Flag Coverage Δ
go 53.64% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

},
}
case req.GetSampleData() != nil:
case req.GetSampleData() != nil && c.dataSampler != nil:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't an ideal fix.

As much as the Runner shouldn't be sending sample requests, this sends "fail" response back to the Runner, which may abort job processing depending on the runner.

It would be better to send "empty" responses with no sampleData in this case instead.

if c.dataSampler != nil {
	return &fnpb.InstructionResponse{
			InstructionId: string(instID),
			Response: &fnpb.InstructionResponse_SampleData{},
		}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, yes, Dataflow is incorrect to be sending these requests, however, the SDK shouldn't send responses that might cause the job to fail.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with empty response when data sampling is not enabled.

@lostluck
Copy link
Copy Markdown
Contributor

"Run Go PostCommit"

@lostluck
Copy link
Copy Markdown
Contributor

Run Go PostCommit

Copy link
Copy Markdown
Contributor

@lostluck lostluck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'll merge in once I validate the postcommit.

@lostluck lostluck changed the title Only respond to sampling request while data sampling is enabled [#29760] Only respond to sampling request while data sampling is enabled Dec 13, 2023
@lostluck
Copy link
Copy Markdown
Contributor

Apparently PostCommit trigger phrases are currently too expensive (fair), so a work around is to add a file to the PR. (The one listed here: https://github.com/apache/beam/blob/master/.github%2Fworkflows%2Fbeam_PostCommit_Go.yml#L22C1-L22C1)

I'll skip that for now since the fix is straightforward. Merging once the precommits are done.

@lostluck lostluck merged commit 19858e9 into apache:master Dec 13, 2023
@lostluck
Copy link
Copy Markdown
Contributor

PostCommit run with this change: https://github.com/apache/beam/actions/runs/7201639105

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Go SDK Dataflow jobs fail on DataSampling disabled

2 participants