[Fixit][CI] Retry external DNS lookups on transient NOT_FOUND in cf_engine_test #42185
Closed
pawbhard wants to merge 4 commits into
Conversation
…ngine_test

TestResolveRemote, TestResolveIPv4Remote, and TestResolveIPv6Remote depend on external DNS services (localtest.me, nip.io, sslip.io). On Mac CI pool machines these lookups occasionally fail with kNotFound when the upstream resolver cannot reach the authoritative DNS servers, which makes the tests flaky.

Add a LookupWithRetry helper that retries up to 3 times on kNotFound. If all attempts fail, the test is skipped rather than failed, since the failure reflects infrastructure unavailability, not a code regression. Retrying only on kNotFound is safe: DNSServiceResolverImpl produces that status code only when the DNS server responds with NXDOMAIN for both A and AAAA. Bugs in the resolver itself map to kUnknown and will still surface as failures.
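The retry policy described above can be sketched roughly as follows. This is not the helper from the PR: `StatusCode`, `LookupResult`, `LookupFn`, and `kMaxAttempts` are hypothetical stand-ins so the example compiles on its own, and it assumes "up to 3 times" means three attempts in total. What the caller does once the attempts are exhausted (skip or fail) is left out.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the status codes named in the description;
// the real test would use the resolver's own status type.
enum class StatusCode { kOk, kNotFound, kUnknown };

struct LookupResult {
  StatusCode code;
  std::vector<std::string> addresses;
};

// Hypothetical lookup callback: resolves `host` and reports a status.
using LookupFn = std::function<LookupResult(const std::string& host)>;

// Assumption: three attempts in total.
constexpr int kMaxAttempts = 3;

// Calls `lookup` until it returns something other than kNotFound or the
// attempts run out. Any other status (kOk, kUnknown, ...) is returned
// immediately, so resolver bugs that map to kUnknown are not retried.
LookupResult LookupWithRetry(const LookupFn& lookup, const std::string& host,
                             int max_attempts = kMaxAttempts) {
  LookupResult result{StatusCode::kUnknown, {}};
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    result = lookup(host);
    if (result.code != StatusCode::kNotFound) return result;
  }
  return result;  // Still kNotFound after the last attempt.
}
```

A caller would wrap the existing lookup in a `LookupFn` and inspect the returned `code`; only a persistent kNotFound reaches the final `return`.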
rishesh007 approved these changes on Apr 20, 2026
…_engine_test

After exhausting retries, fail the test with ASSERT_TRUE rather than silently skipping it, so that persistent DNS infrastructure issues are surfaced rather than hidden.
rishesh007 approved these changes on Apr 20, 2026
asheshvidyut pushed a commit to asheshvidyut/grpc that referenced this pull request on Apr 23, 2026
…ngine_test (grpc#42185)

Closes grpc#42185
COPYBARA_INTEGRATE_REVIEW=grpc#42185 from pawbhard:retry_dns 36bfcfd
PiperOrigin-RevId: 902619277
asheshvidyut pushed a commit to a-detiste/grpc that referenced this pull request on Jun 10, 2026
…ngine_test (grpc#42185)

Closes grpc#42185
COPYBARA_INTEGRATE_REVIEW=grpc#42185 from pawbhard:retry_dns 36bfcfd
PiperOrigin-RevId: 902619277