Skip to content

fix(cache): prefer positive cache over SERVFAIL in ncache#8003

Merged
yongtang merged 1 commit into
coredns:masterfrom
umut-polat:fix/cache-servfail-shadow
May 20, 2026
Merged

fix(cache): prefer positive cache over SERVFAIL in ncache#8003
yongtang merged 1 commit into
coredns:masterfrom
umut-polat:fix/cache-servfail-shadow

Conversation

@umut-polat

Copy link
Copy Markdown
Contributor

When serve_stale is enabled, a cached SERVFAIL in ncache shadows a valid positive entry in pcache because ncache is always checked first in getIfNotStale. SERVFAIL is transient (e.g. upstream hiccup) and should not mask a known-good answer that is still servable.

What changed

  • plugin/cache/handler.go: when the ncache hit is a SERVFAIL, check pcache for a valid entry before returning. If a positive entry exists and is servable (fresh or within the stale window), return it instead.
  • plugin/cache/cache_test.go: add TestServfailDoesNotShadowPositiveCache that injects both a SERVFAIL in ncache and a positive entry in pcache, then verifies the lookup returns the positive entry.

NXDOMAIN and NODATA follow the existing ncache-first order per RFC 2308 and are unaffected.

Repro (before fix)

With serve_stale enabled:

  1. upstream returns a valid A record -> cached in pcache
  2. upstream temporarily returns SERVFAIL -> cached in ncache
  3. subsequent queries return the cached SERVFAIL instead of the valid positive entry, until the SERVFAIL TTL (or stale window) expires

Fixes #7956

When serve_stale is enabled, a cached SERVFAIL in ncache shadows a valid
positive entry in pcache because ncache is always checked first. SERVFAIL
is transient and should not mask a known-good answer.

When the ncache hit is a SERVFAIL, check pcache for a valid entry before
returning the SERVFAIL. NXDOMAIN and NODATA are unaffected and still
follow the existing ncache-first lookup per RFC 2308.

Fixes coredns#7956

Signed-off-by: umut-polat <52835619+umut-polat@users.noreply.github.com>
@umut-polat umut-polat force-pushed the fix/cache-servfail-shadow branch from cf8cac1 to db53a3d Compare April 9, 2026 20:22
@yongtang yongtang merged commit b1a7fc8 into coredns:master May 20, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cache plugin: SERVFAIL in ncache can shadow valid positive responses and delay recovery (interaction with serve_stale)

2 participants