Skip to content

plugin/forward: add max_age option to enforce an absolute connection lifetime#7903

Merged
yongtang merged 2 commits into
coredns:masterfrom
cangming:forward-max-age
Mar 9, 2026
Merged

plugin/forward: add max_age option to enforce an absolute connection lifetime#7903
yongtang merged 2 commits into
coredns:masterfrom
cangming:forward-max-age

Conversation

@cangming

@cangming cangming commented Mar 6, 2026

Copy link
Copy Markdown
Contributor

1. Why is this pull request needed and what does it do?

Background

PR #7790 changed the connection pool from LIFO to FIFO for source-port diversity. Under FIFO every connection is cycled through the pool on each request, continuously refreshing its used timestamp. When request rate is high enough that pool_size / request_rate < expire, no connection ever becomes idle and expire (which is based on last-used time) never fires.

This creates a regression in Kubernetes environments using ClusterIP services: because of conntrack pinning, each existing connection is pinned to one specific backend pod. After a transient network disruption or a scale-out event, affected pods receive no new traffic from CoreDNS until old connections are closed and re-dialled. Before #7790 (LIFO), the expire idle-timeout would eventually clear old connections during any quiet period. Under FIFO at sustained load, no such recovery occurs.

The following graph captures this regression in production. CoreDNS 1.14.1 forwards to a Kubernetes ClusterIP service backed by multiple bind9 pods. Around 16:50, one upstream pod experienced a brief network disruption. The blue line (that pod's query count) drops sharply to near zero. Despite the pod recovering quickly, the traffic imbalance persisted for over 55 minutes — CoreDNS never re-opened connections to that pod because the FIFO-rotated connection pool kept refreshing used timestamps, preventing expire from ever firing.

image

Bind Query Count — Each line represents query traffic to one upstream bind9 pod behind a ClusterIP service. At 16:50, the blue pod suffered a brief network disruption. While the pod recovered within seconds, CoreDNS continued directing negligible traffic to it for the remainder of the observation window (~55 min), demonstrating that expire alone cannot restore balance under sustained FIFO rotation.

What this PR does

Adds a max_age option to the forward plugin (default 0 = unlimited, preserving existing behaviour) that closes connections based on creation time rather than idle time. When configured, connections older than max_age are rejected from the cache and re-dialled, ensuring traffic rebalances to recovered or newly scaled-out upstreams regardless of request rate.

Implementation details:

  • persistConn gains a created field, set once at dial time.
  • Transport gains maxAge (zero value = disabled) and SetMaxAge().
  • Dial(): checks pc.created against the max-age deadline before returning a cached connection. When max_age fires, cleanup() (which runs every defaultExpire) typically clears stale connections before Dial() encounters them, keeping the hot path O(1); the O(n) scan in Dial() is an existing characteristic of the FIFO pool and is not worsened by this change.
  • cleanup(): when maxAge > 0, a linear scan evaluates both the idle-timeout (used) and max-age (created) predicates; when maxAge == 0 the original O(log n) binary-search path is fully preserved.
  • Both hot paths pre-compute the deadline (time.Now().Add(-maxAge)) outside inner loops to avoid repeated time.Now() calls.
  • The parser rejects max_age < expire at startup with a clear error message, since a max-age shorter than the idle-timeout would make expire unreachable under active load.
  • Two regression tests added: TestMaxAgeExpireByCreation (fresh used, old created must be rejected) and TestMaxAgeFIFORotation (FIFO-rotated pool must not bypass max-age).

Example Corefile

forward . 1.2.3.4 {
    max_age 30s
}

2. Which issues (if any) are related?

Related to #6804, which proposes a similar max_connection_age option. Key differences from that PR:

#6804 This PR
Option name max_connection_age max_age (consistent with expire style)
expire handling Deprecates expire, introduces idle_timeout as replacement expire fully preserved, zero changes to its semantics
Scope Bundles LIFO→FIFO change with max-age LIFO→FIFO already landed in #7790; this PR only adds max_age
cleanup() scan Always linear O(n) O(log n) binary search preserved when max_age is not set
Root cause context Written before #7790 was merged Specifically addresses the FIFO idle-timeout regression introduced by #7790
Validation Not mentioned max_age < expire rejected at parse time

The main motivation for a separate PR is that #6804's deprecation of expire is a backward-incompatible argument change, whereas this PR is purely additive.

3. Which documentation changes (if any) need to be made?

plugin/forward/README.md should document the new option. Suggested additions:

In the syntax block:

    max_age DURATION

In the description list:

* `max_age` **DURATION**, close cached connections that have been open longer than this duration,
  regardless of activity. Default is 0 (unlimited). Must not be less than `expire` if set.
  Useful for ensuring CoreDNS reconnects to newly scaled-out or recovered upstreams when
  `expire` (idle-timeout) cannot fire under sustained load.

4. Does this introduce a backward incompatible change or deprecation?

No. The default value is 0 (unlimited), identical to the current behaviour. All existing configurations continue to work without any change. expire semantics are completely untouched.

@thevilledev thevilledev left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch @cangming! I think this is the good approach. Just one minor comment, other than that LGTM.

Comment thread plugin/pkg/proxy/persistent.go Outdated
cangming added 2 commits March 7, 2026 12:26
Introduce a max_age setting on Transport that closes connections based
on creation time, independent of idle-timeout (expire).

Background: PR coredns#7790 changed the connection pool from LIFO to FIFO for
source-port diversity. Under FIFO, every connection is cycled through
the pool and its used timestamp is refreshed continuously. When request
rate is high enough that pool_size / request_rate < expire, no
connection ever becomes idle and expire never fires. This prevents
CoreDNS from opening new connections to upstreams that scale out (e.g.
new Kubernetes pods behind a ClusterIP service with conntrack pinning).

max_age addresses this by enforcing an absolute upper bound on
connection lifetime regardless of activity:
- persistConn gains a created field set at dial time.
- Transport gains maxAge (default 0 = unlimited, preserving existing
  behaviour).
- Dial(): rejects cached connections whose creation age exceeds max_age.
- cleanup(): when maxAge > 0, uses a linear scan over both idle-timeout
  and max-age predicates; when maxAge == 0, preserves the original
  binary-search path on used time (sorted by FIFO insertion order).
- Both hot paths pre-compute the deadline outside any inner loop to
  avoid repeated time.Now() calls.

Tests added:
- TestMaxAgeExpireByCreation: connection with old created but fresh used
  must be rejected even though idle-timeout would pass.
- TestMaxAgeFIFORotation: three FIFO-rotated connections (old created,
  fresh used) must all be rejected, confirming that continuous rotation
  cannot prevent max-age expiry.

Signed-off-by: cangming <cangming@cangming.app>
Expose Transport.SetMaxAge through the forward plugin so operators can
set an absolute upper bound on connection lifetime via the Corefile.

Usage:
  forward . 1.2.3.4 {
      max_age 30s
  }

Default is 0 (unlimited), which preserves existing behaviour.
A positive value causes connections older than max_age to be closed and
re-dialled on the next request, ensuring CoreDNS reconnects to newly
scaled-out upstream pods even under sustained load where the idle
timeout (expire) would never fire.

If max_age is set, it must not be less than expire; the parser rejects
this combination at startup with a clear error message.

Signed-off-by: cangming <cangming@cangming.app>
@yongtang yongtang merged commit 500707c into coredns:master Mar 9, 2026
11 checks passed
yongtang pushed a commit to yongtang/coredns that referenced this pull request Mar 18, 2026
…lifetime (coredns#7903)

* plugin/pkg/proxy: add max_age for per-connection lifetime cap

Introduce a max_age setting on Transport that closes connections based
on creation time, independent of idle-timeout (expire).

Background: PR coredns#7790 changed the connection pool from LIFO to FIFO for
source-port diversity. Under FIFO, every connection is cycled through
the pool and its used timestamp is refreshed continuously. When request
rate is high enough that pool_size / request_rate < expire, no
connection ever becomes idle and expire never fires. This prevents
CoreDNS from opening new connections to upstreams that scale out (e.g.
new Kubernetes pods behind a ClusterIP service with conntrack pinning).

max_age addresses this by enforcing an absolute upper bound on
connection lifetime regardless of activity:
- persistConn gains a created field set at dial time.
- Transport gains maxAge (default 0 = unlimited, preserving existing
  behaviour).
- Dial(): rejects cached connections whose creation age exceeds max_age.
- cleanup(): when maxAge > 0, uses a linear scan over both idle-timeout
  and max-age predicates; when maxAge == 0, preserves the original
  binary-search path on used time (sorted by FIFO insertion order).
- Both hot paths pre-compute the deadline outside any inner loop to
  avoid repeated time.Now() calls.

Tests added:
- TestMaxAgeExpireByCreation: connection with old created but fresh used
  must be rejected even though idle-timeout would pass.
- TestMaxAgeFIFORotation: three FIFO-rotated connections (old created,
  fresh used) must all be rejected, confirming that continuous rotation
  cannot prevent max-age expiry.

Signed-off-by: cangming <cangming@cangming.app>

* plugin/forward: add max_age option

Expose Transport.SetMaxAge through the forward plugin so operators can
set an absolute upper bound on connection lifetime via the Corefile.

Usage:
  forward . 1.2.3.4 {
      max_age 30s
  }

Default is 0 (unlimited), which preserves existing behaviour.
A positive value causes connections older than max_age to be closed and
re-dialled on the next request, ensuring CoreDNS reconnects to newly
scaled-out upstream pods even under sustained load where the idle
timeout (expire) would never fire.

If max_age is set, it must not be less than expire; the parser rejects
this combination at startup with a clear error message.

Signed-off-by: cangming <cangming@cangming.app>

---------

Signed-off-by: cangming <cangming@cangming.app>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants