Context cancelled not treated as transient, causing unintended Envoy-Proxy recreation #6849

@TomerJLevy

Description:

We encountered an unexpected recreation of the Envoy-Proxy deployment, which caused traffic disruption. After investigating, we found that a network issue triggered Envoy-Gateway to recreate the Envoy-Proxy deployment.

The logs indicate that while processing Gateways (processGateways), the system received a context canceled error. This error was not handled as a transient error, so the remaining logic incorrectly assumed that zero Gateways existed, which eventually led to the deletion of the Envoy-Proxy deployment.

Expected: These context errors (deadline exceeded or context cancelled) should be treated as transient, allowing retry.
Actual: They are treated as non-transient, which may cause unexpected failures, such as the issue we experienced and the one reported here.

Repro steps:

  1. Send a context with a tight timeout or manually cancel the request context.
  2. Observe that the context returns deadline exceeded or context cancelled.
  3. The system does not classify this as a transient error and continues with an incorrect view of the cluster state.

Environment:

EG v1.4.2

Logs:

An error correctly marked as a transient error:

2025-08-25T19:34:24.009Z ERROR provider kubernetes/controller.go:295 transient error processing gateways {"runner": "provider", "gatewayClass": "eg", "error": "failed to list : etcdserver: leader changed"}

Examples of errors not marked as transient:

2025-08-25T14:47:35.097Z ERROR provider kubernetes/controller.go:298 failed processGateways for gatewayClass eg, skipping it {"runner": "provider", "error": "failed to list : client rate limiter Wait returned an error: context canceled"}
2025-08-25T14:47:31.632Z ERROR provider kubernetes/controller.go:298 failed processGateways for gatewayClass eg, skipping it {"runner": "provider", "error": "failed to list : context canceled"}
