Skip to content

ssl.SSLCertVerificationError misclassified as local validation error #14367

@aj-nt

Description

@aj-nt

Bug

ssl.SSLCertVerificationError inherits from both OSError and ValueError (via Python MRO). The is_local_validation_error check in run_agent.py (~line 10691) catches it as isinstance(api_error, ValueError) and triggers a non-retryable abort — but a cert verification failure from a provider connection is a transport issue, not a local programming bug.

The error classifier correctly maps SSLCertVerificationError to FailoverReason.timeout, retryable=True (its type name is in _TRANSPORT_ERROR_TYPES), but the inline isinstance check overrides that.

Reproduction

import ssl
e = ssl.SSLCertVerificationError("certificate verify failed")
# is_local_validation_error logic:
isinstance(e, (ValueError, TypeError))  # True — it IS a ValueError via MRO
# So it gets flagged as non-retryable, even though classifier says retryable

Fix

Add ssl.SSLCertVerificationError (or its parent ssl.SSLError) to the exclusion list in is_local_validation_error, alongside the existing UnicodeError and json.JSONDecodeError exclusions from #14366.

Severity

Low. Cert errors rarely reach the API response handler — they are typically caught at the connection layer before the except Exception as api_error block. But if a provider returns a cert error mid-stream (e.g., TLS renegotiation failure after the HTTP request is sent), this would cause an unnecessary abort.

Pre-existing — not introduced by #14366. Discovered during red-team QA of that fix.

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2Medium — degraded but workaround existscomp/agentCore agent loop, run_agent.py, prompt buildertype/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions