fix: exclude json.JSONDecodeError from local validation error check#14366
Closed
aj-nt wants to merge 1 commit into
Closed
fix: exclude json.JSONDecodeError from local validation error check#14366aj-nt wants to merge 1 commit into
aj-nt wants to merge 1 commit into
Conversation
625726a to
6e0c7c3
Compare
Collaborator
This was referenced Apr 23, 2026
6e0c7c3 to
91e3dca
Compare
…ion error check json.JSONDecodeError inherits from ValueError. When the OpenAI SDK fails to parse a provider response (empty body, truncated JSON), it raises json.JSONDecodeError. The is_local_validation_error check in run_agent.py catches it as isinstance(api_error, ValueError) and triggers a non-retryable abort -- but the error is transient (provider-side), not a programming bug. Similarly, UnicodeDecodeError (also ValueError subclass) from garbled provider responses was incorrectly flagged. The existing code only excluded UnicodeEncodeError. The error classifier already correctly returns retryable=True for these errors; the bug was the isinstance check overriding the classifier. Fix: Exclude json.JSONDecodeError and UnicodeError (parent of Encode/Decode/Translate) from the is_local_validation_error check, so they fall through to the classifier's retryable path. Closes NousResearch#14271 Co-authored-by: AJ <aj@users.noreply.github.com>
91e3dca to
cc92a82
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Closes #14271
json.JSONDecodeErrorinherits fromValueError. When the OpenAI SDK fails to parse a provider response (empty body, truncated JSON, garbled bytes), it raisesjson.JSONDecodeError. Theis_local_validation_errorcheck inrun_agent.pycatches it asisinstance(api_error, ValueError)and triggers a non-retryable abort — but the error is transient (provider-side), not a programming bug.The error classifier (
agent/error_classifier.py) already correctly returnsFailoverReason.unknown, retryable=Truefor these errors. The bug is the inlineisinstancecheck that overrides the classifier.Secondary bug found
UnicodeDecodeError(also aValueErrorsubclass) from garbled provider responses gets the same wrong treatment. The existing code only excludedUnicodeEncodeErrorbut a garbled response body can also causeUnicodeDecodeError.Fix
Updated
is_local_validation_errorpredicate inrun_agent.py:Before (upstream had only):
After (this PR, merged with upstream ssl.SSLError fix from #14367):
Three exclusions from the
ValueErrorcatch:UnicodeError(parent class) coversUnicodeEncodeError,UnicodeDecodeError, andUnicodeTranslateError— all indicate provider-side transport issuesjson.JSONDecodeError— malformed/truncated provider response bodyssl.SSLError— TLS transport failure (inherits ValueError via Python MRO, cherry-picked from upstream ssl.SSLCertVerificationError misclassified as local validation error #14367 / commit 4e27e49)Testing
9 tests in
tests/test_json_decode_error_misclassification.py:json.JSONDecodeError,UnicodeDecodeError,UnicodeTranslateErrormust NOT be classified as local validation errors; classifier must returnretryable=TrueValueError,TypeError,UnicodeEncodeError, andssl.SSLErrorstill correctly excludedCOUPLING note in both production and test file — update both if changing the predicate.
Rebase notes
Rebased onto
origin/main(commit 4fade39). Conflict inrun_agent.pyresolved by merging ourUnicodeErrorbroadening with upstream'sssl.SSLErrorexclusion — both are complementary fixes for the same class of bug (ValueError subclasses from provider transport, not local code).Impact
Low risk, high value. The change only affects the
is_local_validation_errorboolean — the classifier pipeline and all other recovery paths are untouched. GenuineValueError/TypeErrorfrom coding bugs still trigger non-retryable abort. Only provider-originatedValueErrorsubclasses now correctly fall through to the retryable recovery path.