fix(chainlib): classify NEAR cause/data errors (UNKNOWN_BLOCK) correctly #2301
Open
AnnaR-prog wants to merge 1 commit into
avitenzer
previously approved these changes
May 26, 2026
ExtractNodeErrorDetails surfaced only .error.message for classification, but NEAR carries its canonical error name in .error.cause.name (UNKNOWN_BLOCK, UNKNOWN_CHUNK, INVALID_SHARD_ID, NOT_SYNCED_YET) while .message is just "Server error".

A pruned-block request to a non-archive NEART node therefore missed the Tier-2 NEAR matcher and fell back to the generic NODE_SERVER_ERROR rule. On builds predating the error registry it fell back to a non-retryable "unsupported method" instead, which suppressed failover to an archive provider (the reported QoS/relay failures).

Fold .name, .cause and .data into the classification message. This affects classification and telemetry only; the node error still passes through to the user unchanged (transparent hop). Tier-2 (chain-scoped) matchers run before Tier-1, so the broadened message cannot pull a chain with its own matcher into a generic rule.

Verified against the verbatim live NEART body for block_id 217272549.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
76e88c5 to
8e90ae2
Compare
Problem

ExtractNodeErrorDetails builds the message used for error classification from only .error.message (plus .error.code). NEAR carries its canonical error name in .error.cause.name. A request for a pruned block on a non-archive NEAR node returns:

{"error":{"code":-32000,"message":"Server error", "name":"HANDLER_ERROR","cause":{"name":"UNKNOWN_BLOCK"}, "data":"DB Not Found Error: BLOCK HEIGHT: ..."}}

The discriminating token (UNKNOWN_BLOCK) lives in cause.name while .message is just "Server error", so the Tier-2 NEAR matcher never sees it and the error falls back to the generic NODE_SERVER_ERROR. On builds predating the error registry (#2261) the same error was tagged "unsupported method" (non-retryable + zero-CU), which suppressed consumer failover to an archive provider. That is the QoS/relay-failure symptom reported on NEART.

All four NEAR Tier-2 tokens are cause.name values: UNKNOWN_BLOCK, UNKNOWN_CHUNK, INVALID_SHARD_ID, NOT_SYNCED_YET.

Fix

Fold .name, .cause, and .data into the message used for classification and telemetry only. The node error still passes through to the user unchanged (transparent hop). Tier-2 (chain-scoped) matchers run before Tier-1, so the broadened message cannot pull a chain with its own matcher into a generic rule.

Verification

The test TestExtractNodeErrorDetails_NEARUnknownBlock_FoldsCauseIntoClassification uses the verbatim body captured from a live non-archive NEAR testnet node. It was proven to fail without the change (NODE_SERVER_ERROR) and pass with it (CHAIN_NEAR_UNKNOWN_BLOCK, Retryable=true).

The chainlib, common, and rpcsmartrouter suites, go vet, and gofumpt are all clean.

Scope (NEAR only; Polygon tracked separately)

The original report also named Polygon. Investigation showed Polygon's failure is an unrelated JSON-RPC id-validation bug (a client sending a non-scalar id), not a classification issue. Polygon error identity lives in .message and already classifies correctly, and the id-validation bug is addressed in a separate PR. This change is NEAR-only and verified against a live NEAR testnet node.

🤖 Generated with Claude Code
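For readers without the diff open, the idea in the Fix section can be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper names classificationMessage and classify, the rpcError struct, the "UNCLASSIFIED" fallback, and the "CHAIN_NEAR_" + token labelling (patterned on the one label quoted above) are all assumptions, and the real matchers live in the error registry.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// rpcError mirrors the JSON-RPC error object in the NEAR response quoted above.
// Cause and Data stay raw because their shape differs between nodes.
type rpcError struct {
	Code    int             `json:"code"`
	Message string          `json:"message"`
	Name    string          `json:"name"`
	Cause   json.RawMessage `json:"cause"`
	Data    json.RawMessage `json:"data"`
}

// classificationMessage (hypothetical name) folds .name, .cause and .data
// into the text used for classification. The body itself is not modified.
func classificationMessage(body []byte) (string, error) {
	var env struct {
		Error *rpcError `json:"error"`
	}
	if err := json.Unmarshal(body, &env); err != nil {
		return "", err
	}
	if env.Error == nil {
		return "", nil
	}
	parts := []string{env.Error.Message}
	if env.Error.Name != "" {
		parts = append(parts, env.Error.Name)
	}
	for _, raw := range []json.RawMessage{env.Error.Cause, env.Error.Data} {
		if len(raw) > 0 && string(raw) != "null" {
			parts = append(parts, string(raw))
		}
	}
	return strings.Join(parts, " "), nil
}

// The four NEAR cause.name tokens listed in the description.
var nearTokens = []string{"UNKNOWN_BLOCK", "UNKNOWN_CHUNK", "INVALID_SHARD_ID", "NOT_SYNCED_YET"}

// classify (hypothetical) tries the chain-scoped (Tier-2) tokens first and
// only then the generic (Tier-1) rule, so the broadened message cannot
// shadow a chain-specific match.
func classify(chain, msg string) string {
	if chain == "NEAR" {
		for _, tok := range nearTokens {
			if strings.Contains(msg, tok) {
				return "CHAIN_NEAR_" + tok // assumed naming pattern
			}
		}
	}
	if strings.Contains(msg, "Server error") {
		return "NODE_SERVER_ERROR"
	}
	return "UNCLASSIFIED" // placeholder, not a registry name
}

func main() {
	body := []byte(`{"error":{"code":-32000,"message":"Server error", "name":"HANDLER_ERROR","cause":{"name":"UNKNOWN_BLOCK"}, "data":"DB Not Found Error: BLOCK HEIGHT: ..."}}`)
	msg, err := classificationMessage(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(classify("NEAR", "Server error")) // message only: NODE_SERVER_ERROR
	fmt.Println(classify("NEAR", msg))            // folded: CHAIN_NEAR_UNKNOWN_BLOCK
}
```

The first call shows the pre-fix behaviour, where only .message reaches the matcher, and the second shows the folded message hitting the NEAR token. Keeping the fold separate from the response body is what lets the node error pass through to the user unchanged.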