Skip to content

fix(chainlib): tolerate non-scalar JSON-RPC request ids in id validation#2302

Merged
nimrod-teich merged 2 commits into
mainfrom
fix/jsonrpc-nonscalar-id-validation
Jun 2, 2026
Merged

fix(chainlib): tolerate non-scalar JSON-RPC request ids in id validation#2302
nimrod-teich merged 2 commits into
mainfrom
fix/jsonrpc-nonscalar-id-validation

Conversation

@AnnaR-prog

Copy link
Copy Markdown
Contributor

Problem

ValidateRequestAndResponseIds parsed the request id (via IdFromRawMessage, which only accepts string/number/null) before reaching its existing "null/empty response id" carve-out. So when a client sends a non-scalar id (object/array) — invalid per JSON-RPC 2.0 — lavap hard-fails with failed parsing ID …, which surfaces to the consumer as jsonRPC ID mismatch error → failed relay, insufficient results and triggers a cross-provider retry storm, even when the relay itself was fine.

Observed in production on Polygon: a .NET client serialized a CancellationToken into the id:

{"IsCancellationRequested":false,"CanBeCanceled":false,"WaitHandle":{"Handle":{"value":6404},...}}

Captured against live nodes, the two react differently — and both were being failed:

  • Polygon rejects the object id and replies id:null (correct per spec).
  • NEAR accepts it, returns a valid result, and echoes the object id back — which lavap then discarded.

This is chain-agnostic: the same failure reproduces on any chain (confirmed by sending an object id to both gateways).

Fix

  • Reorder: evaluate the null/empty/[] response-id carve-out first, before parsing the request id (covers Polygon's id:null rejection — the node's own invalid request now passes through instead of a retry storm).
  • Semantic fallback: when either id is non-scalar, compare the ids by value (tolerant of whitespace / object key order) instead of erroring (covers NEAR's verbatim echo, preserving the valid response). A genuinely different id is still reported as a mismatch.
  • Scalar-id behaviour is unchanged.

The root cause is a non-compliant client (an object id violates JSON-RPC 2.0); this change makes lavap surface the node's real response instead of masking it behind an internal "ID mismatch".

Verification

  • New regression test TestValidateRequestAndResponseIds_NonScalarId: proven to fail without the change with the exact production error (failed parsing ID … not a string or float (id: {…CancellationToken…})) and pass with it. Covers Polygon (object→null), NEAR (object echo), whitespace tolerance, genuine object mismatch, and unchanged scalar cases.
  • Full chainlib suite + go build / go vet / gofmt clean.

Relationship to #2301

Separate root cause from #2301 (NEAR UNKNOWN_BLOCK classification). This PR addresses the Polygon-reported failure; the two are independent and touch different code paths.

🤖 Generated with Claude Code

ValidateRequestAndResponseIds parsed the request id before reaching the
"null/empty response id" carve-out. A client that sends a non-scalar id
(object/array) — invalid per JSON-RPC 2.0, e.g. a .NET CancellationToken
serialized into the id — therefore hard-failed with "failed parsing ID",
which lavap surfaces as "jsonRPC ID mismatch / insufficient results" plus a
cross-provider retry storm, even when the relay itself was fine.

Captured against live nodes: Polygon rejects the object id and replies
id:null; NEAR accepts it, returns a valid result, and echoes the object id
back. Both were being failed.

- Check the null/empty/[] response-id carve-out first, before parsing the
  request id (covers Polygon's id:null rejection).
- When either id is non-scalar, compare ids semantically (whitespace/key-order
  tolerant) instead of erroring (covers NEAR's verbatim echo, preserving the
  valid response). A genuinely different id is still reported as a mismatch.

Scalar id behaviour is unchanged. Root cause is a non-compliant client; this
makes lavap surface the node's real response instead of masking it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@qodo-code-review

Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

avitenzer
avitenzer previously approved these changes May 26, 2026
Drives the real ParseMsg -> SendNodeMsg path against a mock node with an
object request id, covering both observed node behaviours: a NEAR-like node
that echoes the id and returns a valid result (must be returned, not
discarded), and a Polygon-like node that rejects with id:null + "invalid
request" (must pass through, not become an "ID mismatch" / retry storm).

Complements the ValidateRequestAndResponseIds unit test by asserting the full
relay outcome.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov

codecov Bot commented May 26, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 81.81818% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
protocol/chainlib/node_error_handler.go 81.81% 1 Missing and 1 partial ⚠️
Flag Coverage Δ
consensus 8.96% <ø> (ø)
protocol 35.60% <81.81%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
protocol/chainlib/node_error_handler.go 71.58% <81.81%> (+4.16%) ⬆️

... and 3 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions

Copy link
Copy Markdown

Test Results

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
7 files   ±0   0 ❌ ±0 

Results for commit 6dc7071. ± Comparison against base commit 8fae505.

@nimrod-teich nimrod-teich merged commit 99f113c into main Jun 2, 2026
30 checks passed
@nimrod-teich nimrod-teich deleted the fix/jsonrpc-nonscalar-id-validation branch June 2, 2026 12:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants