Skip to content

fix(core): remove custom JSON field scanner#201

Merged
SantiagoDePolonia merged 1 commit intomainfrom
fix/json-unknown-fields-parser
Apr 1, 2026
Merged

fix(core): remove custom JSON field scanner#201
SantiagoDePolonia merged 1 commit intomainfrom
fix/json-unknown-fields-parser

Conversation

@SantiagoDePolonia
Copy link
Copy Markdown
Contributor

@SantiagoDePolonia SantiagoDePolonia commented Apr 1, 2026

Summary

  • replace the custom byte-by-byte JSON scanner used for unknown field preservation
  • preserve unknown top-level JSON members via raw gjson iteration and remove redundant scanner helpers
  • add regression coverage for invalid JSON and duplicate unknown keys

Testing

  • go test ./internal/core/...
  • go test ./tests/perf -run TestHotPathPerfGuard
  • go test ./... # existing unrelated failure in internal/usage: TestSQLiteStoreCleanup_KeepsNewerLegacyOffsetRows

Summary by CodeRabbit

  • Bug Fixes

    • Improved JSON validation to properly reject malformed input with explicit error messages.
    • Enhanced handling of duplicate unknown fields in JSON objects, ensuring they are correctly preserved and accessible.
  • Tests

    • Updated test coverage for JSON field extraction with additional validation for edge cases.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 1, 2026

📝 Walkthrough

Walkthrough

Refactored JSON unknown-fields extraction to replace custom byte-scanning with the gjson library, improving code clarity and maintainability. Updated tests to reflect the new implementation and removed obsolete scanning helpers.

Changes

Cohort / File(s) Summary
JSON Fields Extraction Refactoring
internal/core/json_fields.go
Replaced extractUnknownJSONFieldsObjectByScan with gjson-based extractUnknownJSONFields. Added unknownJSONFieldsFromMap helper with optional value cloning. Updated UnknownJSONFields.Lookup to call dec.Token() directly. Removed custom byte-scanning and validation helpers.
Test Updates
internal/core/json_fields_test.go
Renamed test coverage from extractUnknownJSONFieldsObjectByScan to extractUnknownJSONFields. Added new test for duplicate unknown key preservation. Extended invalid-syntax coverage with trailing data rejection case.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Scanning bytes was quite a chore,
Now gjson opens logic's door,
Less custom code, more clarity—
JSON parsing's legendary!
Hop along, refactor's done,
Cleaner code means more fun! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(core): remove custom JSON field scanner' accurately describes the main change: replacing a custom byte-scanning JSON implementation with a gjson-based one and removing the old scanner code.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/json-unknown-fields-parser

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/core/json_fields_test.go`:
- Around line 59-72: Add a brief comment above
TestExtractUnknownJSONFields_PreservesDuplicateUnknownKeys (and/or near
extractUnknownJSONFields) stating that preserving duplicate JSON object keys is
an intentional design choice for passthrough/round-trip of provider payloads
despite JSON RFC guidance; mention that this behavior ensures raw (fields.raw)
retains duplicates and Lookup returns the first occurrence. Keep the note
concise and reference the rationale (round-tripping provider payloads) so future
maintainers know this is deliberate rather than accidental.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6ec39f31-8c8f-42d1-8676-e9d84a9ed91a

📥 Commits

Reviewing files that changed from the base of the PR and between 0ebac81 and 93a5422.

📒 Files selected for processing (2)
  • internal/core/json_fields.go
  • internal/core/json_fields_test.go

Comment on lines +59 to +72
func TestExtractUnknownJSONFields_PreservesDuplicateUnknownKeys(t *testing.T) {
data := []byte(`{"known":"value","x_meta":1,"x_meta":2}`)

fields, err := extractUnknownJSONFields(data, "known")
if err != nil {
t.Fatalf("extractUnknownJSONFields() error = %v", err)
}
if got := string(fields.raw); got != `{"x_meta":1,"x_meta":2}` {
t.Fatalf("raw = %s, want duplicate keys preserved", got)
}
if got := fields.Lookup("x_meta"); !bytes.Equal(got, []byte("1")) {
t.Fatalf("Lookup(x_meta) = %s, want first duplicate value", got)
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Document that duplicate-key preservation is intentional for passthrough scenarios.

JSON with duplicate keys has undefined behavior per RFC 7159 (§4: "names within an object SHOULD be unique"). This test asserts preservation which is valuable for round-tripping provider payloads, but a brief comment explaining the design choice would help future maintainers understand this isn't accidental tolerance.

📝 Suggested documentation comment
 func TestExtractUnknownJSONFields_PreservesDuplicateUnknownKeys(t *testing.T) {
+	// JSON with duplicate keys is technically non-conformant (RFC 7159 §4), but
+	// we preserve them to faithfully round-trip provider payloads that may
+	// contain such quirks. Lookup returns the first occurrence.
 	data := []byte(`{"known":"value","x_meta":1,"x_meta":2}`)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func TestExtractUnknownJSONFields_PreservesDuplicateUnknownKeys(t *testing.T) {
data := []byte(`{"known":"value","x_meta":1,"x_meta":2}`)
fields, err := extractUnknownJSONFields(data, "known")
if err != nil {
t.Fatalf("extractUnknownJSONFields() error = %v", err)
}
if got := string(fields.raw); got != `{"x_meta":1,"x_meta":2}` {
t.Fatalf("raw = %s, want duplicate keys preserved", got)
}
if got := fields.Lookup("x_meta"); !bytes.Equal(got, []byte("1")) {
t.Fatalf("Lookup(x_meta) = %s, want first duplicate value", got)
}
}
func TestExtractUnknownJSONFields_PreservesDuplicateUnknownKeys(t *testing.T) {
// JSON with duplicate keys is technically non-conformant (RFC 7159 §4), but
// we preserve them to faithfully round-trip provider payloads that may
// contain such quirks. Lookup returns the first occurrence.
data := []byte(`{"known":"value","x_meta":1,"x_meta":2}`)
fields, err := extractUnknownJSONFields(data, "known")
if err != nil {
t.Fatalf("extractUnknownJSONFields() error = %v", err)
}
if got := string(fields.raw); got != `{"x_meta":1,"x_meta":2}` {
t.Fatalf("raw = %s, want duplicate keys preserved", got)
}
if got := fields.Lookup("x_meta"); !bytes.Equal(got, []byte("1")) {
t.Fatalf("Lookup(x_meta) = %s, want first duplicate value", got)
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/json_fields_test.go` around lines 59 - 72, Add a brief comment
above TestExtractUnknownJSONFields_PreservesDuplicateUnknownKeys (and/or near
extractUnknownJSONFields) stating that preserving duplicate JSON object keys is
an intentional design choice for passthrough/round-trip of provider payloads
despite JSON RFC guidance; mention that this behavior ensures raw (fields.raw)
retains duplicates and Lookup returns the first occurrence. Keep the note
concise and reference the rationale (round-tripping provider payloads) so future
maintainers know this is deliberate rather than accidental.

@SantiagoDePolonia SantiagoDePolonia merged commit 2e67888 into main Apr 1, 2026
16 checks passed
@SantiagoDePolonia SantiagoDePolonia deleted the fix/json-unknown-fields-parser branch April 1, 2026 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant