[go-fan] Go Module Review: itchyny/gojq #2985

🐹 Go Fan Report: itchyny/gojq

Module Overview

github.com/itchyny/gojq is a pure Go implementation of jq, the lightweight, flexible JSON processor. Unlike shelling out to the jq binary, gojq embeds the full jq engine as a library, enabling safe Go-native JSON transformation with context cancellation, custom functions, and arbitrary-precision integers.

Version in use: v0.12.18 ✅ (latest)


Current Usage in gh-aw-mcpg

The module is used exclusively in the jq schema middleware that intercepts large MCP tool responses and replaces them with compact schema + preview + filesystem path metadata.

  • Files: 2 production files + 2 test files (all in internal/middleware/)
  • Import Count: 2 files import gojq directly (jqschema.go, jqschema_bench_test.go)
  • Key APIs Used:
    • gojq.Parse(): parse the schema filter string
    • gojq.Compile(): compile to bytecode at startup
    • (*gojq.Code).RunWithContext(): execute with timeout/cancellation
    • iter.Next(): consume results
    • *gojq.HaltError: distinguish clean halts from errors

The middleware applies a custom jq walk filter that recursively replaces JSON values with their type names (e.g., "test" → "string", 42 → "number"), keeping only the first array element for schema inference.


Research Findings

What's Done Well 🎉

The usage is already quite idiomatic and makes use of key gojq best practices:

  1. Pre-compile at init(): The filter is parsed and compiled once at startup, giving a 10–100× speedup vs re-parsing per request (documented with benchmarks in jqschema_bench_test.go).
  2. Context-aware execution: Uses RunWithContext with a 5-second default timeout, preventing hangs on malformed queries or deeply nested payloads.
  3. HaltError handling: Correctly distinguishes a *gojq.HaltError whose Value() is nil (clean halt) from error halts, including exit code reporting.
  4. Thread-safe usage: *gojq.Code is safe for concurrent use; the pre-compiled global is correctly shared across goroutines.

Recent Updates (v0.12.18)

  • Array element limit raised to 2^29 (536,870,912) elements
  • Improved concurrent execution performance (directly benefits the pre-compiled global pattern)
  • Enhanced type error messages (referenced in HaltError handling comments)

Best Practices from Maintainers

  • Compile once with gojq.Compile, run many times: ✅ already done
  • Use RunWithContext for cancellation: ✅ already done
  • Use gojq.WithFunction for Go-native extensions: 🔲 not yet leveraged
  • Use gojq.WithVariables for parameterized filters: 🔲 not yet leveraged

Improvement Opportunities

🏃 Quick Wins

1. Deduplicate identical test helper functions

payloadMetadataToMap in jqschema_test.go and integrationPayloadMetadataToMap in jqschema_integration_test.go are byte-for-byte identical functions living in the same package middleware. One can be removed and tests updated to use the single shared version.

2. UTF-8-safe preview truncation

// Current: slices at a byte boundary
preview = string(payloadJSON[:PayloadPreviewSize]) + "..."

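This slice can cut a multi-byte UTF-8 character in half. ASCII-only payloads are unaffected, but if payloadJSON comes from json.Marshal, non-ASCII characters are emitted as raw UTF-8 bytes rather than \uXXXX escapes, so payloads containing non-ASCII text are exposed to this. A comment, or a UTF-8-aware boundary check (for example utf8.RuneStart to back up to a rune boundary, or utf8.Valid to verify the result), would make the intent explicit and defensive.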
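A minimal sketch of such a boundary check, using only the standard library. The helper name truncateUTF8 and the sample payload are illustrative assumptions, not code from the repository; the idea is to back up from the byte limit until the cut lands on the first byte of a rune.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateUTF8 (hypothetical helper) returns at most limit bytes of b as a
// string, backing up to the nearest rune boundary so that a multi-byte
// character is never split. The boolean reports whether anything was cut.
func truncateUTF8(b []byte, limit int) (string, bool) {
	if limit < 0 {
		limit = 0
	}
	if len(b) <= limit {
		return string(b), false
	}
	end := limit
	// b[end] is the first byte that would be dropped. If it is a UTF-8
	// continuation byte, the cut falls inside a rune, so move left.
	for end > 0 && !utf8.RuneStart(b[end]) {
		end--
	}
	return string(b[:end]), true
}

func main() {
	payload := []byte(`{"name":"héllo"}`)
	preview, truncated := truncateUTF8(payload, 11) // byte 11 is inside "é"
	if truncated {
		preview += "..."
	}
	fmt.Println(preview)
}
```

In the middleware this would stand in for the plain payloadJSON[:PayloadPreviewSize] slice; whether the extra loop is worth it depends on how often payloads contain non-ASCII text.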

3. Document why the iterator is consumed only once

After iter.Next(), the iterator is not drained. This is correct (gojq is lazy, the filter produces exactly one result), but a brief comment explaining the single-result contract would help future maintainers.


✨ Feature Opportunities

1. gojq.WithFunction for richer schema inference

Currently, all numeric values map to "number". Using custom Go functions, the schema could distinguish integers from floats:

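// Requires: import "math/big". query is the *gojq.Query returned by gojq.Parse.
code, err := gojq.Compile(query, gojq.WithFunction("isinteger", 0, 0,
    func(v any, _ []any) any {
        switch v.(type) {
        case int, *big.Int: // gojq's integer representations
            return true
        }
        // Numbers decoded by encoding/json arrive as float64, even 42,
        // so they are reported as non-integers here.
        return false
    }))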

This would enable schemas like {"id": "integer", "price": "float"}, which are more useful for downstream consumers.
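One wrinkle is that input decoded with encoding/json carries every number as float64, so a type check on int alone would never fire for such payloads. The sketch below shows a classification helper that a custom function callback could delegate to. The name numberKind is hypothetical, and treating whole-valued floats as integers is an assumption: it means 1.0 and 1 cannot be told apart.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"math/big"
)

// numberKind (hypothetical helper) classifies a decoded JSON number as
// "integer" or "float", and returns "" for anything that is not a number.
func numberKind(v any) string {
	switch n := v.(type) {
	case int, *big.Int:
		return "integer"
	case float64:
		// encoding/json decodes every JSON number into float64, so a
		// whole-valued float is treated as an integer here (assumption).
		if !math.IsInf(n, 0) && n == math.Trunc(n) {
			return "integer"
		}
		return "float"
	}
	return ""
}

func main() {
	var doc map[string]any
	if err := json.Unmarshal([]byte(`{"id": 42, "price": 9.99}`), &doc); err != nil {
		panic(err)
	}
	fmt.Println(numberKind(doc["id"]), numberKind(doc["price"])) // prints: integer float
}
```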

2. Schema merging for heterogeneous arrays

The current filter takes only the first element of each array for schema inference:

elif type == "array" then
  if length == 0 then [] else [.[0] | walk(f)] end

For heterogeneous arrays (mixed types), this misses variation. A reduce-based merge would produce more comprehensive type signatures. Worth considering for deeply varied API responses.
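To make the idea concrete, here is a sketch of the merge written in plain Go rather than as a jq reduce, so it illustrates the intended result and is not a drop-in filter change. All function names are hypothetical, and the "a|b" notation for a union of type names is an assumption about how mixed types might be reported.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// inferSchema replaces values with type names but merges ALL array elements.
func inferSchema(v any) any {
	switch t := v.(type) {
	case nil:
		return "null"
	case bool:
		return "boolean"
	case float64:
		return "number"
	case string:
		return "string"
	case map[string]any:
		out := make(map[string]any, len(t))
		for k, e := range t {
			out[k] = inferSchema(e)
		}
		return out
	case []any:
		if len(t) == 0 {
			return []any{}
		}
		merged := inferSchema(t[0])
		for _, e := range t[1:] {
			merged = mergeSchema(merged, inferSchema(e))
		}
		return []any{merged}
	}
	return "unknown"
}

// mergeSchema unions object keys, merges array element schemas, else unions names.
func mergeSchema(a, b any) any {
	switch x := a.(type) {
	case map[string]any:
		if y, ok := b.(map[string]any); ok {
			out := make(map[string]any, len(x)+len(y))
			for k, v := range x {
				out[k] = v
			}
			for k, v := range y {
				if prev, seen := out[k]; seen {
					v = mergeSchema(prev, v)
				}
				out[k] = v
			}
			return out
		}
	case []any:
		if y, ok := b.([]any); ok {
			if len(x) == 0 || len(y) == 0 {
				return append(x, y...) // at least one side is empty: keep the other
			}
			return []any{mergeSchema(x[0], y[0])}
		}
	}
	return unionTypes(a, b)
}

// unionTypes joins the type names of schema nodes into a sorted "a|b" union.
func unionTypes(nodes ...any) string {
	seen := map[string]bool{}
	for _, n := range nodes {
		switch t := n.(type) {
		case map[string]any:
			seen["object"] = true
		case []any:
			seen["array"] = true
		case string:
			for _, name := range strings.Split(t, "|") {
				seen[name] = true
			}
		}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return strings.Join(names, "|")
}

func main() {
	var payload any
	if err := json.Unmarshal([]byte(`[{"id": 1, "tag": "a"}, {"id": "x", "extra": true}]`), &payload); err != nil {
		panic(err)
	}
	out, _ := json.Marshal(inferSchema(payload))
	fmt.Println(string(out)) // prints [{"extra":"boolean","id":"number|string","tag":"string"}]
}
```

The trade-off is cost: merging visits every array element, whereas the first-element approach is constant per array, so a cap on the number of elements inspected may be worth considering for large responses.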

3. gojq.WithVariables for a configurable filter

Hard-coded constants like array depth or preview behavior could be exposed as jq variables, enabling runtime configuration without recompilation:

gojq.Compile(query, gojq.WithVariables([]string{"$maxArrayElements"}))
// then:
jqSchemaCode.RunWithContext(ctx, data, maxArrayElements)

📐 Best Practice Alignment

1. Explain the custom walk vs built-in

The code defines its own walk function rather than using gojq's built-in walk(f). This is intentional and correct: the custom walk replaces leaf values with type names and collapses arrays to one element, behaviors incompatible with standard walk(f) semantics. A comment in the filter string explaining why the custom implementation is needed would prevent well-meaning future simplifications that would break functionality.

2. Expose compile error for health checks

jqSchemaCompileErr is unexported. A small exported HealthCheck() error (or inclusion in the gateway's /health endpoint) would surface this startup failure to monitoring systems before the first tool call fails.
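A minimal sketch of what such an accessor could look like. The variable name jqSchemaCompileErr comes from the report; the error message, the errExample value, and the demo in main are illustrative assumptions, and wiring the result into a /health handler is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// jqSchemaCompileErr stands in for the unexported package variable that
// records a failed parse/compile at startup (nil when compilation succeeded).
var jqSchemaCompileErr error

// errExample is a made-up failure used only by the demo below.
var errExample = errors.New("example compile failure")

// HealthCheck reports whether the schema filter compiled at startup.
// It wraps the original error so callers can still inspect it with errors.Is.
func HealthCheck() error {
	if jqSchemaCompileErr != nil {
		return fmt.Errorf("jq schema filter failed to compile: %w", jqSchemaCompileErr)
	}
	return nil
}

func main() {
	fmt.Println("healthy:", HealthCheck() == nil)

	jqSchemaCompileErr = errExample // simulate a startup failure
	fmt.Println("healthy:", HealthCheck() == nil)
	fmt.Println(HealthCheck())
}
```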


🔧 General Improvements

1. Avoid double JSON round-trip for native Go types

In WrapToolHandler, the code does:

payloadJSON, _ := json.Marshal(data)        // struct → JSON bytes
// ...
json.Unmarshal(payloadJSON, &jsonData)      // JSON bytes → map[string]interface{}

If data is already map[string]interface{} or []interface{}, the unmarshal is redundant. A type-switch before the unmarshal step could skip it for native types, saving allocations on hot paths.
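The type switch might look like the sketch below. The helper name jqInput is hypothetical, and the shortcut assumes that a generic map or slice contains only plain JSON types all the way down; a map holding nested structs would still need the round trip.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jqInput (hypothetical helper) returns a value built from generic JSON
// types. Generic maps and slices pass through untouched; any other value is
// decoded from the already-marshalled payloadJSON.
func jqInput(data any, payloadJSON []byte) (any, error) {
	switch v := data.(type) {
	case map[string]any, []any:
		return v, nil // already generic: skip the unmarshal
	}
	var decoded any
	if err := json.Unmarshal(payloadJSON, &decoded); err != nil {
		return nil, fmt.Errorf("decode payload: %w", err)
	}
	return decoded, nil
}

func main() {
	type result struct {
		Name string `json:"name"`
	}
	for _, data := range []any{map[string]any{"name": "a"}, result{Name: "b"}} {
		// The marshalled bytes are still needed for the preview and the file,
		// so only the unmarshal step is conditional.
		payloadJSON, err := json.Marshal(data)
		if err != nil {
			panic(err)
		}
		input, err := jqInput(data, payloadJSON)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%T -> %T\n", data, input)
	}
}
```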


Recommendations

Priority    Action
🟢 Low      Remove duplicate integrationPayloadMetadataToMap test helper
🟢 Low      Add comment explaining custom walk vs built-in
🟡 Medium   Add UTF-8 boundary comment/assertion for preview truncation
🟡 Medium   Type-switch optimization in WrapToolHandler to skip redundant unmarshal
🔵 Future   gojq.WithFunction for integer/float distinction in schema
🔵 Future   Schema merging across heterogeneous array elements
🔵 Future   Expose jqSchemaCompileErr via health check endpoint

Next Steps

  • Remove integrationPayloadMetadataToMap and use the shared payloadMetadataToMap across both test files
  • Add a comment block in jqSchemaFilter explaining why a custom walk is used instead of the built-in
  • Consider a gojq.WithFunction-based approach for richer schema typing in a future enhancement

Generated by Go Fan 🐹
Module summary saved to: specs/mods/gojq.md (in cache)
Run ID: §23837524610

Note

🔒 Integrity filter blocked 12 items

The following items were blocked because they don't meet the GitHub integrity level.

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by Go Fan · ◷

  • expires on Apr 8, 2026, 7:44 AM UTC
