[go-fan] Go Module Review: itchyny/gojq #2147

@github-actions

🐹 Go Fan Report: itchyny/gojq

Module Overview

github.com/itchyny/gojq (v0.12.18) is a pure Go implementation of the jq JSON processor. It provides a programmatic API to parse, compile, and run jq queries entirely in-process, with no CGO and no subprocess spawning. In gh-aw-mcpg it powers the jqschema middleware that transforms large MCP tool-response payloads into schema summaries, enabling agents to understand response structure without consuming an entire large payload.

Current Usage in gh-aw

The module is used exclusively in internal/middleware/jqschema.go and its benchmark test.

  • Files: 2 files (jqschema.go, jqschema_bench_test.go)
  • Import Count: 2
  • Key APIs Used:
    • gojq.Parse(): parse jq query string into AST
    • gojq.Compile(): compile parsed AST into a reusable *gojq.Code
    • code.RunWithContext(ctx, input) β€” execute query with context cancellation
    • iter.Next() β€” retrieve result from iterator
    • *gojq.HaltError β€” type-assert and inspect jq halt conditions

The integration is well-structured: the schema filter is compiled once at init() and reused for all requests, which is the canonical high-performance gojq pattern. Context-based timeouts (5-second default) are correctly applied. The HaltError type is properly handled with nil-value vs. non-nil-value differentiation.

Research Findings

How it works

The middleware applies this jq filter to recursively walk a JSON payload and replace every leaf value with its type name ("string", "number", "boolean", "null"). Arrays are collapsed to a single-element schema. The result is a compact structural schema clients can use to understand response shape before fetching the full payload from disk.

def walk(f):
  . as $in |
  if type == "object" then
    reduce keys[] as $k ({}; . + {($k): ($in[$k] | walk(f))})
  elif type == "array" then
    if length == 0 then [] else [.[0] | walk(f)] end
  else
    type
  end;
walk(.)

Best Practices (from gojq maintainers)

  • Compile once, run many ✅ (already done)
  • Use RunWithContext ✅ (already done)
  • Use gojq.WithFunction for performance-critical paths (not yet explored)
  • Drain iterators (not done, minor gap)

Improvement Opportunities

πŸƒ Quick Wins

  1. truncated field missing from PayloadMetadata: The truncated boolean is computed (line 362) and used for logging, but never returned to clients. Adding it to PayloadMetadata gives agents a reliable signal that payloadPreview was cut:

    type PayloadMetadata struct {
        // ... existing fields ...
        Truncated bool `json:"truncated"`
    }

  2. Unused f parameter in walk(f): The jq filter defines walk(f) with an f arg that is passed recursively but never applied to leaf nodes (leaf nodes always return type). This deviates silently from standard jq's walk(f) semantics. Renaming to def schemaWalk: (no args) removes ambiguity, or at minimum a comment should clarify the intentional deviation.

  3. Stale middleware README: internal/middleware/README.md documents the old payload path as /tmp/gh-awmg/tools-calls/{randomID}/payload.json. The actual layout since the session-isolation refactor is {baseDir}/{sessionID}/{queryID}/payload.json. The README should be updated to match.

✨ Feature Opportunities

  1. Native Go schema function via gojq.WithFunction: gojq supports registering Go functions callable from jq filters. The schema transformation (walk + type replacement) could be implemented as a registered Go function, bypassing jq interpretation overhead for this specific case. This would make the hot path a direct Go tree-walk, potentially 5–20× faster for deeply nested payloads:

    gojq.Compile(query, gojq.WithFunction("goschema", 0, 0,
        func(v interface{}, _ []interface{}) interface{} {
            return inferSchema(v)
        }))

  2. Drain the iterator: After iter.Next() returns the schema, the iterator is abandoned. While the current filter always produces exactly one output, a defensive drain loop (for { if _, ok := iter.Next(); !ok { break } }) is good practice and protects against behavior changes if the filter is modified.

πŸ“ Best Practice Alignment

  1. Double JSON round-trip explanation: In WrapToolHandler, data interface{} is marshaled to JSON bytes then immediately unmarshaled back to interface{} before being passed to applyJqSchema. This is functionally correct (it normalizes Go structs to map[string]interface{} that gojq can process via its standard path), but a comment explaining the intent would prevent future readers from viewing it as a bug. The optimization of checking whether data is already a plain map[string]interface{} and skipping the round-trip could also be considered.

Recommendations (Prioritized)

| Priority | Item | Effort |
|---|---|---|
| 🔴 High | Add truncated to PayloadMetadata (clients need this signal) | ~5 min |
| 🟡 Medium | Fix stale README path documentation | ~5 min |
| 🟡 Medium | Rename walk(f) → schemaWalk + add comment | ~10 min |
| 🟢 Low | Drain the iterator defensively | ~5 min |
| 🟢 Low | Add comment explaining double JSON round-trip | ~5 min |
| ⬜ Future | Explore gojq.WithFunction for native Go schema inference | ~2 hours |

Next Steps

  • Add Truncated bool `json:"truncated"` to the PayloadMetadata struct and populate it in WrapToolHandler
  • Update internal/middleware/README.md with correct payload path pattern
  • Clarify or rename walk(f) in jqSchemaFilter constant
  • Add drain loop after iter.Next() in applyJqSchema

Generated by Go Fan 🐹
Module summary saved to: specs/mods/gojq.md (in cache-memory)
Run: §23284330145

  • expires on Mar 26, 2026, 7:32 AM UTC