Skip to content

[Bug] validate-issue-structure body extractor truncates --body "..." at the first embedded double-quote #227

@Ref34t

Description

@Ref34t

Given / When / Then

Given an agent files an issue via:

gh issue create --repo owner/repo \
  --title "[Chore] …" \
  --body "$(cat <<'EOF'
## Driver
Some prose.

## Scope
- A bullet describing an admin notice ("Configure X to unlock Y")
- Another bullet mentioning the post is marked "needs retry"

## Acceptance Criteria
- [ ] Some criterion
- [ ] Another criterion
EOF
)"

…where the body contains one or more embedded " characters inside the value (admin-notice strings, status labels in quotes, prose like "current state" etc.).

When validate-issue-structure.sh runs its extract_flag_value '--body|-b' "$COMMAND" against the literal command string — using the awk regex (--body|-b)[[:space:]]+"([^"]*)" (hook file line 71) —

Then the extractor terminates the match at the FIRST embedded " inside the body value rather than at the closing " of the --body argument. The truncated body is missing any ## heading that occurs past the truncation point. The hook then exits 2 with missing section: ## Acceptance Criteria (or whichever section follows the truncation), even though the section is present in the real body that gh will receive.

The error is a false positivegh sees the full body (shell does the quoting correctly via the $(cat <<'EOF' ... EOF) substitution); only the hook's static-string regex sees the truncation.

Repro

  1. Use any apexyard fork at framework version that ships .claude/hooks/validate-issue-structure.sh (current me2resh/apexyard:main).

  2. Run:

    gh issue create --repo owner/repo \
      --title "[Chore] Embedded-quote repro" \
      --body "$(cat <<'EOF'
    ## Driver
    Test of the body extractor.
    
    ## Scope
    - Mention an admin notice that says "do the thing"
    
    ## Acceptance Criteria
    - [ ] Whatever
    EOF
    )"
  3. Expected: hook validates the body (which contains all three required sections), exits 0, gh files the issue.

  4. Actual: hook exits 2 with:

    BLOCKED: issue body for '[Chore]' does not match the required schema.
    
    Problems:
      - missing section: ## Acceptance Criteria
    

    The ## Scope heading already lives past the first embedded ", so the extractor returns a buffer that contains only ## Driver plus a few words of ## Scope. Sections after the first " are invisible to the hook.

Workaround (current)

Use --body-file <path> instead of --body "…". The hook's --body-file branch (lines 103–123) reads the file via cat, so the awk extraction issue doesn't apply. Validation still runs correctly against the file content. Confirmed working on a real 12-ticket batch.

Affected files

  • .claude/hooks/validate-issue-structure.sh (the primary site)
  • .claude/hooks/block-private-refs-in-public-repos.sh — comment at line 49 of validate-issue-structure.sh says the extractor was "lifted from block-private-refs-in-public-repos.sh", so it is likely affected by the same truncation when scanning bodies that contain embedded ". Worth checking — a quote-truncated body could let private refs in the back half of the body slip past the leak gate.
  • .claude/hooks/verify-commit-refs.sh and validate-pr-create.sh — same pattern, worth a quick audit even if they happen not to use the double-quoted shape today.

Suggested fix direction

(Not prescriptive — leaving the implementation to the agent that picks this up.)

Three plausible approaches, increasing in ambition:

  1. Greedy double-quoted match: change the regex from "([^"]*)" to a greedy form that anchors on the last " before a known terminator (whitespace + next --flag or end-of-string). Cheaper, still string-based, still wrong on adversarial inputs.
  2. Skip the static-string regex entirely: rely on --body-file being the canonical path and prompt agents toward it in skill docs. Doesn't fix the case where the agent uses --body inline.
  3. Shell-aware parsing: invoke bash -c 'printf "%s\n" "$1" "$2" …' style with the same argv gh would see, parsing argv server-side rather than the raw command string. Most robust, biggest change. Probably overkill for this hook.

Related

  • The hook's docstring (lines 36–37) lists "Section matching is case-insensitive, whitespace-tolerant" — the case-insensitive matching is fine; the problem is upstream of section matching, in body extraction.
  • The hook is silent-success on --body-file (which is correct, no regression there).

Repro confidence

High — hit on a 12-ticket batch this session, every ticket whose body contained an embedded " failed identically. Tickets whose bodies had no embedded " passed. Switched to --body-file and all subsequent tickets passed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions