fix(frontmatter): use single quotes for tags to prevent NESTED_QUOTES errors #1217

Closed
garrytan-agents wants to merge 1 commit into garrytan:master from garrytan-agents:fix/frontmatter-nested-quotes

@garrytan-agents

Contributor

Problem

serializeFrontmatter() uses JSON.stringify(t) for tag values, producing:

tags: ["yc", "w2025"]

This is ambiguous YAML — the double quotes nest with the implicit string context, triggering NESTED_QUOTES validation errors. In a real brain (105K+ pages), this caused 6,981 errors — the single largest contributor to the frontmatter integrity health check.

Fix

Switch to single-quoted YAML flow values by default:

tags: ['yc', 'w2025']

Falls back to double quotes only when a tag itself contains an apostrophe (e.g. "Men's Fashion").
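
The quoting rule described in this Fix section can be sketched as follows. This is a minimal illustration, not the code in src/core/frontmatter-inference.ts: the helper names `quoteTag` and `serializeTags` are hypothetical, and the sketch assumes tags are plain strings.

```typescript
// Hypothetical helpers; the real serializeFrontmatter() is not shown in this PR.
function quoteTag(tag: string): string {
  // A single-quoted YAML scalar cannot hold a bare apostrophe,
  // so fall back to a JSON-style double-quoted scalar in that case.
  return tag.includes("'") ? JSON.stringify(tag) : `'${tag}'`;
}

function serializeTags(tags: string[]): string {
  return `tags: [${tags.map(quoteTag).join(", ")}]`;
}

console.log(serializeTags(["yc", "w2025"]));         // tags: ['yc', 'w2025']
console.log(serializeTags(["Men's Fashion", "yc"])); // tags: ["Men's Fashion", 'yc']
```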

Impact

  • One-line change in src/core/frontmatter-inference.ts
  • Prevents ~7K validation errors per brain that uses inferred frontmatter
  • Companion data fix already landed in garrytan/brain (11,819 files cleaned up)

… errors

JSON.stringify wraps tag values in double quotes, producing tags: ["yc", "w2025"]
which triggers NESTED_QUOTES validation errors (6,981 across a real brain).

Switch to single quotes by default, falling back to double quotes only when a
tag value itself contains an apostrophe (e.g. "Men's Fashion").

Before: tags: ["yc", "w2025"]  ← YAML sees nested double quotes
After:  tags: ['yc', 'w2025']    ← valid YAML flow sequence
@garrytan

Owner

Closing in favor of a validator-side fix landing in a follow-up PR. Outside-voice review (codex) caught that the bug is the validator at src/core/markdown.ts:219, not the emitter — every flagged tag line in the 6,981-error sample is already valid YAML; the validator's count-of-quotes heuristic is what's wrong (it flags valid YAML flow sequences like tags: ["yc", "w2025"] because 4 unescaped double quotes >= the threshold). Fixing the validator (try yaml.safeLoad on the value; only flag genuinely unparseable inputs) makes the existing data on disk pass instantly without rewriting any files. Thanks for the 6,981-error signal — that's what surfaced the underlying class. See PR follow-up landing shortly.

@garrytan closed this on May 20, 2026
garrytan added a commit that referenced this pull request May 21, 2026
…agging valid YAML) (#1229)

* fix(markdown): YAML-aware NESTED_QUOTES validator

The validator at src/core/markdown.ts:219-238 was a syntactic
count-of-quotes heuristic that flagged any frontmatter line with 3+
unescaped " characters. That heuristic is too dumb: valid YAML flow
sequences like `tags: ["yc", "w2025"]` and single-quoted scalars like
`title: 'a: "b" "c"'` both have 3+ unescaped " by design.
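
The false positives described above can be reproduced with a small sketch of a count-of-quotes check. The function names are hypothetical and the threshold of 3 is taken from the "3+" wording in this message; the actual code at src/core/markdown.ts:219-238 is not shown here and may differ.

```typescript
// Count double quotes that are not preceded by a backslash.
function countUnescapedDoubleQuotes(line: string): number {
  let count = 0;
  for (let i = 0; i < line.length; i++) {
    if (line[i] === '"' && line[i - 1] !== "\\") count++;
  }
  return count;
}

// Assumed threshold: flag any line with 3 or more unescaped double quotes.
function naiveNestedQuotes(line: string): boolean {
  return countUnescapedDoubleQuotes(line) >= 3;
}

console.log(naiveNestedQuotes('tags: ["yc", "w2025"]')); // true (false positive)
console.log(naiveNestedQuotes(`title: 'a: "b" "c"'`));   // true (false positive)
console.log(naiveNestedQuotes("tags: ['yc', 'w2025']")); // false
```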

Fix: keep the count fast path, then disambiguate with js-yaml.safeLoad
on the value. Only flag lines that genuinely fail to parse. The
full-frontmatter YAML_PARSE check (check 6) still catches structural
failures.
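
A rough sketch of the "fast path, then parse" shape described in this Fix paragraph follows. It is an illustration under assumptions, not the merged code: `hasNestedQuotes` and `YamlParser` are hypothetical names, and the parser is injected so the sketch has no dependencies. In the project that role is played by js-yaml's safeLoad; here `JSON.parse` stands in, which only works for JSON-compatible values.

```typescript
// Any function that throws on invalid input can act as the parser.
type YamlParser = (source: string) => unknown;

function hasNestedQuotes(line: string, parseYaml: YamlParser): boolean {
  // Fast path (assumed threshold of 3): too few quotes to be a problem.
  const quotes = (line.match(/(?<!\\)"/g) ?? []).length;
  if (quotes < 3) return false;

  // Slow path: isolate the value after the first colon and try to parse it.
  const colon = line.indexOf(":");
  const value = colon === -1 ? line : line.slice(colon + 1).trim();
  try {
    parseYaml(value);
    return false; // parses cleanly, so the quotes are legitimate
  } catch {
    return true; // genuinely unparseable
  }
}

console.log(hasNestedQuotes('tags: ["yc", "w2025"]', JSON.parse)); // false
console.log(hasNestedQuotes('title: "a "b" c"', JSON.parse));      // true
```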

Closes the 6,981-error class on Garry's 105K-page brain in one ~10
LOC change — existing data on disk was already valid YAML; the
validator was wrong about it. No `gbrain frontmatter generate --fix`
sweep needed.

js-yaml@3.14.2 promoted from transitive (via gray-matter) to direct
dependency. @types/js-yaml@3.12.10 added to devDependencies.

5 new YAML-aware test cases in test/markdown-validation.test.ts:
- flow sequence with quoted tags does NOT trigger (6,981 regression guard)
- single-quoted scalar with literal inner double quotes does NOT trigger
- escaped-as-'' quotes inside flow seq do NOT trigger
- genuinely broken nested quotes STILL trigger
- unclosed bracket STILL surfaces NESTED_QUOTES or YAML_PARSE

Closes PR #1217 — outside-voice (codex) review caught that the bug
was the validator, not the emitter. Original 6,981-error signal from
@garrytan-agents.

* chore: bump version and changelog (v0.40.0.1)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: retarget version slot v0.40.0.1 -> v0.37.5.0

Same content, different slot in the version queue. v0.40.0.1 was the
queue allocator's default safe slot (bumped past PR #1128's claimed
0.40.0.0). v0.37.5.0 is a PATCH above #1228's claimed 0.37.4.0 and
sits closer to current master (0.37.1.0) in CHANGELOG ordering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 21, 2026
…ays (#1252)

Aligns the auto-fix engine, the inferred-frontmatter serializer, and the
agent-facing skill on a single canonical YAML shape for tag arrays. v0.37.5.0
fixed the validator (it stopped flagging valid YAML); this release lines up
everything else with that fix.

Layer 1 (brain-writer.ts step 3a): allow-listed to `tags:` / `aliases:` keys.
Rewrites `tags: ["yc"]` to `tags: ['yc']`; apostrophe fallback for
`"Men's Fashion"`. Shares a NESTED_QUOTES dedup gate with the existing
step 3 so one file with both rewrites surfaces as one audit entry, not two.
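
The Layer 1 rewrite can be pictured with the sketch below. It is a simplified, hypothetical version (`canonicalizeFlowArray` is not a name from brain-writer.ts) that assumes one key per line, handles only JSON-style arrays of strings, and leaves out the audit and dedup gate.

```typescript
// Only these keys are rewritten; every other line passes through unchanged.
const ALLOWED_KEYS = new Set(["tags", "aliases"]);

function canonicalizeFlowArray(line: string): string {
  const match = /^(\w+):\s*(\[.*\])\s*$/.exec(line);
  const key = match?.[1] ?? "";
  const value = match?.[2] ?? "";
  if (!ALLOWED_KEYS.has(key)) return line; // no match or not allow-listed

  let items: unknown;
  try {
    items = JSON.parse(value); // only JSON-style arrays are rewritten
  } catch {
    return line; // already single-quoted or not JSON; leave untouched
  }
  if (!Array.isArray(items) || !items.every((t) => typeof t === "string")) {
    return line;
  }

  // Single quotes by default; double quotes only when the tag has an apostrophe.
  const quoted = (items as string[]).map((t) =>
    t.includes("'") ? JSON.stringify(t) : `'${t}'`,
  );
  return `${key}: [${quoted.join(", ")}]`;
}

console.log(canonicalizeFlowArray('tags: ["yc"]'));            // tags: ['yc']
console.log(canonicalizeFlowArray('tags: ["Men\'s Fashion"]')); // tags: ["Men's Fashion"]
console.log(canonicalizeFlowArray('title: ["yc"]'));           // title: ["yc"]
```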

Layer 4 (frontmatter-inference.ts): serializer emits the same canonical
single-quote form by default. Inferred frontmatter on import and `--fix`
output now match byte-for-byte.

Layer 5 (frontmatter-guard SKILL.md): new "Prevention" section showing
canonical vs JSON-style arrays + the JSON.stringify trap that produces
the non-canonical form. Future agent writes start canonical.

Parity test added to markdown-validation.test.ts pinning agreement between
per-value safeLoad parsing and gray-matter full-document parse on the
load-bearing inputs.

PR #1238's "Layer 3" (put_page auto-normalization) was dropped during
plan review: put_page parses YAML into typed fields and hashes them, so
single-quoted vs double-quoted arrays are functionally identical in
storage. The fix lives where the writes happen, not on the read path.

Source PRs absorbed: #1217 (closed, serializer fix) + #1238 (closed,
four-layer defense-in-depth narrowed to three layers). PR #1229 already
merged as v0.37.5.0.

Co-authored-by: garrytan-agents <garrytan-agents@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>