
v0.41.7.0 feat: compact list-format resolver + 300-skill scaling tutorial#1407

Merged
garrytan merged 7 commits into master from garrytan/pr1370-production-ready
May 25, 2026
Conversation

@garrytan

Owner

Summary

Productionizes upstream PR #1370 (compact list-format resolver support in parseResolverEntries) plus ships the docs context that explains when and why an agent would use it. Closes the bug class where a 306-skill OpenClaw agent reported 238 FAIL errors on every gbrain doctor run because the parser only spoke the markdown-table dialect.

Parser: src/core/check-resolvable.ts:parseResolverEntries gains a second branch (alongside the existing table branch) that reads the compact OpenClaw-native list shape: - **gift-advisor**: gift idea | birthday gift | what should I bring. The branch is structured with an if/else-if shape so the existing continue on non-table rows no longer dead-codes the list path.

Contract: the kebab-lowercase name regex ([a-z][a-z0-9-]+) kills the prose-bullet false-positive class (- **Note**:, - **Convention**:, - **TODO**: are silently skipped, NOT parsed as fake skill rows). The path is always derived as skills/<name>/SKILL.md; an optional arrow suffix (Unicode arrow or ASCII ->) is stripped from the trigger string but not honored as the path, because the downstream consumers routing-eval.ts:skillSlugFromPath and the manifest check at check-resolvable.ts:367 both assume the convention.
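
For readers who want the contract in code form, here is a minimal sketch of how one compact list row could be parsed under the rules above. It is not the code in src/core/check-resolvable.ts. The names ResolverEntry, LIST_ROW, ARROW_SUFFIX and parseCompactListLine are hypothetical, the Unicode arrow is assumed to be U+2192, and the ellipsis filter is assumed to drop placeholder triggers written as "..." or "…".

```typescript
// Illustrative sketch only; none of these names come from the gbrain source.
interface ResolverEntry {
  trigger: string;
  skillPath: string;
}

// "- **name**: ..." or "- name: ...". The name must be kebab-lowercase, so
// bullets such as "- **Note**:" or "- **TODO**:" never match.
// \1 requires the closing ** only when the opening ** was present.
const LIST_ROW = /^\s*-\s+(\*\*)?([a-z][a-z0-9-]+)\1:\s*(.*)$/;

// Optional trailing "→ some/path" or "-> some/path" (arrow form is assumed).
const ARROW_SUFFIX = /\s*(?:→|->)\s*\S+\s*$/;

function parseCompactListLine(line: string): ResolverEntry[] {
  const match = LIST_ROW.exec(line);
  if (!match) return []; // prose bullet or unrelated line: silently skipped

  const name = match[2];
  // The suffix is stripped from the trigger string but never used as the path.
  const triggerText = match[3].replace(ARROW_SUFFIX, "");
  const skillPath = `skills/${name}/SKILL.md`;

  // One entry per trigger; empty pipe segments and ellipsis placeholders drop out.
  return triggerText
    .split("|")
    .map((segment) => segment.trim())
    .filter((segment) => segment.length > 0 && segment !== "..." && segment !== "…")
    .map((trigger) => ({ trigger, skillPath }));
}
```

Under these assumptions the gift-advisor row from the Parser paragraph yields three entries that share one skillPath, and a "- **Note**: ..." bullet yields none.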

Tests: 11 new unit cases in test/check-resolvable.test.ts covering bold + plain name shapes, multi-trigger fan-out, Unicode and ASCII arrow suffix strip, ellipsis filter, empty pipe segments, mixed-shape files, section tracking, and two D4 regression cases. 8 new integration cases in test/check-resolvable-openclaw-compact.test.ts across two fixtures: openclaw-compact-resolver/ (list-only, pins the 238 FAILs → 0 outcome AND the prose-bullet rejection) and openclaw-mixed-merge/ (table + parent AGENTS.md, pins the v0.31.7 D-CX-14 multi-resolver merge case).

Docs: docs/guides/scaling-skills.md is a new tutorial covering the three-tier scaling architecture (always-loaded, resolver-routed, dormant) with real numbers from Garry's 306-skill OpenClaw (25K tokens of skill descriptions per turn collapsed to 4K tokens, ~21K tokens freed per turn, zero capability loss). Registered in scripts/llms-config.ts; docs/UPGRADING_DOWNSTREAM_AGENTS.md excluded from the inlined bundle to stay under the 600KB FULL_SIZE_BUDGET. CLAUDE.md's check-resolvable.ts annotation extended with the v0.41.7.0 paragraph.
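
The per-turn saving quoted above is plain subtraction. As a quick check of the arithmetic, here is a sketch using a hypothetical helper, tokenSavings, that is not part of the repo. The inputs are the rounded figures from this description.

```typescript
// Illustrative helper, not part of gbrain.
interface TokenSavings {
  freed: number;   // tokens no longer sent on every turn
  percent: number; // share of the original budget, rounded to a whole percent
}

function tokenSavings(before: number, after: number): TokenSavings {
  const freed = before - after;
  return { freed, percent: Math.round((freed / before) * 100) };
}

// Rounded figures from the PR description: 25K before, 4K after.
const openclawSavings = tokenSavings(25_000, 4_000);
console.log(openclawSavings.freed, openclawSavings.percent);
```

With the rounded inputs this gives 21,000 tokens freed per turn, roughly 84% of the original skill-description budget.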

Replaces #1370. Credit @garrytan-agents for the original PR submission that flagged the gap.

Test Coverage

11 new unit cases + 8 new integration cases. CLI smoke test via gbrain check-resolvable --json --skills-dir test/fixtures/openclaw-compact-resolver/skills returns ok: true with 10/10 reachable, 0 errors, 0 warnings. Same smoke test against the mixed-merge fixture returns ok: true with 8/8 reachable.

[+] src/core/check-resolvable.ts
  └── parseResolverEntries()
      ├── heading match                            [★★ TESTED — existing]
      ├── table branch (UNCHANGED)                 [★★★ TESTED — existing]
      └── list branch (NEW)
          ├── bold name single + multi-trigger     [★★★ NEW]
          ├── plain name fallback                  [★★ NEW]
          ├── path suffix (Unicode + ASCII) strip  [★★★ NEW]
          ├── ellipsis filter                      [★★ NEW]
          ├── empty-trigger drop                   [★★ NEW]
          ├── section tracking across list rows    [★★ NEW]
          ├── mixed table + list in one file       [★★ NEW]
          └── prose-bullet false-positive guard    [★★★ NEW — D4 regression]

[+] test/check-resolvable-openclaw-compact.test.ts (NEW)
  ├── compact fixture: 238 FAILs → 0              [★★★ NEW — bisect anchor]
  ├── prose-bullet integration regression          [★★★ NEW — D4 integration]
  └── mixed-merge: table + parent AGENTS.md        [★★★ NEW — D-CX-14]

COVERAGE: 100% of new code paths tested.

Pre-Landing Review

Cleared via /plan-eng-review before implementation. A Codex outside-voice pass on the plan surfaced 12 findings: 6 fed into decisions D3-D6 (D1 reversed: drop explicit-path capture; D4 added: kebab-lowercase name regex; D5: fixture upgrade with valid frontmatter + prose-bullet section + mixed-merge sibling; D6: full E2E lifecycle), 3 became mechanical fixes (F1 parser restructure, F10 llms-config registration, F11 em-dash phrasing), 2 were filed as TODOs (F8 path-traversal in the pre-existing table-format parser, F9 fan-out/dedup doc paragraph), and 1 was mooted (F4, by the D3 walkback). No findings remain open against this commit.

Plan Completion

All 12 implementation tasks from the plan are accounted for (T11 is a follow-up once this PR lands):

  • T1: parser rewrite with if/else-if shape + kebab-lowercase regex + dual-arrow strip ✓
  • T2: 11 new unit cases ✓
  • T3: openclaw-compact-resolver fixture with prose-bullet section ✓
  • T4: openclaw-mixed-merge fixture (table + parent AGENTS.md) ✓
  • T5: regression test with strict error+warning assertions ✓
  • T6: scaling-skills.md tutorial with privacy/voice cleanup ✓
  • T7: llms-config registration + size-budget rebalance ✓
  • T8: tutorials README link ✓
  • T9: VERSION 0.41.7.0 + CHANGELOG ELI10 entry ✓
  • T10: bun run verify + bun run test
  • T11: close upstream PR feat: parseResolverEntries supports list-based resolver format #1370 (follow-up after this PR lands)
  • T12: F8 + F9 TODOs filed in TODOS.md ✓

TODOS

  • Added: v0.41.7.0 resolver-parser follow-ups block in TODOS.md with two Codex P3 deferrals (F8 path-traversal hardening for the existing table parser, F9 fan-out/dedup doc paragraph in scaling-skills.md) plus a P1 entry for an audit-writer week-boundary test flake caught during ship.

Documentation

  • CLAUDE.md — extended the src/core/check-resolvable.ts annotation with a v0.41.7.0 paragraph covering the compact list format spec, the kebab-lowercase name gate, the path-suffix-stripped-not-honored contract, multi-trigger fan-out + downstream dedup semantics, the 238 FAILs → 0 OpenClaw headline, the two integration fixtures, and the new scaling-skills tutorial cross-reference.
  • llms-full.txt + llms.txt — regenerated via bun run build:llms to absorb the CLAUDE.md edit and pick up the new guide entry.
  • CHANGELOG.md — v0.41.7.0 entry already covers the change with ELI10 lead + before/after table + things-to-watch + itemized changes + contributor credit.
  • TODOS.md — v0.41.7.0 follow-ups already filed.
  • VERSION trio — aligned at 0.41.7.0 across VERSION / package.json / topmost CHANGELOG header.
  • docs/tutorials/README.md — one-line "Related documentation" link to the new guide.

Test plan

  • bun run verify (typecheck + 16 pre-checks) — clean
  • bun test parallel fast loop — 10391 pass / 1 pre-existing flake (audit-writer.test.ts week-boundary failure when real UTC date crosses ISO week boundary mid-test; not my code; filed as P1 TODO in TODOS.md with a one-line refactor recommendation for createAuditWriter.log())
  • bun test test/check-resolvable.test.ts test/check-resolvable-openclaw-compact.test.ts test/build-llms.test.ts test/resolver-merge.test.ts test/resolver.test.ts test/resolvers.test.ts test/routing-eval.test.ts — 248 pass / 0 fail
  • gbrain check-resolvable --json against both new fixtures — ok: true, 10/10 reachable + 8/8 reachable
  • gbrain check-resolvable --json against the real repo skills/ directory returns ok: true, no regression on table format

🤖 Generated with Claude Code

garrytan and others added 5 commits May 24, 2026 22:44
Add the second parser branch alongside the existing markdown-table branch
so RESOLVER.md and AGENTS.md can use the OpenClaw-native list shape:

    - **skill-name**: trigger1 | trigger2 | trigger3
    - skill-name: trigger1 | trigger2

Constraints:
  - Skill names must be kebab-lowercase ([a-z][a-z0-9-]+). Bold names
    starting with an uppercase letter (e.g. **Note**, **Convention**)
    are deliberately skipped so prose bullets in real-world AGENTS.md
    files don't get mis-parsed as fake skill rows.
  - skillPath is always derived as skills/<name>/SKILL.md. An optional
    arrow suffix (Unicode arrow or ASCII ->) is stripped from the trigger
    string but NOT honored as a path. Downstream consumers
    (routing-eval.ts skillSlugFromPath, the manifest check at line 367)
    assume the convention. For non-conventional paths, use the table
    format.
  - Multiple triggers fan out to one entry per trigger. checkResolvable
    dedupes by skillPath downstream, so the reachability count counts
    each skill once regardless of trigger fan-out.

The parser body is restructured to an if/else-if shape so the existing
'continue' on non-table rows no longer short-circuits the list branch.
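
The restructuring described above can be sketched as follows. This is a simplified illustration, not the real parseResolverEntries body. The names ClassifiedRow and classifyResolverRows are hypothetical, and the table branch uses a shortcut (it keeps only rows that mention SKILL.md) in place of real header and separator handling.

```typescript
// Illustrative sketch of the control flow only; not the gbrain implementation.
interface ClassifiedRow {
  section: string;
  kind: "table" | "list";
  text: string;
}

function classifyResolverRows(markdown: string): ClassifiedRow[] {
  const rows: ClassifiedRow[] = [];
  let section = "";

  for (const raw of markdown.split("\n")) {
    const line = raw.trim();

    // Heading match: remember the current section for the rows that follow.
    const heading = /^#{1,6}\s+(.*)$/.exec(line);
    if (heading) {
      section = heading[1];
      continue;
    }

    // The earlier shape bailed out here with `if (!line.startsWith("|")) continue;`,
    // which made any list handling placed below it unreachable.
    if (line.startsWith("|")) {
      // Table branch (assumption: a data row is one that mentions SKILL.md).
      if (line.includes("SKILL.md")) {
        rows.push({ section, kind: "table", text: line });
      }
    } else if (/^-\s+(\*\*)?[a-z][a-z0-9-]+\1:/.test(line)) {
      // List branch: kebab-lowercase names only, so prose bullets are ignored.
      rows.push({ section, kind: "list", text: line });
    }
    // Any other line falls through and is ignored.
  }
  return rows;
}
```

Because both shapes live in one if/else-if chain, a file that mixes a table section with a list section yields rows from both, each tagged with the heading it appeared under.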

Unit tests cover 11 new cases: bold + plain name shapes, multi-trigger
fan-out, Unicode and ASCII path-suffix strip, ellipsis filter, empty
pipe segments, mixed-shape files, section tracking, and two D4
regression cases (prose-bullet rejection + convention-violation
silent-skip).

Closes #1370 — credit @garrytan-agents for the original PR that flagged
the parser gap.
…ompact format

Two fixtures pin the v0.41.7.0 parser fix at the integration layer:

  test/fixtures/openclaw-compact-resolver/
    List-format only RESOLVER.md with 10 fictional skills (gift-advisor,
    flight-tracker, email-triage, etc.), each with valid frontmatter
    triggers. A trailing 'Notes' section embeds 4 prose bullets
    (- **Note**:, - **Convention**:, - **TODO**:, - **Important**:)
    that pin the D4 kebab-lowercase regex tightening: if the regex ever
    regresses to permissive [\w-]+, those prose bullets would surface
    as orphan_trigger warnings and the test fails loudly.

  test/fixtures/openclaw-mixed-merge/
    Tests the v0.31.7 D-CX-14 multi-resolver merge: workspace-root
    AGENTS.md (compact list, 3 skills) + skills/RESOLVER.md (table
    format, 5 skills). The merge dedups by skillPath and counts each
    skill once.

The regression test (test/check-resolvable-openclaw-compact.test.ts)
runs 8 assertions across both fixtures:

  1. unreachable === 0 on the compact fixture (the 'pre-v0.41.7.0
     reported 238 FAILs on a 306-skill OpenClaw, post-fix 0' headline).
  2. zero error-severity issues; report.ok === true.
  3. zero mece_gap warnings (every stub ships valid triggers).
  4. zero orphan_trigger warnings for the 4 prose-bullet names — D4
     regex regression guard at integration level.
  5. zero missing_file warnings.
  6. mixed-merge: total_skills === 8 (5 table + 3 list), all reachable.
  7. mixed-merge: errors.length === 0; report.ok === true.
  8. mixed-merge: each expected skill from BOTH shapes is non-unreachable
     (catches the bug where one shape silently swallows the other via
     dedup-by-skillPath).
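
The counting behavior that assertions 6 to 8 pin can be illustrated with a small sketch. It assumes the merge reduces to a set keyed on skillPath. The helper name uniqueSkillPaths and the sample entries are hypothetical, and the real checkResolvable logic may differ.

```typescript
// Illustrative sketch; not the gbrain merge code.
interface ResolverEntry {
  trigger: string;
  skillPath: string;
}

// Merge entries from several resolver files and count each skill once,
// no matter how many triggers or files mention it.
function uniqueSkillPaths(...sources: ResolverEntry[][]): string[] {
  const seen = new Set<string>();
  for (const entries of sources) {
    for (const entry of entries) {
      seen.add(entry.skillPath);
    }
  }
  return Array.from(seen).sort();
}

// Hypothetical data: a list-format file with trigger fan-out, plus a table file.
const fromList: ResolverEntry[] = [
  { trigger: "gift idea", skillPath: "skills/gift-advisor/SKILL.md" },
  { trigger: "birthday gift", skillPath: "skills/gift-advisor/SKILL.md" },
  { trigger: "track flight", skillPath: "skills/flight-tracker/SKILL.md" },
];
const fromTable: ResolverEntry[] = [
  { trigger: "plan a trip", skillPath: "skills/trip-planner/SKILL.md" },
];

console.log(uniqueSkillPaths(fromList, fromTable).length);
```

Four entries across two files collapse to three distinct skills here, which is the same shape as the 5 table + 3 list = 8 expectation in the mixed-merge fixture.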
Three-tier architecture for agents that have outgrown the always-loaded
skill manifest:

  Tier A — always loaded (~35 skills, in the system prompt every turn)
  Tier B — resolver-routed (~85 skills, looked up via RESOLVER.md/AGENTS.md
            only when no Tier A match)
  Tier C — dormant (~180 skills, on disk but not injected into the prompt)

Real numbers from Garry's 306-skill OpenClaw: 25K tokens of skill
descriptions per turn collapsed to 4K tokens (~21K tokens freed per
turn) with zero capability loss. The compact list-format resolver
(v0.41.7.0) is the parser-level enabler for this pattern.

The guide covers:

  - The scaling wall (when the always-loaded manifest stops working)
  - The three tiers + per-turn token math
  - What the resolver actually does (routing-table-but-cheaper pattern)
  - The compact list format (kebab-lowercase contract, optional path
    suffix, mixed-shape support)
  - The 'gbrain doctor' / 'gbrain check-resolvable --strict' safety net
  - Implementation walkthrough (audit → tier → disable → resolver →
    doctor)
  - The scaling curve (50 → 100 → 200 → 300 → 1000, no ceiling)

Voice + privacy cleanup applied per CLAUDE.md rules:
  - Wintermute → 'Garry's OpenClaw' / 'your OpenClaw'
  - Unicode em dashes stripped; ASCII '--' preserved in command flags
  - Made-up 'check_resolvable' invocation replaced with real
    'gbrain doctor' and 'gbrain check-resolvable --json'/'--strict'
  - Blog-style 'Previous in this series' footer dropped

Wiring:
  - scripts/llms-config.ts registers the new guide in the curated
    array so 'bun run build:llms' picks it up. docs/UPGRADING_
    DOWNSTREAM_AGENTS.md excluded from the inlined bundle to stay
    under the 600KB FULL_SIZE_BUDGET after adding the new content.
  - docs/tutorials/README.md gains a one-line entry pointing at the
    guide under Related documentation.
  - llms.txt + llms-full.txt regenerated.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Annotate the src/core/check-resolvable.ts entry with the v0.41.7.0
parseResolverEntries compact list-format support: kebab-lowercase name
gate (closes the prose-bullet false-positive class), path-suffix strip
contract (skillPath always derived as skills/<name>/SKILL.md so
routing-eval and the manifest check don't drift), multi-trigger fan-out
plus checkResolvable downstream dedupe, the 238 FAILs to 0 OpenClaw
headline, the two integration fixtures pinning the regression, and the
docs/guides/scaling-skills.md pointer for the tutorial context.

Regenerate llms-full.txt to match (CLAUDE.md edit chaser, per
CLAUDE.md's own rule about test/build-llms.test.ts catching drift).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added 2 commits May 25, 2026 12:50
…duction-ready

# Conflicts:
#	CHANGELOG.md
#	TODOS.md
#	VERSION
#	package.json
…duction-ready

# Conflicts:
#	CHANGELOG.md
#	VERSION
#	package.json
@garrytan garrytan merged commit 374deff into master May 25, 2026
15 checks passed
garrytan added a commit that referenced this pull request May 25, 2026
Master advanced past v0.41.6.0:
- v0.41.7.0: compact list-format resolver + 300-skill scaling tutorial (#1407)

Resolved VERSION + package.json + CHANGELOG conflicts. v0.41.9.0 still
holds. Auto-merge took master's expanded `includeInFull: false` exclusions
in scripts/llms-config.ts (the schema docs, ZE provider walkthrough,
llama-server reranker doc, UPGRADING_DOWNSTREAM_AGENTS, CHANGELOG) which
brings llms-full.txt down to 590KB. Combined with our v0.41.9.0 700KB
budget bump that's now 110KB of headroom (belt + suspenders).

Regenerated llms-full.txt (590,324 bytes — under both new + old budgets).

3-line audit: VERSION + package.json + CHANGELOG all agree on 0.41.9.0.
Verify clean: all 21 checks green; check-test-isolation OK (692 files
scanned); build-llms tests 7/7 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mgunnin added a commit to mgunnin/gbrain that referenced this pull request May 28, 2026
* upstream/master:
  v0.41.10.1 fix-wave: dream.* config + batch retry + extract_atoms idempotency + ze-switch env-gate (garrytan#1445)
  v0.41.10.0 feat: orphan reduction via --by-mention + UTF-16 surrogate-pair fix (garrytan#1442)
  v0.41.9.0 — UX/reliability fix wave (5 defects from production report) (garrytan#1440)
  v0.41.8.0 fix(pglite): search/query/get exit cleanly + garrytan#1340 hint + garrytan#1342 breadcrumbs (garrytan#1405)
  v0.41.7.0 feat: compact list-format resolver + 300-skill scaling tutorial (garrytan#1407)
  v0.41.6.0 feat(ci): CI test speedup — 23min → ~9min via matrix 4→6 + weight-aware sharding + auto SHA cache + parallel verify (garrytan#1444)
  v0.41.5.0 fix-wave: warm-narwhal — 6 community PRs + E2E reliability (garrytan#1374)

# Conflicts:
#	src/core/ai/recipes/openai.ts
garrytan-agents pushed a commit to garrytan-agents/gbrain that referenced this pull request Jun 13, 2026
…rial (garrytan#1407)

* feat(check-resolvable): parseResolverEntries accepts compact list format

Add the second parser branch alongside the existing markdown-table branch
so RESOLVER.md and AGENTS.md can use the OpenClaw-native list shape:

    - **skill-name**: trigger1 | trigger2 | trigger3
    - skill-name: trigger1 | trigger2

Constraints:
  - Skill names must be kebab-lowercase ([a-z][a-z0-9-]+). Bold names
    starting with an uppercase letter (e.g. **Note**, **Convention**)
    are deliberately skipped so prose bullets in real-world AGENTS.md
    files don't get mis-parsed as fake skill rows.
  - skillPath is always derived as skills/<name>/SKILL.md. An optional
    arrow suffix (Unicode arrow or ASCII ->) is stripped from the trigger
    string but NOT honored as a path. Downstream consumers
    (routing-eval.ts skillSlugFromPath, the manifest check at line 367)
    assume the convention. For non-conventional paths, use the table
    format.
  - Multiple triggers fan out to one entry per trigger. checkResolvable
    dedupes by skillPath downstream, so the reachability count counts
    each skill once regardless of trigger fan-out.

The parser body is restructured to an if/else-if shape so the existing
'continue' on non-table rows no longer short-circuits the list branch.

Unit tests cover 11 new cases: bold + plain name shapes, multi-trigger
fan-out, Unicode and ASCII path-suffix strip, ellipsis filter, empty
pipe segments, mixed-shape files, section tracking, and two D4
regression cases (prose-bullet rejection + convention-violation
silent-skip).

Closes garrytan#1370 — credit @garrytan-agents for the original PR that flagged
the parser gap.

* test(check-resolvable): integration fixtures + regression suite for compact format

Two fixtures pin the v0.41.7.0 parser fix at the integration layer:

  test/fixtures/openclaw-compact-resolver/
    List-format only RESOLVER.md with 10 fictional skills (gift-advisor,
    flight-tracker, email-triage, etc.), each with valid frontmatter
    triggers. A trailing 'Notes' section embeds 4 prose bullets
    (- **Note**:, - **Convention**:, - **TODO**:, - **Important**:)
    that pin the D4 kebab-lowercase regex tightening: if the regex ever
    regresses to permissive [\w-]+, those prose bullets would surface
    as orphan_trigger warnings and the test fails loudly.

  test/fixtures/openclaw-mixed-merge/
    Tests the v0.31.7 D-CX-14 multi-resolver merge: workspace-root
    AGENTS.md (compact list, 3 skills) + skills/RESOLVER.md (table
    format, 5 skills). The merge dedups by skillPath and counts each
    skill once.

The regression test (test/check-resolvable-openclaw-compact.test.ts)
runs 8 assertions across both fixtures:

  1. unreachable === 0 on the compact fixture (the 'pre-v0.41.7.0
     reported 238 FAILs on a 306-skill OpenClaw, post-fix 0' headline).
  2. zero error-severity issues; report.ok === true.
  3. zero mece_gap warnings (every stub ships valid triggers).
  4. zero orphan_trigger warnings for the 4 prose-bullet names — D4
     regex regression guard at integration level.
  5. zero missing_file warnings.
  6. mixed-merge: total_skills === 8 (5 table + 3 list), all reachable.
  7. mixed-merge: errors.length === 0; report.ok === true.
  8. mixed-merge: each expected skill from BOTH shapes is non-unreachable
     (catches the bug where one shape silently swallows the other via
     dedup-by-skillPath).

* docs(guides): scaling-skills.md walkthrough for 300-skill agents

Three-tier architecture for agents that have outgrown the always-loaded
skill manifest:

  Tier A — always loaded (~35 skills, in the system prompt every turn)
  Tier B — resolver-routed (~85 skills, looked up via RESOLVER.md/AGENTS.md
            only when no Tier A match)
  Tier C — dormant (~180 skills, on disk but not injected into the prompt)

Real numbers from Garry's 306-skill OpenClaw: 25K tokens of skill
descriptions per turn collapsed to 4K tokens (~21K tokens freed per
turn) with zero capability loss. The compact list-format resolver
(v0.41.7.0) is the parser-level enabler for this pattern.

The guide covers:

  - The scaling wall (when the always-loaded manifest stops working)
  - The three tiers + per-turn token math
  - What the resolver actually does (routing-table-but-cheaper pattern)
  - The compact list format (kebab-lowercase contract, optional path
    suffix, mixed-shape support)
  - The 'gbrain doctor' / 'gbrain check-resolvable --strict' safety net
  - Implementation walkthrough (audit → tier → disable → resolver →
    doctor)
  - The scaling curve (50 → 100 → 200 → 300 → 1000, no ceiling)

Voice + privacy cleanup applied per CLAUDE.md rules:
  - Wintermute → 'Garry's OpenClaw' / 'your OpenClaw'
  - Unicode em dashes stripped; ASCII '--' preserved in command flags
  - Made-up 'check_resolvable' invocation replaced with real
    'gbrain doctor' and 'gbrain check-resolvable --json'/'--strict'
  - Blog-style 'Previous in this series' footer dropped

Wiring:
  - scripts/llms-config.ts registers the new guide in the curated
    array so 'bun run build:llms' picks it up. docs/UPGRADING_
    DOWNSTREAM_AGENTS.md excluded from the inlined bundle to stay
    under the 600KB FULL_SIZE_BUDGET after adding the new content.
  - docs/tutorials/README.md gains a one-line entry pointing at the
    guide under Related documentation.
  - llms.txt + llms-full.txt regenerated.

* chore: bump version and changelog (v0.41.7.0)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: update CLAUDE.md for v0.41.7.0 compact-format resolver

Annotate the src/core/check-resolvable.ts entry with the v0.41.7.0
parseResolverEntries compact list-format support: kebab-lowercase name
gate (closes the prose-bullet false-positive class), path-suffix strip
contract (skillPath always derived as skills/<name>/SKILL.md so
routing-eval and the manifest check don't drift), multi-trigger fan-out
plus checkResolvable downstream dedupe, the 238 FAILs to 0 OpenClaw
headline, the two integration fixtures pinning the regression, and the
docs/guides/scaling-skills.md pointer for the tutorial context.

Regenerate llms-full.txt to match (CLAUDE.md edit chaser, per
CLAUDE.md's own rule about test/build-llms.test.ts catching drift).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>