veralang.dev: close remaining Agent Score gaps (content negotiation, truncation budget, markdown parity) #525

@aallan

The site's Agent Score (tool at buildwithfern.com/agent-score; veralang.dev's report at buildwithfern.com/agent-score/company/veralang) rates documentation sites on how easily agents and LLMs can ingest their content. After the v0.0.119 homepage redesign (PR TBD) the score improved from 3 failures + 2 warnings to 2 failures + 1 warning: the llms-txt-directive, llms-txt-freshness, page-size-html, and page-size-markdown checks all cleared, and the markdown-content-parity miss rate fell from 90% to 21% (roughly a 4× improvement). Three issues remain, each requiring a different kind of fix:

1. content-negotiation - Accept: text/markdown ignored

The Agent Score spec (see buildwithfern.com/agent-score) expects GET / with Accept: text/markdown to return docs/index.md. GitHub Pages serves every static file by extension and cannot honour content negotiation. Candidates to fix:

  • Put a Cloudflare Worker (or equivalent edge rule) in front of veralang.dev that inspects the Accept header and rewrites / to /index.md when text/markdown is preferred. Lowest-effort, no infra change beyond the Worker.
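  • Move veralang.dev off GitHub Pages to a host that can branch on the Accept header (Cloudflare Pages with a Pages Function, Netlify with an Edge Function, or a self-hosted nginx with map rules). A static _redirects file on its own does not match on request headers, so each of these still needs some request-time logic. Larger footprint.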
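
Whichever option is chosen, the decision it has to make is small. A minimal sketch of that logic in Python, for illustration only (a Worker would be written in JavaScript, and the names parse_accept, prefers_markdown and negotiate_path are hypothetical, not from the repo). It assumes markdown should be served only when the client names text/markdown explicitly, so wildcard-only browser requests keep getting HTML:

```python
def parse_accept(header: str) -> dict[str, float]:
    """Map each media range in an Accept header to its q-value (default 1.0)."""
    weights: dict[str, float] = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        weights[media] = q
    return weights


def prefers_markdown(header: str) -> bool:
    """True when text/markdown is listed explicitly and ranks at least as high as text/html."""
    weights = parse_accept(header)
    markdown_q = weights.get("text/markdown", 0.0)
    html_q = weights.get("text/html", 0.0)
    return markdown_q > 0 and markdown_q >= html_q


def negotiate_path(path: str, accept: str) -> str:
    """Rewrite / to /index.md for markdown-preferring clients; leave other paths alone."""
    if path == "/" and prefers_markdown(accept):
        return "/index.md"
    return path
```

A response chosen this way should also carry Vary: Accept so that caches keep the HTML and markdown variants apart.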

2. content-start-position — inline <style> consumes agent truncation budget

afdocs reports documentation content starts at the 50% mark of the converted page. The cause is the inline <style> block (~15K chars) and JSON-LD <script> block that sit before the first <body> prose. Candidates:

  • Extract the CSS to docs/style.css and link it. Violates the briefing's single-file constraint for docs/index.html — the site is deliberately one hand-edited HTML file. Would need an explicit decision to relax that constraint.
  • Move the JSON-LD <script> block to the end of <body>. It is structured metadata that agents read separately via the DOM, so it doesn't need to sit before the content. Tiny win, worth doing.
  • Accept the warning as a deliberate trade-off and document it. The briefing's "page weight is a feature" + "no frameworks" + "single hand-written index.html" is a coherent stance that doesn't bend to an external scoring rubric.
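
To compare these options before and after a change, the start position can be approximated locally. The sketch below is an assumption-laden stand-in, not afdocs's algorithm: afdocs measures the converted page, whereas this measures the raw HTML, reporting how far into the file the first visible text inside <body> appears. FirstProseFinder and content_start_fraction are hypothetical names.

```python
from html.parser import HTMLParser
from typing import Optional


class FirstProseFinder(HTMLParser):
    """Records the character offset of the first visible text inside <body>."""

    SKIPPED = {"style", "script"}  # containers whose text is not prose

    def __init__(self, source: str) -> None:
        super().__init__()
        # Absolute offset of each line start, so getpos() can be converted.
        self._line_starts = [0]
        for line in source.split("\n")[:-1]:
            self._line_starts.append(self._line_starts[-1] + len(line) + 1)
        self._in_body = False
        self._skip_depth = 0
        self.first_prose_offset: Optional[int] = None
        self.feed(source)
        self.close()

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._in_body = True
        elif tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self.first_prose_offset is not None:
            return
        if self._in_body and not self._skip_depth and data.strip():
            lineno, col = self.getpos()  # position where this text node starts
            start = self._line_starts[lineno - 1] + col
            # Step past leading whitespace inside the text node.
            self.first_prose_offset = start + len(data) - len(data.lstrip())


def content_start_fraction(html: str) -> Optional[float]:
    """Fraction of the raw HTML that precedes the first body prose, or None if there is none."""
    offset = FirstProseFinder(html).first_prose_offset
    return None if offset is None else offset / len(html)
```

Running content_start_fraction over docs/index.html before and after moving the JSON-LD block gives a quick, if rough, sense of how much that change buys.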

3. markdown-content-parity — 21% of HTML content missing from /index.md

scripts/build_site.py's build_index_md() now mirrors the HTML section-for-section (thesis, code samples, VeraBench stat + table, runtime, install, For Agents). The remaining 21% is interface chrome that doesn't translate naturally to markdown:

  • CTA button labels (GitHub / Install / For Agents → SKILL.md)
  • The @reader.0 → humans / @reader.1 → agents reading-path strip
  • Eyebrow labels (@SECTION.01 · THE THESIS etc.)
  • Version badge, CI badge

Closing further means duplicating interface strings in prose, which risks making /index.md read like a website-description document rather than a standalone specification. A reasonable target is a short "Site interface" footer in /index.md that names the device but doesn't copy every button label. Needs a judgement call.
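
To see exactly which strings make up the remaining gap, a rough local diff can help. This is a sketch under stated assumptions, not afdocs's parity metric: it collects each visible text node from the HTML and reports those that do not appear verbatim (ignoring case and whitespace) in the markdown. TextSegments and missing_segments are hypothetical names.

```python
from html.parser import HTMLParser


class TextSegments(HTMLParser):
    """Collects whitespace-normalised visible text nodes, skipping <style> and <script>."""

    def __init__(self) -> None:
        super().__init__()
        self.segments: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())
        if text and not self._skip_depth:
            self.segments.append(text)


def missing_segments(html: str, markdown: str) -> list[str]:
    """Return the HTML text nodes that have no verbatim counterpart in the markdown."""
    parser = TextSegments()
    parser.feed(html)
    parser.close()
    haystack = " ".join(markdown.split()).lower()
    return [s for s in parser.segments if s.lower() not in haystack]
```

Called with the contents of docs/index.html and the generated index.md, the returned list should be dominated by the chrome listed above, which makes it easier to decide what a "Site interface" footer needs to name.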

Verification

Re-run afdocs locally after any fix:

cd docs && python3 -m http.server 8765 &
npx afdocs check http://localhost:8765 --fixes --verbose

Labels: documentation, enhancement
