Agent = Model + Harness

    The linter for your agent harness.

    Your AGENTS.md, CLAUDE.md, CI, and hooks are the agent harness. AgentLint lints it — for Claude Code, Cursor, and Codex.

    Audits CLAUDE.md, AGENTS.md, .cursor/rules, .github/copilot-instructions.md, plus your CI workflows, pre-commit hooks, and .gitignore — every check cites a primary source.

    $ npm install -g agentlint-ai
    Read the Docs →
    claude-code — /al
    $ /al
    AgentLint — Score: 68/100
    Findability    ██████████████░░░░░░   7/10
    Instructions   ████████████████░░░░   8/10
    Workability    ████████████░░░░░░░░   6/10
    Safety         ██████████░░░░░░░░░░   5/10
    Continuity     ██████████████░░░░░░   7/10
    Harness        ████████████████████  10/10
    Deep           ░░░░░░░░░░░░░░░░░░░░   0/10
    Session        ████████████████████  10/10
    Fix Plan (7 items):
      [guided]   Pin 8 GitHub Actions to SHA (supply chain risk)
      [guided]   Add .env to .gitignore (AI exposes secrets)
      [assisted] Generate HANDOFF.md
      [guided]   Reduce IMPORTANT keywords (7 found, Anthropic uses 4)
    › Select items → AgentLint fixes → re-scores → saves HTML report

    Works with every major AI coding agent

    Claude Code
    OpenAI Codex
    Cursor
    GitHub Copilot
    Google Gemini CLI
    Windsurf
    Cline
    01 / The harness problem

    Agent = Model + Harness.
    The model isn't the bottleneck anymore.

    In February 2026, Mitchell Hashimoto (HashiCorp) coined harness engineering. Days later, OpenAI formalized it. LangChain gave the cleanest definition: if you're not the model, you're the harness — and the harness is where the real bottleneck lives.

    70%
    Of your AI agent's performance lives outside the model — in the harness. Same weights, different harness, different results.
    LANGCHAIN · FEB 2026
    40%
    Fewer “bad suggestion” sessions on teams maintaining a good context file at the top of their harness.
    ANTHROPIC · 2026 TRENDS REPORT
    5/8
    Scenarios where auto-generated context files actually reduced agent success rates. Inference cost: +20–23%.
    ETH ZURICH · 2025
    60K+
    OSS repos that now ship an AGENTS.md. OpenAI donated the standard to the Linux Foundation in Dec 2025.
    AGENTS.MD · LINUX FOUNDATION
    What's in your harness?

    For a coding agent, the harness isn't a framework — it's already in your repo. AGENTS.md and CLAUDE.md are the persistent rules. .cursor/rules and copilot-instructions.md are the tool-specific layers. CI, pre-commit hooks, and .gitignore are the deterministic constraints the agent cannot override. That's the harness. That's what AgentLint audits.

    02 / Why AgentLint

    The first linter for the harness itself.

    Other tools generate templates. Other tools score prompts. AgentLint is the only one that measures the health of your whole harness — and every check cites a primary source, not a blog post.

    01

    Built on 265 versions of Anthropic's own prompt

    We tracked every word Anthropic added, deleted, and rewrote in Claude Code's system prompt. When they cut IMPORTANT from 12 uses to 4, we knew. Now your harness can follow the same rules.

    02

    The harness hard limits nobody told you about

    40,000-character entry files get silently truncated. 256 KB files can't be read. Pre-commit hooks that take too long hang your session forever — Claude Code never uses --no-verify. We read the source so you don't have to.
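    The entry-file limit is easy to spot-check by hand. A minimal sketch, assuming CLAUDE.md and AGENTS.md sit at the repo root and using byte count (`wc -c`) as a rough proxy for characters; the helper name is made up for illustration and is not part of AgentLint:

    ```shell
    # Hypothetical helper: warn when an entry file is past the 40,000-char
    # point at which Claude Code silently truncates it. Byte count via wc -c
    # is used as a rough proxy for character count.
    check_entry_file() {
      f="$1"
      [ -f "$f" ] || return 0
      bytes=$(wc -c < "$f")
      if [ "$bytes" -gt 40000 ]; then
        echo "$f: $bytes bytes (past the 40,000-char truncation point)"
      fi
    }

    check_entry_file CLAUDE.md
    check_entry_file AGENTS.md
    ```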

    03

    Scans. Scores. Auto-fixes.

    One command. AgentLint scans your repo, scores the harness across 5 dimensions, shows exactly what's broken, and fixes what it can — pinning Actions to SHAs, adding .env to gitignore, generating handoffs.
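    As a hedged illustration of what one of those fixes targets, this shell sketch lists workflow steps that reference an action by tag rather than a full 40-character commit SHA. The function name and regex are assumptions for illustration, not AgentLint's actual implementation:

    ```shell
    # Illustrative only: find `uses:` references in GitHub workflow files
    # that are not pinned to a full 40-hex-character commit SHA.
    find_unpinned_actions() {
      grep -RnoE 'uses:[[:space:]]*[^@[:space:]]+@[A-Za-z0-9._/-]+' .github/workflows/ 2>/dev/null \
        | grep -vE '@[0-9a-f]{40}$'
    }

    find_unpinned_actions || echo "all actions pinned (or no workflows found)"
    ```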

    03 / The 33 checks

    Five dimensions.
    Weighted by how much they move the needle.

    Scores are measurements, not judgments. Reference values come from Anthropic's own data. You decide what to fix.

    WEIGHT · 20%
    Findability
    9 checks
    Can AI find what it needs? Entry files, indexes, resolvable references, standard naming.
    WEIGHT · 30%
    Instructions
    8 checks
    Are your rules well-written? Emphasis density, specificity, file length, the 40K hard limit.
    WEIGHT · 20%
    Workability
    11 checks
    Can AI build and test? Commands, CI, real tests, linter, the 256 KB file ceiling, hook speed.
    WEIGHT · 15%
    Continuity
    6 checks
    Can the next session pick up? Freshness, handoffs, meaningful changelogs, plans in repo.
    WEIGHT · 15%
    Safety
    9 checks
    Is AI working securely? .env gitignored, Actions SHA-pinned, workflow permissions, no hardcoded secrets.
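    The .env check in particular is easy to reproduce manually with git's own ignore resolution: `git check-ignore` exits 0 only when an ignore rule covers the path. This is a manual stand-in, not necessarily how AgentLint implements the check:

    ```shell
    # Ask git itself whether .env would be ignored. Exit status 0 means
    # an ignore rule covers it; non-zero means it could be committed.
    if git check-ignore -q .env; then
      echo ".env is gitignored"
    else
      echo ".env is NOT gitignored: add it before an agent commits secrets"
    fi
    ```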
    04 / Not another /init or rules collection

    Templates and rule lists are not audits. AgentLint is.

    /init generates a template. cclint lints one file's syntax. claude-code-rules collects rules to copy-paste. None of them measure whether the harness as a whole is healthy — and none auto-fix what they find.

    Capability · /init · cclint · claude-code-rules · AgentLint
    Generates a template CLAUDE.md
    Audits entry-file quality against 265 prompt versions
    Finds broken @include references
    Enforces 40,000-char hard limit
    Audits CI, hooks, .gitignore, Actions SHA
    Detects instruction rot over time
    Auto-fixes what it finds
    Every check cites a data source
    05 / The data sources

    Every check has a citation.

    If a check can't be backed by a primary source, it doesn't ship. That's the rule.

    Source code
    Claude Code runtime internals
    Hard limits and behaviors: 40,000-char max, 256 KB file ceiling, pre-commit hook behavior, @include resolution.
    ETH Zurich · 2026
    Do context files help coding agents?
    Developer-written files help. LLM-generated files hurt in 5 of 8 scenarios. Inference cost rises 20–23%.
    Codified Context · 2026
    Stale content as #1 failure mode
    Documentation decay is the single biggest reason agents fail on codebases they previously succeeded on.
    Agent READMEs · 2025
    Concrete vs abstract effectiveness
    One real example outperforms three paragraphs describing it. The counterintuitive convention is the most valuable one to document.
    06 / FAQ

    Questions, answered.

    What is an agent harness?

    The term got popular in early 2026 (Mitchell Hashimoto, OpenAI, LangChain). Shortest definition: Agent = Model + Harness. The harness is everything that wraps an LLM and turns it into an agent — tools, state, feedback loops, and the persistent rules it reads at session start. For coding agents, that last part is your AGENTS.md, CLAUDE.md, .cursor/rules, CI, pre-commit hooks, and .gitignore. AgentLint is the first linter built specifically for that layer.

    Get started

    Install agentlint and tighten the CLAUDE.md instructions your agents rely on.

    Pick your harness

    Use agentlint with the coding agent workflow and CLAUDE.md shape you already have.

    Read more

    Guides for stronger CLAUDE.md files and cleaner agentlint results.

    Score your harness
    in under 10 seconds.

    One command. 33 evidence-backed checks. Every one with a citation.

    $ npm install -g agentlint-ai
    Read the Docs →