Engineering rigor for AI assistants

Ship with confidence.

17 skills that turn AI coding assistants into careful senior engineers — catching bugs, validating changes, and maintaining code health with concrete, repo-grounded evidence.

Works with
  • Claude
  • Codex
  • Cursor
  • Gemini CLI
  • GitHub Copilot
  • OpenClaw
01

The upgrade

From vague suggestions to evidence-backed action.

Before

Generic AI Output

  • Looks fine to me, maybe add some tests.
  • Consider refactoring this function.
  • You might want to check for edge cases.
  • This could have performance implications.
  • Make sure to update the docs.

After

With swe-skills

  • Missing null check at lib/auth.ts:34 — will crash on expired tokens.
  • Migration 20260401.sql adds NOT NULL with no default — will fail on existing rows.
  • Bus factor 1 on billing/ — only contributor inactive 4 months.
  • p99 latency +180ms since PR #231 merged — N+1 query in fetchOrders.
  • README install section references removed --legacy flag.

Concrete. Grounded. Actionable.

02

The framework

17 skills across the full engineering lifecycle.

Run one at a time, schedule them to run continuously, or let your AI assistant pick the right one.

Understand

2 skills

Map unfamiliar repos and capture implicit knowledge before making changes.

Validate

4 skills

Review PRs for risk, shepherd live review cycles, hunt recent regressions, and plan validation paths.

Maintain

7 skills

Monitor deploys, audit dependencies and docs, hunt performance and observability gaps, and track ownership and incident follow-up risk.

Improve

4 skills

Close test gaps, set repo-local defaults, find refactor opportunities, and build new skills.

03

The skill set

Every skill, with example prompts and output.

Understand

swe:capture-knowledge

Convert implicit repo patterns into explicit agent-facing guidance.

Example prompt

What conventions should agents know before working here?

Sample output

Missing No AGENTS.md — 3 implicit conventions found

Convention All API routes use zod validation middleware

Convention DB migrations must have matching rollback

Convention Feature flags via LaunchDarkly, not env vars

Draft 3 entries ready for review → approve to write

Validate

swe:pr-risk-review

Review open PRs for engineering risk before merge — missing validation, hidden coupling, rollout gaps.

Example prompt

Review PR #247 for risk before I merge it.

Sample output

Scope 12 files across api/ and lib/

Risk No input validation on POST /users (api/users.ts:47)

Risk Migration adds NOT NULL, no default — will fail

Note Feature flag present but no rollback documented

Action Add zod schema + write rollback migration
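The suggested fix could take roughly this shape. The repo's convention is a zod schema; this sketch uses a dependency-free guard in the same shape so it stands alone, and names like `handleCreateUser` are illustrative, not taken from the repo:

```typescript
// Hypothetical request-body guard for POST /users, in the shape the review
// suggests. A real fix would express this as a zod schema per repo convention;
// a hand-rolled check is used here only to keep the sketch self-contained.
type NewUser = { email: string; name: string };

function parseNewUser(body: unknown): NewUser | null {
  if (typeof body !== "object" || body === null) return null;
  const b = body as Record<string, unknown>;
  if (typeof b.email !== "string" || !b.email.includes("@")) return null;
  if (typeof b.name !== "string" || b.name.length === 0) return null;
  return { email: b.email, name: b.name };
}

// Route-handler shape: reject malformed input before touching the database.
function handleCreateUser(body: unknown): { status: number } {
  const user = parseNewUser(body);
  if (user === null) return { status: 400 }; // previously unvalidated
  // ... insert user, return created resource
  return { status: 201 };
}
```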

swe:babysit-pr

Watch one PR in a live loop and handle comments, review requests, scores, and CI until it is ready to merge.

Example prompt

Babysit PR #247 until reviewers are happy and CI is green.

Sample output

Loop Polling PR #247 every 60s

Ack Reacted to new review thread with eyes

Fix Pushed follow-up for failing auth test + reviewer note

Score 4/5 → 5/5 after second revision

Ready CI green, no unresolved feedback, safe to merge

swe:change-validation-planner

Plan the narrowest trustworthy validation path for a scoped code change.

Example prompt

What's the minimum I should run to validate this diff?

Sample output

Changed api/users.ts, lib/validate.ts, 1 migration

Step 1 npm test -- users.test.ts (proves route logic)

Step 2 npm run lint (catches import errors)

Step 3 npm run migrate:dry-run (proves migration)

Unverified No e2e test covers POST /users flow

swe:recent-commit-bug-hunt

Scan recent commits for likely bugs using concrete repo evidence.

Example prompt

Scan the last 3 days of commits for regressions.

Sample output

Strong PR #231 removed null guard at lib/auth.ts:34

  Will throw TypeError on expired tokens in prod

Moderate PR #228 changed status enum values

  No migration for 12k existing rows with old values

Fix Restore guard + add enum migration script
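The restored guard might look like this minimal sketch; the session type and function name are illustrative, since the real lib/auth.ts is not shown in the report:

```typescript
// Illustrative session shape for the expired-token scenario.
type Session = { userId: string; expiresAt: number } | null;

function userIdFromSession(session: Session, now: number): string | null {
  // The guard the finding says was removed: null or expired sessions return
  // early here instead of reaching the property access below and throwing
  // a TypeError in production.
  if (session === null || session.expiresAt <= now) return null;
  return session.userId;
}
```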

Maintain

swe:merged-pr-monitoring

Check whether merged PRs reached production and compare pre/post-deploy signals.

Example prompt

Did yesterday's merges cause any production issues?

Sample output

PR #244 Deployed 14:32 UTC — no impact observed

PR #245 Deployed 15:01 UTC — p99 latency +180ms

  GET /orders: 220ms → 400ms post-deploy

  Likely N+1 in fetchOrders (added in this PR)

PR #246 Not yet deployed — queued behind staging

swe:security-audit

Audit dependencies for vulnerabilities, outdated packages, and license issues.

Example prompt

Run a security audit across all services.

Sample output

Critical jsonwebtoken@8.5.1 — CVE-2022-23529

  api/package.json — upgrade to >=9.0.0

High express@4.17.1 — 3 known vulns, 2 years old

License worker uses AGPL dep (pdf-parse) — review

Clean shared-lib — 0 findings, lockfile present

swe:observability-gap-hunt

Find missing logs, metrics, traces, alerts, and dashboards.

Example prompt

What telemetry gaps do we have in the payment service?

Sample output

Blind No error logging in payment/retry.ts

  Silent failures on Stripe webhook retries

Blind No latency metric on /checkout endpoint

Weak Alert on error rate but threshold is 10% (too high)

Covered Traces present on charge + refund flows
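Closing a "silent failures" gap like the first finding usually means logging every failed attempt. A minimal sketch, assuming nothing about the real payment/retry.ts (`retrySync` and `errorLog` are illustrative names):

```typescript
// Retry helper that records every failed attempt instead of swallowing it.
const errorLog: string[] = [];

function retrySync<T>(label: string, fn: () => T, attempts: number): T | null {
  for (let i = 1; i <= attempts; i++) {
    try {
      return fn();
    } catch (err) {
      // The audited code had no line like this, so retries failed invisibly.
      errorLog.push(`${label}: attempt ${i} failed: ${String(err)}`);
    }
  }
  return null; // exhausted: callers can now alert on errorLog instead of guessing
}
```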

swe:ownership-risk-map

Map bus factor, orphaned files, and ownership gaps from git evidence.

Example prompt

Which areas of this repo have dangerous bus factor?

Sample output

Bus 1 billing/ — sole contributor left 4 months ago

  47 files, 0 CODEOWNERS, 12% test coverage

Bus 1 infra/terraform/ — 1 active contributor

Moderate worker/ — 2 contributors, low recent activity

Healthy api/src/routes/ — 6 contributors, active

swe:performance-hunt

Find real performance bottlenecks with profiler data and measured evidence.

Example prompt

Why is the /orders endpoint slow? Profile it.

Sample output

Bottleneck N+1 query in fetchOrders — 340ms of 420ms

  1 query per line item × avg 18 items = 306 queries

Secondary JSON serialization — 55ms (large payload)

Fix Add eager load: include: { lineItems: true }

Expected p50: 420ms → ~80ms after fix
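The N+1 shape and its batched fix can be sketched with a toy in-memory store that makes the query counts visible (the data model is illustrative; an ORM eager load like `include: { lineItems: true }` achieves the batched shape for you):

```typescript
// Toy "database": line items keyed by order id.
const lineItemsByOrder = new Map<number, string[]>([
  [1, ["a", "b"]],
  [2, ["c"]],
]);
let queryCount = 0;

// N+1 shape: one round trip per order for its line items.
function fetchOrdersNPlusOne(orderIds: number[]): string[][] {
  return orderIds.map((id) => {
    queryCount++; // each iteration is a separate query
    return lineItemsByOrder.get(id) ?? [];
  });
}

// Batched shape: a single query serves the whole set.
function fetchOrdersBatched(orderIds: number[]): string[][] {
  queryCount++; // one round trip total
  return orderIds.map((id) => lineItemsByOrder.get(id) ?? []);
}
```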

swe:docs-drift-audit

Find documentation that drifted from recent code, config, or interface changes.

Example prompt

Are our docs still accurate after last week's refactor?

Sample output

Stale README install section references --legacy flag

  Flag removed in PR #239 — delete from step 3

Stale API docs list /v1/users (renamed to /v2/users)

Missing No runbook for new payment retry flow

Current CONTRIBUTING.md — matches repo conventions

swe:incident-followup-audit

Verify post-incident engineering follow-through — tests, monitors, runbooks, tickets.

Example prompt

Did we finish all follow-up from the auth outage?

Sample output

Done Regression test added (auth/token-expiry.test.ts)

Done Alert threshold lowered from 10% to 2% error rate

Missing No runbook for token rotation procedure

Missing JIRA-1847 still open — rate limiter not shipped

Partial Monitoring added but no dashboard link in wiki

Improve

swe:test-gap-hunt

Incrementally close the highest-value test coverage gaps.

Example prompt

Find the weakest tests and biggest coverage gaps.

Sample output

Gap billing/charge.ts — 0% coverage, 14 recent PRs

Gap worker/process.ts — 0% coverage, error-prone

Weak auth/login.test.ts — happy path only, no edge cases

Plan 4 tests to add, narrowest first, est. 25 min

Backlog 6 more opportunities ranked for next pass
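The kind of edge-case coverage the report asks for can be sketched like this; `login` here is a stand-in, not the repo's real auth/login.ts:

```typescript
// Stand-in for the code under test (illustrative rules, not the real ones).
function login(user: string, password: string): boolean {
  return user.length > 0 && password.length >= 8;
}

// Happy path: what the weak test already covered.
if (!login("alice", "correct-horse")) throw new Error("happy path failed");

// Edge cases the report flags as missing.
if (login("", "correct-horse")) throw new Error("empty user accepted");
if (login("alice", "short")) throw new Error("short password accepted");
```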

swe:init

Create a small repo-local .ai/swe.json so that later swe: skill runs match how you like to work.

Example prompt

Run swe:init --quick --gitignore for this repo.

Sample output

Wrote .ai/swe.json with quick defaults

Ignored Added .ai/swe.json to .gitignore

Mode quick --gitignore

Saved Only non-default overrides

Rule Explicit user requests still outrank local prefs
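The file's actual schema isn't documented here; a hypothetical quick-mode file, with every key invented for illustration, might look like:

```json
{
  "mode": "quick",
  "testCommand": "npm test",
  "riskTolerance": "low"
}
```

Only non-default overrides are saved, and explicit user requests still outrank anything in this file.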

swe:refactor-opportunities

Find small, parallelizable refactor tickets with clear boundaries.

Example prompt

Give me 5 low-risk refactors I can hand to agents.

Sample output

#1 Extract shared validation into lib/validate.ts

  3 routes duplicate the same 40-line schema — low risk

#2 Remove dead feature flag ENABLE_V1_AUTH

  Flag always true in prod for 6 months — 4 files

#3 Collapse OrderStatus enum (3 unused values since v2)

Extensible

swe:create-skill

Author new swe: skills with matching eval suites. Build your own.

Example prompt

Create a new swe: skill for auditing API contract changes.

Sample output

Created skills/api-contract-audit/SKILL.md

Created evals/api-contract-audit/cases.json (6 cases)

Created evals/api-contract-audit/rubric.md

Triggers "check API contracts", "breaking change review"

Non-goals Runtime testing, load testing, docs generation

04

Suggested schedules

Run once, or run continuously.

Every PR
  • swe:pr-risk-review
  • swe:babysit-pr
  • swe:change-validation-planner

Catch risk early, then shepherd the PR across the finish line.

Daily
  • swe:recent-commit-bug-hunt
  • swe:merged-pr-monitoring
  • swe:test-gap-hunt
  • swe:docs-drift-audit
  • swe:security-audit
  • swe:observability-gap-hunt
  • swe:performance-hunt
  • swe:refactor-opportunities
  • swe:capture-knowledge
  • swe:incident-followup-audit

Catch regressions, gaps, and drift while context is fresh.

Weekly
  • swe:ownership-risk-map
  • swe:repo-introspection

Structural checks that don't change day-to-day.

05

Get started

One command. Every major AI harness.

Install the full SWE skills framework, then start with swe:repo-introspection to understand your codebase or swe:pr-risk-review on your next PR.

  • Works with Claude, Codex, Cursor, Gemini CLI, GitHub Copilot, and OpenClaw-compatible setups
  • Language and framework agnostic — works on any codebase
  • Evidence-led: every finding cites files, lines, commits, or metrics
  • Designed for recurring use, not one-off runs

Install

npx skills install ckorhonen/swe-skills

Installs all 17 skills into your project. Works with any agent that supports the skills standard.

06

Frequently asked

For engineers moving from curiosity to practice.

Who is this for?

Software engineers, tech leads, and platform teams who want their AI coding assistants to be more rigorous — catching real bugs, validating changes with evidence, and maintaining code health systematically.

Does it work with any language or framework?

Yes. Skills are language-agnostic and adapt to whatever tooling your repo already uses — npm, cargo, pip, bundler, go modules, or anything else. They read your repo, not a config file.

How is this different from linters or CI checks?

Linters check syntax and style. CI runs predefined tests. These skills reason about your code — finding bugs linters miss, validating changes holistically, and producing actionable engineering judgment rather than pass/fail signals.

Can I run these on a schedule?

Absolutely. Skills like swe:recent-commit-bug-hunt, swe:test-gap-hunt, and swe:security-audit are designed to run repeatedly. Output formats support comparison across runs.

Where should I start?

Run swe:repo-introspection to understand your codebase, then try swe:pr-risk-review on your next pull request. Both produce immediate value.