Engineering rigor for AI assistants
17 skills that turn AI coding assistants into careful senior engineers — catching bugs, validating changes, and maintaining code health with concrete, repo-grounded evidence.
The upgrade
Before
Generic AI Output
- Looks fine to me, maybe add some tests.
- Consider refactoring this function.
- You might want to check for edge cases.
- This could have performance implications.
- Make sure to update the docs.
After
- lib/auth.ts:34 — will crash on expired tokens.
- 20260401.sql adds NOT NULL with no default — will fail on existing rows.
- billing/ — only contributor inactive 4 months.
- fetchOrders.
- --legacy flag.
Concrete. Grounded. Actionable.
The framework
Understand: 2 skills
Map unfamiliar repos and capture implicit knowledge before making changes.
Validate: 4 skills
Review PRs for risk, shepherd live review cycles, hunt recent regressions, and plan validation paths.
Maintain: 7 skills
Monitor deploys, audit dependencies, find observability gaps, and track ownership risk.
Improve: 4 skills
Close test gaps, set repo-local defaults, find refactor opportunities, and build new skills.
The skill set
Understand
Example prompt
Help me understand this repo before I start working on it.
Sample output
Structure Monorepo — 3 services + shared lib
Entry api/src/server.ts, worker/index.ts
Tests Jest (unit), Playwright (e2e), 94% passing
CI GitHub Actions — lint, test, deploy on merge
Safe start api/src/routes/ — well-tested, active area
Caution worker/ — no tests, single contributor
Example prompt
What conventions should agents know before working here?
Sample output
Missing No AGENTS.md — 3 implicit conventions found
Convention All API routes use zod validation middleware
Convention DB migrations must have matching rollback
Convention Feature flags via LaunchDarkly, not env vars
Draft 3 entries ready for review → approve to write
Validate
Example prompt
Review PR #247 for risk before I merge it.
Sample output
Scope 12 files across api/ and lib/
Risk No input validation on POST /users (api/users.ts:47)
Risk Migration adds NOT NULL, no default — will fail
Note Feature flag present but no rollback documented
Action Add zod schema + write rollback migration
Example prompt
Babysit PR #247 until reviewers are happy and CI is green.
Sample output
Loop Polling PR #247 every 60s
Ack Reacted to new review thread with eyes
Fix Pushed follow-up for failing auth test + reviewer note
Score 4/5 → 5/5 after second revision
Ready CI green, no unresolved feedback, safe to merge
Example prompt
What's the minimum I should run to validate this diff?
Sample output
Changed api/users.ts, lib/validate.ts, 1 migration
Step 1 npm test -- users.test.ts (proves route logic)
Step 2 npm run lint (catches import errors)
Step 3 npm run migrate:dry-run (proves migration)
Unverified No e2e test covers POST /users flow
Example prompt
Scan the last 3 days of commits for regressions.
Sample output
Strong PR #231 removed null guard at lib/auth.ts:34
Will throw TypeError on expired tokens in prod
Moderate PR #228 changed status enum values
No migration for 12k existing rows with old values
Fix Restore guard + add enum migration script
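The first finding in the sample above is a removed null guard. The sketch below shows one way such a guard might look. The real code at lib/auth.ts:34 is not shown on this page, so `Session`, `lookupSession`, and both `userId…` functions are hypothetical names used only for illustration.

```typescript
// Hypothetical shapes; the real lib/auth.ts is not shown on this page.
interface Session {
  userId: string;
  expiresAt: number; // epoch milliseconds
}

const sessions = new Map<string, Session>([
  ["fresh-token", { userId: "u1", expiresAt: 2000 }],
  ["stale-token", { userId: "u2", expiresAt: 500 }],
]);

// Assumed behaviour: unknown or expired tokens resolve to null.
function lookupSession(token: string, now: number): Session | null {
  const session = sessions.get(token);
  if (session === undefined || session.expiresAt <= now) return null;
  return session;
}

// Guard removed: the cast hides the null, so an expired token
// dereferences null and throws a TypeError at runtime.
function userIdUnguarded(token: string, now: number): string {
  const session = lookupSession(token, now) as Session;
  return session.userId;
}

// Guard restored: an expired token produces an explicit null result.
function userIdGuarded(token: string, now: number): string | null {
  const session = lookupSession(token, now);
  if (session === null) return null;
  return session.userId;
}
```

A regression test for this kind of finding would call the guarded function with an expired token and assert that it returns a value instead of throwing.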
Maintain
Example prompt
Did yesterday's merges cause any production issues?
Sample output
PR #244 Deployed 14:32 UTC — no impact observed
PR #245 Deployed 15:01 UTC — p99 latency +180ms
GET /orders: 220ms → 400ms post-deploy
Likely N+1 in fetchOrders (added in this PR)
PR #246 Not yet deployed — queued behind staging
Example prompt
Run a security audit across all services.
Sample output
Critical jsonwebtoken@8.5.1 — CVE-2022-23529
api/package.json — upgrade to >=9.0.0
High express@4.17.1 — 3 known vulns, 2 years old
License worker uses AGPL dep (pdf-parse) — review
Clean shared-lib — 0 findings, lockfile present
Example prompt
What telemetry gaps do we have in the payment service?
Sample output
Blind No error logging in payment/retry.ts
Silent failures on Stripe webhook retries
Blind No latency metric on /checkout endpoint
Weak Alert on error rate but threshold is 10% (too high)
Covered Traces present on charge + refund flows
Example prompt
Which areas of this repo have dangerous bus factor?
Sample output
Bus 1 billing/ — sole contributor left 4 months ago
47 files, 0 CODEOWNERS, 12% test coverage
Bus 1 infra/terraform/ — 1 active contributor
Moderate worker/ — 2 contributors, low recent activity
Healthy api/src/routes/ — 6 contributors, active
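The page does not say how the ownership skill derives its ratings. One plausible heuristic is to count distinct authors with recent commits per top-level directory, as in this rough sketch. `CommitRecord`, the 90-day window, and the threshold are assumptions made for the example; a real run would read the records from the repository history.

```typescript
// Hypothetical commit record; a real run would derive these from the git history.
interface CommitRecord {
  author: string;
  path: string; // file touched by the commit
  daysAgo: number; // age of the commit in days
}

// Groups commits by top-level directory and collects the authors
// who committed within the last `windowDays` days.
function activeAuthorsByArea(
  commits: CommitRecord[],
  windowDays: number,
): Map<string, Set<string>> {
  const byArea = new Map<string, Set<string>>();
  for (const commit of commits) {
    const slash = commit.path.indexOf("/");
    const area = slash === -1 ? "./" : commit.path.slice(0, slash + 1);
    let authors = byArea.get(area);
    if (authors === undefined) {
      authors = new Set<string>();
      byArea.set(area, authors); // area is listed even with no recent authors
    }
    if (commit.daysAgo <= windowDays) authors.add(commit.author);
  }
  return byArea;
}

// Returns areas with at most `maxAuthors` recently active contributors, sorted by name.
function lowBusFactorAreas(
  commits: CommitRecord[],
  windowDays: number,
  maxAuthors: number,
): string[] {
  const flagged: string[] = [];
  activeAuthorsByArea(commits, windowDays).forEach((authors, area) => {
    if (authors.size <= maxAuthors) flagged.push(area);
  });
  return flagged.sort();
}
```

Areas whose only contributors fall outside the window end up with zero active authors, which is how a directory like billing/ in the sample would be flagged.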
Example prompt
Why is the /orders endpoint slow? Profile it.
Sample output
Bottleneck N+1 query in fetchOrders — 340ms of 420ms
1 query per line item × avg 18 items = 306 queries
Secondary JSON serialization — 55ms (large payload)
Fix Add eager load: include: { lineItems: true }
Expected p50: 420ms → ~80ms after fix
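To illustrate the N+1 pattern named in the sample, here is a small sketch that counts queries against an in-memory stand-in for a database. `FakeDb` and both fetch functions are hypothetical; the eager version batches the line-item lookup by hand, where an ORM would typically do the same thing through an option like the `include` shown in the Fix line.

```typescript
interface LineItem { id: number; sku: string; }
interface Order { id: number; lineItemIds: number[]; }
interface OrderWithItems { id: number; lineItems: LineItem[]; }

// Hypothetical in-memory store that counts every query it serves.
class FakeDb {
  queryCount = 0;
  private order: Order;
  private items: Map<number, LineItem>;

  constructor(order: Order, items: LineItem[]) {
    this.order = order;
    this.items = new Map<number, LineItem>();
    for (const item of items) this.items.set(item.id, item);
  }

  findOrder(): Order {
    this.queryCount++;
    return this.order;
  }

  findLineItem(id: number): LineItem | undefined {
    this.queryCount++;
    return this.items.get(id);
  }

  // One batched lookup for all requested ids.
  findLineItems(ids: number[]): LineItem[] {
    this.queryCount++;
    const found: LineItem[] = [];
    for (const id of ids) {
      const item = this.items.get(id);
      if (item !== undefined) found.push(item);
    }
    return found;
  }
}

// N+1 shape: one query for the order, then one more per line item.
function fetchOrderNPlusOne(db: FakeDb): OrderWithItems {
  const order = db.findOrder();
  const lineItems: LineItem[] = [];
  for (const id of order.lineItemIds) {
    const item = db.findLineItem(id);
    if (item !== undefined) lineItems.push(item);
  }
  return { id: order.id, lineItems };
}

// Eager shape: one query for the order, one batched query for its line items.
function fetchOrderEager(db: FakeDb): OrderWithItems {
  const order = db.findOrder();
  return { id: order.id, lineItems: db.findLineItems(order.lineItemIds) };
}
```

For an order with 18 line items, the first shape issues 19 queries in this sketch and the second issues 2, while both return the same data.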
Example prompt
Are our docs still accurate after last week's refactor?
Sample output
Stale README install section references --legacy flag
Flag removed in PR #239 — delete from step 3
Stale API docs list /v1/users (renamed to /v2/users)
Missing No runbook for new payment retry flow
Current CONTRIBUTING.md — matches repo conventions
Example prompt
Did we finish all follow-up from the auth outage?
Sample output
Done Regression test added (auth/token-expiry.test.ts)
Done Alert threshold lowered from 10% to 2% error rate
Missing No runbook for token rotation procedure
Missing JIRA-1847 still open — rate limiter not shipped
Partial Monitoring added but no dashboard link in wiki
Improve
Example prompt
Find the weakest tests and biggest coverage gaps.
Sample output
Gap billing/charge.ts — 0% coverage, 14 recent PRs
Gap worker/process.ts — 0% coverage, error-prone
Weak auth/login.test.ts — happy path only, no edge cases
Plan 4 tests to add, narrowest first, est. 25 min
Backlog 6 more opportunities ranked for next pass
Example prompt
Run swe:init --quick --gitignore for this repo.
Sample output
Wrote .ai/swe.json with quick defaults
Ignored Added .ai/swe.json to .gitignore
Mode quick --gitignore
Saved Only non-default overrides
Rule Explicit user requests still outrank local prefs
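The last two lines of the sample describe a precedence rule: only non-default overrides are saved, and an explicit request beats a saved preference. A minimal sketch of that rule follows. The actual schema of .ai/swe.json is not shown on this page, so `DEFAULTS` and its keys (`mode`, `maxFindings`, `gitignore`) are hypothetical.

```typescript
type Settings = { [key: string]: string | number | boolean };

// Hypothetical built-in defaults; the real keys are not documented here.
const DEFAULTS: Settings = { mode: "standard", maxFindings: 10, gitignore: false };

// Keeps only the entries that differ from the defaults,
// mirroring "Saved: only non-default overrides".
function nonDefaultOverrides(defaults: Settings, chosen: Settings): Settings {
  const overrides: Settings = {};
  for (const key of Object.keys(chosen)) {
    const value = chosen[key];
    if (value !== undefined && value !== defaults[key]) overrides[key] = value;
  }
  return overrides;
}

// Later sources win: defaults, then saved local preferences,
// then whatever the user explicitly asked for in this request.
function resolveSettings(
  defaults: Settings,
  localPrefs: Settings,
  explicitRequest: Settings,
): Settings {
  return { ...defaults, ...localPrefs, ...explicitRequest };
}
```

Under these assumptions, a saved `mode: "quick"` applies to every run until a request names a different mode, at which point the request wins for that run only.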
Example prompt
Give me 5 low-risk refactors I can hand to agents.
Sample output
#1 Extract shared validation into lib/validate.ts
3 routes duplicate the same 40-line schema — low risk
#2 Remove dead feature flag ENABLE_V1_AUTH
Flag always true in prod for 6 months — 4 files
#3 Collapse OrderStatus enum (3 unused values since v2)
Example prompt
Create a new swe: skill for auditing API contract changes.
Sample output
Created skills/api-contract-audit/SKILL.md
Created evals/api-contract-audit/cases.json (6 cases)
Created evals/api-contract-audit/rubric.md
Triggers "check API contracts", "breaking change review"
Non-goals Runtime testing, load testing, docs generation
Suggested schedules
swe:pr-risk-review, swe:babysit-pr, swe:change-validation-planner
Catch risk early, then shepherd the PR across the finish line.
swe:recent-commit-bug-hunt, swe:merged-pr-monitoring, swe:test-gap-hunt, swe:docs-drift-audit, swe:security-audit, swe:observability-gap-hunt, swe:performance-hunt, swe:refactor-opportunities, swe:capture-knowledge, swe:incident-followup-audit
Catch regressions, gaps, and drift while context is fresh.
swe:ownership-risk-map, swe:repo-introspection
Structural checks that don't change day-to-day.
Get started
Install the full SWE skills framework, then start with swe:repo-introspection to understand your codebase or swe:pr-risk-review on your next PR.
Install
npx skills install ckorhonen/swe-skills
Installs all 17 skills into your project. Works with any agent that supports the skills standard.
Frequently asked
Software engineers, tech leads, and platform teams who want their AI coding assistants to be more rigorous — catching real bugs, validating changes with evidence, and maintaining code health systematically.
Yes. Skills are language-agnostic and adapt to whatever tooling your repo already uses — npm, cargo, pip, bundler, go modules, or anything else. They read your repo, not a config file.
Linters check syntax and style. CI runs predefined tests. These skills reason about your code — finding bugs linters miss, validating changes holistically, and producing actionable engineering judgment rather than pass/fail signals.
Absolutely. Skills like swe:recent-commit-bug-hunt, swe:test-gap-hunt, and swe:security-audit are designed to run repeatedly. Output formats support comparison across runs.
Run swe:repo-introspection to understand your codebase, then try swe:pr-risk-review on your next pull request. Both produce immediate value.