chore: add /codebase-audit skill for deep parallel codebase auditing #613

Merged: Aureliolo merged 4 commits into main from chore/codebase-audit-skill on Mar 19, 2026

@Aureliolo (Owner) commented Mar 19, 2026

Summary

  • Adds /codebase-audit slash command -- a reusable Claude Code skill that orchestrates 20+ specialized parallel agents to audit the entire codebase (or targeted scope) for issues
  • Methodology refined from a battle-tested audit session that produced 30 validated findings across 11 work packages with a 32% false positive rate on initial findings

Key Features

  • Architecture research phase before launching audit agents (agents without context produce false positives)
  • Parallel agent swarm -- each agent searches for one specific issue type only
  • Mandatory validation pass -- never presents unvalidated findings to the user
  • Skepticism for clean results -- zero findings triggers a deeper, architecture-informed investigation
  • Work package grouping by code proximity -- groups what a developer would naturally fix together, not by severity
  • Existing issue deduplication -- twice: once in agent prompts, once before issue creation
  • Configurable scope -- full, src/, web/, cli/, docs/, .github/
  • Configurable output -- --report-only to skip issue creation
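
As a rough illustration of grouping by code proximity, the sketch below buckets findings by the leading directories of their file path rather than by severity. The finding shape (`path`, `severity`), the `group_by_proximity` helper, and the `depth` parameter are assumptions made for this example; the skill describes the grouping in prose and may use different criteria.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_proximity(findings, depth=3):
    """Bucket findings by the first `depth` directory components of their path.

    Hypothetical helper: findings that live in the same directory subtree end up
    in the same work package, regardless of their severity.
    """
    packages = defaultdict(list)
    for finding in findings:
        parts = PurePosixPath(finding["path"]).parent.parts
        # Files at the repository root have no parent directories; use "." for them.
        key = "/".join(parts[:depth]) or "."
        packages[key].append(finding)
    return dict(packages)
```

With this keying, a high-severity and a low-severity finding under `src/synthorg/api/` land in one package, which matches the stated goal of grouping what a developer would fix together.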

Skill Phases

  1. Parse arguments and determine scope
  2. Gather context (existing issues + architecture research)
  3. Select and launch audit agents (parallel)
  4. Collect and deduplicate findings
  5. Validate findings against actual code (parallel)
  6. Deep dive on suspicious clean results
  7. Present validated findings to user
  8. Group into work packages by code proximity
  9. Final dedup and create GitHub issues

Test plan

  • Invoke /codebase-audit full and verify all phases execute
  • Invoke /codebase-audit src/synthorg/api/ for targeted scope
  • Invoke /codebase-audit --report-only and verify no issues created
  • Verify agent prompts include architecture context
  • Verify validation phase removes false positives before presentation

Summary by CodeRabbit

  • New Features
    • Introduces a Deep Codebase Audit workflow with selectable full or targeted scopes and optional report-only and quick modes.
    • Performs multi-phase, parallel analysis with mandatory validation, dual deduplication, and exclusion of intentional patterns.
    • Produces grouped developer work packages and, after user confirmation, can create tracked issues for confirmed findings and report created issues.

Adds a reusable Claude Code skill that orchestrates 20+ specialized
agents to audit the entire codebase (or targeted scope) for issues.

Key features:
- Architecture research phase BEFORE launching audit agents
- Parallel agent swarm (each agent searches for one issue type)
- Mandatory validation pass (32% false positive rate observed)
- Skepticism for "100% clean" results triggers deeper investigation
- Work package grouping by code proximity, not severity
- Existing issue deduplication (twice: in prompts + before creation)
- Configurable scope (full, src/, web/, cli/, docs/, .github/)

Methodology refined from a battle-tested audit session that produced
30 validated findings across 11 work packages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions bot (Contributor) commented Mar 19, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a powerful new /codebase-audit skill, significantly enhancing the automated code quality and issue detection capabilities. It enables a comprehensive, agent-driven analysis of the codebase, ensuring that identified issues are thoroughly validated and organized into actionable work packages before being presented or converted into GitHub issues. This system aims to provide high-fidelity findings and streamline the process of addressing technical debt and vulnerabilities.

Highlights

  • New Codebase Audit Skill: Introduced a new /codebase-audit slash command, a reusable Claude Code skill designed for deep parallel codebase auditing.
  • Agent Orchestration: The skill orchestrates over 20 specialized parallel agents to audit the entire codebase or a targeted scope for various issues.
  • Robust Methodology: Incorporates a battle-tested methodology including an architecture research phase, mandatory validation of findings, skepticism for clean results, and grouping of work packages by code proximity.
  • Issue Management: Features existing issue deduplication (both in agent prompts and before issue creation) and configurable output options, including --report-only to skip issue creation.
  • Phased Execution: Outlines a 9-phase process covering argument parsing, context gathering, agent selection and launch, findings collection and deduplication, validation, deep dives, presentation, work package grouping, and final issue creation.

coderabbitai bot commented Mar 19, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: a888846d-b443-4db2-bdb9-4a2239869f6c

📥 Commits

Reviewing files that changed from the base of the PR and between 33538ec and e80e60d.

📒 Files selected for processing (1)
  • .claude/skills/codebase-audit/SKILL.md

Walkthrough

Adds a new Claude skill at .claude/skills/codebase-audit/SKILL.md that implements a multi‑phase "Deep Codebase Audit" workflow. It parses CLI scope (full default or targeted dirs) and flags --report-only and --quick; fetches up to 200 open GitHub issues and reads architecture/testing/DI/observability/design files to build an Architecture Brief; runs parallel, specialized audit agents; merges/deduplicates findings; validates findings against source lines; optionally deep‑dives zero‑result categories; groups confirmed findings into proximity‑based work packages; and creates one GitHub issue per work package.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'chore: add /codebase-audit skill for deep parallel codebase auditing' clearly and specifically describes the main change: adding a new Claude skill for codebase auditing with parallel execution. It directly matches the content of the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@gemini-code-assist gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a very thorough and well-structured /codebase-audit skill. The multi-phase approach, emphasizing context-gathering, mandatory validation, and skepticism, is excellent. The developer-centric grouping of findings by code proximity is also a great feature. My feedback focuses on improving the clarity, consistency, and logical flow of the skill's definition, particularly around its arguments and conditional steps.

@@ -0,0 +1,395 @@
---
description: "Deep codebase audit: launches specialized parallel agents to find issues, validates findings, groups into work packages, and creates GitHub issues"
argument-hint: "<scope: full | src/ | web/ | cli/ | docs/ | .github/> [--report-only] [--skip-issues]"

medium

To improve clarity and avoid redundancy, it's best to have a single flag for a given behavior. Both --report-only and --skip-issues (defined on line 36) have the same effect. I recommend removing --skip-issues and using --report-only as the canonical flag.

Suggested change
-argument-hint: "<scope: full | src/ | web/ | cli/ | docs/ | .github/> [--report-only] [--skip-issues]"
+argument-hint: "<scope: full | src/ | web/ | cli/ | docs/ | .github/> [--report-only]"

| `docs/` or `site/` | Documentation/site | Docs/content categories |
| `.github/` or `ci` | CI/CD only | CI/workflow categories |
| `--report-only` | Any scope | Skip issue creation, report only |
| `--skip-issues` | Any scope | Same as --report-only |

medium

This argument is redundant with --report-only. To simplify the skill's interface and align with the change suggested for line 3, it's best to remove this line.

| `test-coverage` | Public modules with no corresponding test file, empty test files |
| `flaky-tests` | Unmocked time, real asyncio.sleep in tests, timing-dependent assertions, skipped tests |
| `wiring-lifecycle` | Incorrectly wired services, missing DI, lifecycle gaps, protocol implementations incomplete |
| `security-gaps` | Hardcoded secrets, missing auth guards, injection vectors, SSRF, XSS |

medium

The security-gaps agent is listed under "Python Backend Agents", but its search scope ("Hardcoded secrets, missing auth guards, injection vectors, SSRF, XSS") includes issues that are cross-cutting and can appear in frontend, infrastructure, or CLI code.

Consider either:

  • Moving security-gaps to the "Cross-Cutting Agents" section to reflect its broad scope.
  • Creating separate, stack-specific security agents (e.g., for Vue, Go, Docker) to ensure comprehensive security coverage.

4. **Validate any new findings** (same as Phase 4)
5. Add validated findings to the main list

Skip this phase if the user passed `--quick` or if the zero-finding categories are genuinely well-covered (e.g., dependencies audit finding nothing is believable).

medium

This line mentions a --quick flag, but it is not defined in the skill's arguments in Phase 0 (line 3 and the table on lines 27-36). For the skill to be complete and consistent, please add --quick to the argument-hint and the arguments table, describing its purpose of skipping this deep-dive phase.
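
One way to keep the flag set consistent is to validate arguments against a single canonical list. The following is a minimal sketch, not the skill's actual parsing (SKILL.md is a prose instruction file); `parse_audit_args` and `KNOWN_FLAGS` are hypothetical names, and the default scope of `full` is taken from the PR discussion.

```python
# Hypothetical canonical flag set: one flag per behaviour, as the review suggests.
KNOWN_FLAGS = {"--report-only", "--quick"}


def parse_audit_args(raw):
    """Split a raw argument string into a scope and boolean flags."""
    scope = "full"  # default when no scope is given
    flags = set()
    for token in raw.split():
        if token.startswith("--"):
            if token not in KNOWN_FLAGS:
                # Undocumented flags fail loudly instead of being silently ignored.
                raise ValueError(f"unknown flag: {token}")
            flags.add(token)
        else:
            scope = token
    return {
        "scope": scope,
        "report_only": "--report-only" in flags,
        "quick": "--quick" in flags,
    }
```

Rejecting unknown flags makes a mismatch between the documented arguments and the ones referenced in later phases visible immediately.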

Comment on lines +283 to +286
Ask the user:
1. **"Proceed to group into work packages and create issues" (Recommended)**
2. "Show me the full detail for each finding first"
3. "Export as markdown report only (no issues)"

medium

The user gate offers an option to "create issues", but this doesn't account for the user having already passed the --report-only flag at the start. The skill's logic should be conditional. If --report-only was specified, this user gate should be skipped or should not present options related to creating issues.
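
A minimal sketch of the conditional gate, assuming a hypothetical `gate_options` helper. The option strings are the ones quoted above; dropping only the issue-creation option under `--report-only` is one of the two behaviours this comment suggests (the other being to skip the gate entirely).

```python
from typing import List


def gate_options(report_only: bool) -> List[str]:
    """Return the choices to present at the user gate."""
    options = [
        "Show me the full detail for each finding first",
        "Export as markdown report only (no issues)",
    ]
    if not report_only:
        # Issue creation is only offered when the run is allowed to create issues.
        options.insert(0, "Proceed to group into work packages and create issues")
    return options
```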

|---|---------|
| 1 | ... |
| 2 | ... |
**Rationale:** Why these go together.

medium

This **Rationale:** line is inside the example code block, but it breaks the markdown table format and seems misplaced. It should likely be regular text outside of the code block, following the table for each work package, to ensure the example is valid.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Line 3: The argument hint string (argument-hint) is incomplete and
inconsistent with later phases; update the argument-hint value so it lists all
supported scopes and flags referenced elsewhere (e.g., include ci, docker/,
site/, --quick, --report-only, --skip-issues) so the CLI/help is discoverable
and consistent across phases; locate the argument-hint entries (the literal
"argument-hint" lines at 3, 33-34, 118, 256-257 in this file) and standardize
them to a single canonical set of options used throughout the skill
documentation.
- Line 4: The allowed-tools list currently includes write-capable tools ("Edit",
"Write") which contradicts the audit-only constraint described later; modify the
SKILL.md allowed-tools declaration to remove any mutable/write-capable entries
(specifically remove "Edit" and "Write") so the skill only permits read-only
tools (e.g., "Read", "Grep", "Glob", "WebFetch", etc.), and ensure the updated
list aligns with the later prohibition on fixing code referenced around the rule
at line 395.
- Line 13: Update the wording at the top rule that currently reads “validation
is mandatory before ANY output to the user” to the proposed clearer phrase
“Never present unvalidated findings” so it no longer conflicts with the later
guidance in the document (specifically the guidance around pre-validation
reporting at lines referencing “pre-validation reporting”); replace the exact
sentence and ensure any nearby references that repeat the stricter phrasing are
updated to match the new wording to avoid contradiction.
- Line 143: The markdown file contains unlabeled fenced code blocks causing
markdownlint MD040 failures; update each unlabeled triple-backtick fence in
.claude/skills/codebase-audit/SKILL.md to include an explicit language
identifier (e.g., ```md or ```text) for every occurrence flagged (the fences
around the sample table and the other four unlabeled blocks), keeping the
existing fence contents unchanged and only adding the language token after the
opening backticks so linting and syntax highlighting work correctly.
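
A small script can locate the fences in question before editing. This is a sketch under simplifying assumptions: it handles backtick and tilde fences indented by up to three spaces and ignores edge cases such as fences nested in block quotes or lists. It is not markdownlint's implementation, and `unlabeled_fences` is a hypothetical helper name.

```python
import re

# An opening or closing fence: up to 3 spaces, then 3+ backticks or tildes, then an optional info string.
FENCE = re.compile(r"^\s{0,3}(`{3,}|~{3,})(.*)$")


def unlabeled_fences(markdown_text):
    """Return 1-based line numbers of opening fences that have no language tag."""
    hits = []
    open_marker = None  # marker of the fence that opened the current block, if any
    for lineno, line in enumerate(markdown_text.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        marker, info = match.group(1), match.group(2).strip()
        if open_marker is None:
            open_marker = marker
            if not info:
                hits.append(lineno)
        elif marker[0] == open_marker[0] and len(marker) >= len(open_marker) and not info:
            # A closing fence uses the same character, is at least as long, and has no info string.
            open_marker = None
    return hits
```

Running it over the file contents yields the line numbers where a language token such as `text` or `md` would need to be added.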

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 54d54983-9314-4172-8aba-7afd57629a6b

📥 Commits

Reviewing files that changed from the base of the PR and between c8bf862 and 90c61dd.

📒 Files selected for processing (1)
  • .claude/skills/codebase-audit/SKILL.md

…sistency agent

- Remove Edit/Write from allowed-tools (contradicts audit-only rule 12)
- Consolidate argument-hint: add missing scopes (ci, docker/, site/), add --quick flag, remove redundant --skip-issues
- Remove --skip-issues from Phase 0 table (redundant with --report-only)
- Add --quick flag definition to Phase 0 table
- Soften validation wording to avoid contradiction with progress reporting
- Move security-gaps agent from Python Backend to Cross-Cutting Agents
- Make Phase 6 user gate conditional on --report-only flag
- Fix Rationale line placement (was inside code block, now outside)
- Add language identifiers to all unlabeled fenced code blocks (MD040)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
.claude/skills/codebase-audit/SKILL.md (1)

3-3: ⚠️ Potential issue | 🟡 Minor

Align argument-hint with documented src/synthorg/ scope.

Phase 0 documents src/synthorg/ as valid, but argument-hint omits it, making help text inconsistent.

Suggested doc alignment
-argument-hint: "<scope: full | src/ | web/ | cli/ | docs/ | site/ | .github/ | ci | docker/> [--report-only] [--quick]"
+argument-hint: "<scope: full | src/ | src/synthorg/ | web/ | cli/ | docs/ | site/ | .github/ | ci | docker/> [--report-only] [--quick]"

Also applies to: 30-30

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/codebase-audit/SKILL.md at line 3, Update the argument-hint
string to include the documented src/synthorg/ scope so help text matches Phase
0 docs; locate the argument-hint definition (the literal argument-hint: "<scope:
...>") in .claude/skills/codebase-audit/SKILL.md and add "src/synthorg/" to the
scope enumeration (e.g. include "src/synthorg/" alongside existing scopes) so
the help output and documentation are consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Line 95: Remove the duplicate agent definition named "security-gaps" from the
Python backend section and keep only the cross-cutting definition; specifically,
delete the redundant `security-gaps` entry in the Python backend block (also
duplicated later) so there is a single canonical agent definition under the
cross-cutting section and ensure any references or scheduling config point to
that one symbol.
- Around line 119-125: The infra agent mappings are ambiguous: ensure
`docker-infra` only targets the `docker/` scope and `ci-workflows` only targets
the `.github/` scope by changing the table rows so `docker-infra` lists only
"Dockerfile issues, compose config, port security, healthchecks" with scope
`docker/`, and `ci-workflows` lists only "Missing timeouts, script injection,
permissions gaps, silent failures" with scope `.github/`; also update the
repeated wording at lines referencing 34-35 to match this scope-specific mapping
to prevent cross-run of agents.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: fea5077c-4b93-414c-b237-cd2bdc5dbdb2

📥 Commits

Reviewing files that changed from the base of the PR and between 90c61dd and e16565d.

📒 Files selected for processing (1)
  • .claude/skills/codebase-audit/SKILL.md

- Add src/synthorg/ to argument-hint for consistency with Phase 0 docs
- Remove duplicate security-gaps from Python Backend (keep cross-cutting only)
- Clarify infra agent scoping: docker-infra for docker/, ci-workflows for .github/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Around line 172-175: Replace the "Launch ALL selected agents in parallel"
behavior by implementing a bounded worker pool that enqueues selected agents and
dispatches at most a configurable concurrency (suggest 6-10) to the Agent tool
using run_in_background: true; update the launching logic that currently emits
"Launched N audit agents in parallel. Waiting for results..." to instead report
how many were started immediately, how many are queued, and overall total, and
add retry/backoff for transient failures (exponential backoff with a few
retries) when invoking the Agent tool; look for the launch/dispatch code that
constructs Agent tool calls (references: run_in_background: true and the
"Launched N audit agents in parallel. Waiting for results..." message) and
replace the fire-and-forget loop with a bounded worker queue, concurrency
limiter, and retry/backoff.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 20888417-c379-4343-9809-4b3f76f5c184

📥 Commits

Reviewing files that changed from the base of the PR and between e16565d and 33538ec.

📒 Files selected for processing (1)
  • .claude/skills/codebase-audit/SKILL.md

Comment on lines +172 to +175
Launch ALL selected agents in parallel using the Agent tool with `run_in_background: true`. Give each a descriptive `name` for tracking.

Track agent count and report to user: "Launched N audit agents in parallel. Waiting for results..."

⚠️ Potential issue | 🟠 Major

Bound parallelism instead of launching all agents at once.

Line 172’s “Launch ALL selected agents in parallel” can overload local resources and hit GitHub/API rate limits, causing partial or failed audits. Add a bounded worker pool (for example, max 6-10 concurrent agents) with queueing and retry/backoff.

Suggested update
-Launch ALL selected agents in parallel using the Agent tool with `run_in_background: true`. Give each a descriptive `name` for tracking.
+Launch selected agents using bounded parallelism (worker pool) with `run_in_background: true`.
+Recommended: max 6-10 concurrent agents, queue the rest, and apply retry/backoff on transient failures.
+Give each agent a descriptive `name` for tracking.
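
For readers who want to see the pattern in code, here is a minimal sketch of bounded parallelism with retry and exponential backoff using Python's `asyncio`. It illustrates the general technique only: the skill itself is a prose instruction file that asks Claude Code to launch agents through the Agent tool, and `run_bounded`, `run_with_retry`, `TransientAgentError`, and the `launch` callable are hypothetical stand-ins.

```python
import asyncio


class TransientAgentError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


async def run_with_retry(name, launch, retries=3, base_delay=0.5):
    """Call launch(name), retrying transient failures with exponential backoff."""
    for attempt in range(retries + 1):
        try:
            return await launch(name)
        except TransientAgentError:
            if attempt == retries:
                raise  # out of retries: surface the failure to the caller
            await asyncio.sleep(base_delay * 2 ** attempt)


async def run_bounded(names, launch, max_concurrency=8, **retry_kwargs):
    """Run launch(name) for every name with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(name):
        # Agents beyond the limit wait here until a slot frees up.
        async with semaphore:
            return name, await run_with_retry(name, launch, **retry_kwargs)

    pairs = await asyncio.gather(*(worker(n) for n in names))
    return dict(pairs)
```

The sketch uses a semaphore rather than an explicit worker queue; the effect is the same in that no more than `max_concurrency` launches run at once while the rest wait their turn.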

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Aureliolo Aureliolo merged commit db02320 into main Mar 19, 2026
21 of 22 checks passed
@Aureliolo Aureliolo deleted the chore/codebase-audit-skill branch March 19, 2026 16:45
Aureliolo added a commit that referenced this pull request Mar 19, 2026
🤖 I have created a release *beep* *boop*

---

## [0.3.8](v0.3.7...v0.3.8) (2026-03-19)

### Features

* dynamic settings UI with auto-discovery and basic/advanced toggle ([#600](#600)) ([11b32b7](11b32b7))

### Bug Fixes

* **ci:** remove dst: . from GoReleaser archive config ([#598](#598)) ([c8bf862](c8bf862))
* **engine:** wire compaction_callback and plan_execute_config through auto-selection ([#601](#601)) ([795327f](795327f))
* harden setup wizard completion and status checks ([#616](#616)) ([d99d7b7](d99d7b7))

### Maintenance

* add /codebase-audit skill for deep parallel codebase auditing ([#613](#613)) ([db02320](db02320))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).