feat(security): add basic security reviewer agent with owasp skills #1008
katriendg merged 47 commits into microsoft:main
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

| Coverage Diff | main | #1008 | +/- |
|---|---|---|---|
| Coverage | 88.04% | 88.04% | |
| Files | 45 | 45 | |
| Lines | 7885 | 7885 | |
| Hits | 6942 | 6942 | |
| Misses | 943 | 943 | |
I did that more because I wasn't sure on the approach we want to take with this. Since this is phase 1 of 4 phases I figured we'd be building on top of this and making adjustments as we go. The skills themselves ( That's my thinking though. Happy to take your lead if you see things differently 😊
Done in feat(agent): update CUSTOM-AGENTS.md with security review agent.
katriendg left a comment
@JasonTheDeveloper - Thanks for all the work and some rework :) I believe this is good to merge into experimental (pre-release extension).
Valuable addition, excited to see this first version go live, we'll surely be further tweaking and then extending this collection as we get more usage feedback in.
🤖 I have created a release *beep* *boop*

---

## [3.2.0](hve-core-v3.1.46...hve-core-v3.2.0) (2026-03-20)

### ✨ Features

* add -OutputPath parameter to Validate-MarkdownFrontmatter.ps1 ([#1134](#1134)) ([fdf1bcf](fdf1bcf)), closes [#1006](#1006)
* add action version consistency scan workflow ([#1127](#1127)) ([4229df1](4229df1))
* **agent:** MVE Experiment Designer ([#976](#976)) ([70f86ca](70f86ca))
* **agents:** add ADO Backlog Manager orchestrator agent ([#800](#800)) ([fae3987](fae3987))
* **agents:** add meeting analyst agent for transcript analysis using work-iq ([#502](#502)) ([5345b5b](5345b5b))
* **agents:** add quick-reference line to RPI Phase 5 suggestions ([#897](#897)) ([9a90f39](9a90f39))
* **agents:** add RAI Planner, enhance SSSC Planner, and redesign Security Planner ([#979](#979)) ([06f826c](06f826c))
* **agents:** add symmetric cross-system handoff to GitHub Backlog Manager ([#952](#952)) ([ba34a35](ba34a35))
* **agents:** Functional Code Review Agent - pre-PR functional correctness reviewer ([#733](#733)) ([9cf63b7](9cf63b7))
* **build:** add Python extensions and uv 0.10.8 to devcontainer ([#920](#920)) ([9ca0579](9ca0579))
* **build:** add uv ecosystem to Dependabot configuration ([#913](#913)) ([2a4bd39](2a4bd39))
* **build:** enable npm pinning enforcement in dependency scan ([#838](#838)) ([4e9e31f](4e9e31f))
* **build:** migrate attestation actions to v4.1.0 and add SBOM verification docs ([#841](#841)) ([ca1e65b](ca1e65b))
* **collections:** add four new validator checks (orphan, duplicate, companion, coverage) ([#869](#869)) ([1a96b73](1a96b73))
* **devcontainer,security:** add enterprise artifact hub configuration ([#1032](#1032)) ([1d56d25](1d56d25))
* **docs:** add Rust coding standards and guidelines ([#809](#809)) ([d4c4899](d4c4899))
* **extension:** add Microsoft logo icon to VS Code Marketplace listings ([#906](#906)) ([82aca41](82aca41))
* **github:** add declarative label management ([#953](#953)) ([a1a6845](a1a6845))
* **instructions:** add ADO backlog shared infrastructure ([#786](#786)) ([1914078](1914078))
* **instructions:** add ADO backlog sprint planning and capacity tracking ([#788](#788)) ([d6fb77d](d6fb77d))
* **instructions:** add ADO triage workflow and prompt ([#787](#787)) ([cde0190](cde0190))
* **instructions:** add shared story quality conventions and sprint planning ([#803](#803)) ([a2f18e3](a2f18e3))
* **prompts:** add ADO discovery and work item prompts with agent routing ([#790](#790)) ([7e74523](7e74523))
* **prompts:** add security review prompts ([#1118](#1118)) ([ad30967](ad30967))
* **scripts:** add dynamic Python skill discovery for lint/test ([#957](#957)) ([0a90f57](0a90f57))
* **scripts:** add Get-StandardTimestamp utility to CIHelpers module ([#1126](#1126)) ([b273a4b](b273a4b))
* **scripts:** add Python copyright header validation ([#905](#905)) ([67df902](67df902))
* **scripts:** add Python skill support to Validate-SkillStructure ([#903](#903)) ([68479d9](68479d9))
* **scripts:** add workflow npm command scanning to dependency pinning ([#837](#837)) ([6b5ae06](6b5ae06))
* **security:** add basic security reviewer agent with owasp skills ([#1008](#1008)) ([cb1fd05](cb1fd05))
* **security:** add sigstore attestation bundles and fix component-detection action ([#1148](#1148)) ([f79c272](f79c272))
* **skills:** add Atheris fuzz harness with CI workflow integration ([#1102](#1102)) ([d337e1d](d337e1d))
* **skills:** add PowerPoint automation skill with YAML-driven deck generation ([#868](#868)) ([00465cd](00465cd))
* **skills:** convert hve-core-installer agent to self-contained skill ([#846](#846)) ([1d821fb](1d821fb))
* **skills:** enhance pr-reference skill with flexible filtering and base branch detection ([#1095](#1095)) ([26a32ea](26a32ea))
* **workflows:** add devcontainer infrastructure change log workflow ([#899](#899)) ([8aca446](8aca446))
* **workflows:** add milestone auto-close on stable and pre-release publishes ([#834](#834)) ([79362b1](79362b1))
* **workflows:** add ms.date documentation freshness checking ([#969](#969)) ([3ed441c](3ed441c))
* **workflows:** add Python linting CI workflow with Ruff ([#951](#951)) ([f89f0eb](f89f0eb))
* **workflows:** add Python testing CI workflow with pytest and Codecov ([#934](#934)) ([5e8306f](5e8306f))
* **workflows:** add uv and Python package sync to copilot-setup-steps ([#921](#921)) ([45d517d](45d517d))

### 🐛 Bug Fixes

* **build:** override Linguist vendored flag for Python skill files ([#1155](#1155)) ([0eee5b6](0eee5b6))
* **build:** override serialize-javascript to >=7.0.3 for RCE fix ([#876](#876)) ([e49039a](e49039a))
* **build:** resolve Pinned-Dependencies alerts for vsce npm commands in extension workflows ([#782](#782)) ([89dad9d](89dad9d))
* **build:** update undici and yauzl overrides for security audit ([#1030](#1030)) ([2c2f92f](2c2f92f))
* **docs:** add CLI Plugins to install.md navigation surfaces ([#902](#902)) ([79d6595](79d6595))
* **docs:** add sidebar ordering for Design Thinking documentation ([#832](#832)) ([551fddc](551fddc)), closes [#830](#830)
* **docs:** graduate design-thinking to preview and correct stale collection references ([#831](#831)) ([5110e35](5110e35))
* **docs:** include project-planning in UX Designer install guidance ([#908](#908)) ([e7aa9bc](e7aa9bc))
* **docs:** remediate writing-style convention violations ([#865](#865)) ([68b04bc](68b04bc))
* **docs:** remove draft content announcement banner ([#825](#825)) ([b45de80](b45de80))
* **docs:** remove unbounded path-to-regexp override breaking SSG ([#1153](#1153)) ([d810018](d810018))
* **docs:** use actual clone paths instead of folder display names in multi-root workspace settings ([#984](#984)) ([5dbab82](5dbab82))
* **instructions:** replace black with ruff in uv-projects ([#898](#898)) ([b0c06d9](b0c06d9))
* **scripts:** cover .github/ skill files in copyright header validation ([#1055](#1055)) ([#1098](#1098)) ([27fbd33](27fbd33))
* **scripts:** eliminate phantom git changes from plugin generation ([#1035](#1035)) ([e49a1b5](e49a1b5))
* **scripts:** enable JSON log output for lint:version-consistency ([#1033](#1033)) ([52b0885](52b0885))
* **security:** calculate compliance score from total scanned dependencies ([#930](#930)) ([c112c3d](c112c3d))
* **skills:** add AST validation and namespace restriction for content-extra.py ([#1027](#1027)) ([c50c7a3](c50c7a3))
* **skills:** add depth limits to recursive PowerPoint processing functions ([#1028](#1028)) ([bf08994](bf08994))
* **skills:** harden XML parsing and blob writes in powerpoint extract ([#1053](#1053)) ([89d24b1](89d24b1))
* **skills:** resolve ruff lint and format violations in powerpoint skill ([#1048](#1048)) ([17bbe7a](17bbe7a))
* **workflows:** add uv.lock dependencies submission have fork-skip condition ([#1109](#1109)) ([dec56ac](dec56ac))
* **workflows:** automate weekly SHA staleness check with issue creation ([#975](#975)) ([1ea4caa](1ea4caa))
* **workflows:** close Codecov integration gaps for Pester and pytest flags ([#1106](#1106)) ([cca29b7](cca29b7))
* **workflows:** propagate uv sync errors in copilot-setup-steps ([#961](#961)) ([df88d7c](df88d7c))
* **workflows:** resolve release-please skip cascade and Python project discovery ([#1043](#1043)) ([79993e2](79993e2))
* **workflows:** scan only commit subjects for breaking change detection ([#1157](#1157)) ([a38a657](a38a657))

### 📚 Documentation

* clarify HVE Core Extension vs Installer messaging across documentation ([#965](#965)) ([0fceb8f](0fceb8f))
* **docs:** add ADO integration user documentation ([#935](#935)) ([ec89302](ec89302))
* **docs:** add Project Planning agent documentation ([#936](#936)) ([3a3a0fd](3a3a0fd))
* **onboarding:** overhaul marketplace onboarding and documentation site ([#982](#982)) ([4309e10](4309e10))

### ♻️ Refactoring

* **build:** merge code-review collection into coding-standards ([#863](#863)) ([8027e7b](8027e7b))
* **workflows:** rename release pipeline workflows and add marketplace automation triggers ([#829](#829)) ([b6397f4](b6397f4))

### 🔧 Maintenance

* **build:** add clean:logs npm script ([#1122](#1122)) ([f85fe02](f85fe02)), closes [#988](#988)
* **build:** add JSON reporter for cspell ([#1123](#1123)) ([6d59f67](6d59f67))
* **ci:** add multi-arch support to copilot-setup-steps binary downloads ([#955](#955)) ([8d0c706](8d0c706))
* **deps-dev:** bump cspell from 9.6.4 to 9.7.0 in the npm-dependencies group ([#839](#839)) ([3fa16ff](3fa16ff))
* **deps:** bump actions/dependency-review-action from 4.8.3 to 4.9.0 in the github-actions group across 1 directory ([#942](#942)) ([1a9b858](1a9b858))
* **deps:** bump cairosvg from 2.8.2 to 2.9.0 in /.github/skills/experimental/powerpoint ([#1025](#1025)) ([f4deda7](f4deda7))
* **deps:** bump dompurify from 3.3.1 to 3.3.2 in /docs/docusaurus ([#924](#924)) ([d2060d6](d2060d6))
* **deps:** bump svgo from 3.3.2 to 3.3.3 in /docs/docusaurus ([#880](#880)) ([6dc2406](6dc2406))
* **deps:** bump the github-actions group across 1 directory with 4 updates ([#1100](#1100)) ([2290dc0](2290dc0))
* **deps:** bump the github-actions group with 6 updates ([#840](#840)) ([f57bc01](f57bc01))
* **docs:** correct New-MsDateReport table rendering and refresh stale docs ([#1114](#1114)) ([c2b806f](c2b806f))
* **settings:** remove orphaned Checkov config and stale gitignore entries ([#870](#870)) ([98fcd74](98fcd74))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: hve-core-release-please[bot] <254602402+hve-core-release-please[bot]@users.noreply.github.com>
Co-authored-by: Bill Berry <wberry@microsoft.com>
Pull Request
Description
This PR introduces a new Security Reviewer agent with skills covering the following OWASP-related content:
The agent has three modes to fit a range of use cases:

* `audit`
* `diff`
* `plan`

There are also two additional inputs you can pass:

* `targetSkill`: Runs a particular skill and skips the codebase profiling step
* `scope`: Restricts the agent to a particular directory or file

Detailed agent flow:
flowchart TD
    Start([User Invokes Security Reviewer]) --> SetDate["Pre-req: Set report date"]
    SetDate --> DetectMode{"Detect scanning mode"}
    DetectMode -->|"explicit or keywords:<br/>changes, branch, PR"| DiffMode["Mode = diff"]
    DetectMode -->|"explicit or keywords:<br/>plan, design, RFC"| PlanMode["Mode = plan"]
    DetectMode -->|"default / explicit"| AuditMode["Mode = audit"]
    DetectMode -->|"invalid mode"| InvalidStop([Stop: Invalid mode])

    %% Step 0: Mode-specific setup
    AuditMode --> StatusSetup["Status: Starting in audit mode"]
    DiffMode --> GitDetect["Detect default branch<br/>git symbolic-ref"]
    PlanMode --> ResolvePlan["Resolve plan document<br/>from input / context / fallback"]
    GitDetect -->|"fail"| FallbackAudit["Fallback to audit mode"]
    FallbackAudit --> StatusSetup
    GitDetect -->|"ok"| MergeBase["Compute merge base<br/>git merge-base"]
    MergeBase -->|"fail"| FallbackAudit
    MergeBase -->|"ok"| ChangedFiles["Get changed files<br/>git diff --name-only"]
    ChangedFiles -->|"fail"| FallbackAudit
    ChangedFiles -->|"no files"| EmptyStop([Stop: No changed files])
    ChangedFiles -->|"files found"| FilterFiles["Filter non-assessable<br/>.md .yml .json images etc."]
    FilterFiles -->|"empty after filter"| FilterStop([Stop: No assessable code files])
    FilterFiles -->|"assessable files"| StatusSetup
    ResolvePlan -->|"no plan found"| AskUser["Ask user for plan path"]
    AskUser --> ResolvePlan
    ResolvePlan -->|"plan resolved"| ReadPlan["Read plan document"] --> StatusSetup

    %% Step 1: Profile Codebase
    StatusSetup --> TargetSkill{"targetSkill<br/>provided?"}
    TargetSkill -->|"yes"| ValidateSkill{"Skill in<br/>Available Skills?"}
    ValidateSkill -->|"no"| SkillStop([Stop: Show available skills])
    ValidateSkill -->|"yes"| StubProfile["Build minimal profile stub<br/>skip Codebase Profiler"]
    StubProfile --> SetSkills1["Applicable skills = targetSkill only"]
    TargetSkill -->|"no"| RunProfiler[/"Subagent: Codebase Profiler<br/>mode-specific prompt"/]
    RunProfiler -->|"fail"| ProfileFail([Stop: Profiling failed])
    RunProfiler -->|"ok"| IntersectSkills["Intersect profiler skills<br/>with Available Skills"]
    IntersectSkills --> SpecificOverride{"Specific skills<br/>list provided?"}
    SpecificOverride -->|"yes"| OverrideSkills["Override with provided list<br/>intersect with Available Skills"]
    SpecificOverride -->|"no"| CheckEmpty{"Any applicable<br/>skills?"}
    OverrideSkills --> CheckEmpty
    CheckEmpty -->|"none"| NoSkillStop([Stop: No applicable skills])
    CheckEmpty -->|"skills found"| SetSkills2["Set applicable skills list"]
    SetSkills1 --> StatusProfile["Status: Profiling complete"]
    SetSkills2 --> StatusProfile

    %% Step 2: Assess Skills
    StatusProfile --> AssessLoop["Status: Beginning skill assessments"]
    AssessLoop --> ForEachSkill["For each applicable skill<br/>(parallel when supported)"]
    ForEachSkill --> RunAssessor[/"Subagent: Skill Assessor<br/>mode-specific prompt per skill"/]
    RunAssessor -->|"incomplete"| RetryAssessor[/"Retry Skill Assessor<br/>(once)"/]
    RetryAssessor -->|"still fails"| ExcludeSkill["Exclude skill from results"]
    RetryAssessor -->|"ok"| CollectFindings["Collect structured findings"]
    RunAssessor -->|"ok"| CollectFindings
    ExcludeSkill --> AllDone{"All skills<br/>processed?"}
    CollectFindings --> AllDone
    AllDone -->|"no"| ForEachSkill
    AllDone -->|"yes"| CheckAllFailed{"All assessments<br/>failed?"}
    CheckAllFailed -->|"yes"| AllFailStop([Stop: All assessments failed])
    CheckAllFailed -->|"no"| StatusAssess["Status: All assessments complete"]

    %% Step 3: Verify Findings
    StatusAssess --> IsPlanMode{"Mode = plan?"}
    IsPlanMode -->|"yes"| SkipVerify["Skip verification<br/>pass findings through unchanged"]
    IsPlanMode -->|"no"| VerifyLoop["Status: Adversarial verification"]
    VerifyLoop --> ForEachSkillV["For each skill's findings<br/>(parallel when supported)"]
    ForEachSkillV --> Classify["Classify findings"]
    Classify --> PassThrough["PASS + NOT_ASSESSED<br/>verdict = UNCHANGED"]
    Classify --> Serialize["FAIL + PARTIAL<br/>serialize findings"]
    Serialize --> HasUnverified{"Any FAIL/PARTIAL<br/>findings?"}
    HasUnverified -->|"no"| MergeVerified["Merge into verified collection"]
    HasUnverified -->|"yes"| RunVerifier[/"Subagent: Finding Deep Verifier<br/>all FAIL+PARTIAL in single call"/]
    RunVerifier -->|"incomplete"| RetryVerifier[/"Retry Verifier (once)"/]
    RetryVerifier --> CaptureVerdicts["Capture deep verdicts"]
    RunVerifier -->|"ok"| CaptureVerdicts
    PassThrough --> MergeVerified
    CaptureVerdicts --> MergeVerified
    MergeVerified --> AllVerified{"All skills<br/>verified?"}
    AllVerified -->|"no"| ForEachSkillV
    AllVerified -->|"yes"| StatusVerify["Status: All findings verified"]
    SkipVerify --> StatusVerify

    %% Step 4: Generate Report
    StatusVerify --> RunReporter[/"Subagent: Report Generator<br/>mode-specific prompt + verified findings"/]
    RunReporter --> CaptureReport["Capture report path +<br/>summary counts + severity"]

    %% Step 5: Completion
    CaptureReport --> StatusReport["Status: Report generation complete"]
    StatusReport --> IsPlanReport{"Mode = plan?"}
    IsPlanReport -->|"yes"| PlanCompletion["Display plan completion format<br/>risk counts + report path"]
    IsPlanReport -->|"no"| AuditCompletion["Display audit/diff completion format<br/>severity + verification + finding counts"]
    PlanCompletion --> ExcludedNote{"Excluded skills?"}
    AuditCompletion --> ExcludedNote
    ExcludedNote -->|"yes"| AppendNote["Append excluded skills note"]
    ExcludedNote -->|"no"| Done([Scan Complete])
    AppendNote --> Done

    %% Styling
    classDef subagent fill:#4a90d9,color:#fff,stroke:#2c5f8a
    classDef stop fill:#e74c3c,color:#fff,stroke:#c0392b
    classDef decision fill:#f5c542,color:#333,stroke:#d4a017
    classDef status fill:#2ecc71,color:#fff,stroke:#27ae60
    class RunProfiler,RunAssessor,RetryAssessor,RunVerifier,RetryVerifier,RunReporter subagent
    class InvalidStop,EmptyStop,FilterStop,ProfileFail,SkillStop,NoSkillStop,AllFailStop stop
    class DetectMode,TargetSkill,ValidateSkill,SpecificOverride,CheckEmpty,AllDone,CheckAllFailed,IsPlanMode,HasUnverified,AllVerified,IsPlanReport,ExcludedNote decision
    class StatusSetup,StatusProfile,StatusAssess,StatusVerify,StatusReport status

Related Issue(s)
* `security-reviewer` agent with OWASP-aligned skill delegation #794
* `owasp-agentic` skill for OWASP Agentic Top 10 vulnerability assessment #793
* `owasp-llm` skill for OWASP LLM Top 10 vulnerability assessment #796
* `owasp-top-10` skill for OWASP Top 10 web vulnerability assessment #795

Type of Change
Select all that apply:
Code & Documentation:
Infrastructure & Configuration:
AI Artifacts:
* `prompt-builder` agent and addressed all feedback
* `.github/instructions/*.instructions.md`
* `.github/prompts/*.prompt.md`
* `.github/agents/*.agent.md`
* `.github/skills/*/SKILL.md`

Other:
* `.ps1`, `.sh`, `.py`

Sample Prompts (for AI Artifact Contributions)
User Request:
Execution Flow:
* `Security Reviewer` agent with prompt `Analyse the code base and reproduce a detailed security report`. By default the agent will run in `audit` mode. This will do a full audit of the current codebase.
* `Codebase Profiler` agent -> Create subagents for each identified OWASP skill to analyse the codebase against the OWASP skill's knowledge base via `Skill Assessor` agent -> New subagents are created for each OWASP skill to verify and challenge the agent's findings via `Finding Deep Verifier` agent -> Collate results and generate a report via `Report Generator` agent

Output Artifacts:
* `audit` mode: `.copilot-tracking/security/{date}/security-report-001.md`
* `diff` mode: `.copilot-tracking/security/{date}/security-report-diff-001.md`
* `plan` mode: `.copilot-tracking/security/{date}/plan-risk-assessment-001.md`

Success Indicators:
* `.copilot-tracking/security/{date}/`

Testing
I had to modify `CollectionHelpers.psm1` for `npm run plugin:generate` to work. My OWASP skills contain a handful of `.md` files used for reference. `CollectionHelpers.psm1` would automatically add these `.md` files to `hve-core-all.collection.yaml` with `kind: "0"`. `kind: "0"` is not a recognised `kind` and would cause an error, and updating `kind` to `skill` would just get overridden the next time you run `npm run plugin:generate`. To resolve this, I updated the script to ignore `.md` files under the `skills` folder.

Checklist
Required Checks
AI Artifact Contributions
* `/prompt-analyze` to review contribution
* `prompt-builder` review

Required Automated Checks
The following validation commands must pass before merging:
* `npm run lint:md`
* `npm run spell-check`
* `npm run lint:frontmatter`
* `npm run validate:skills`
* `npm run lint:md-links`
* `npm run lint:ps`
* `npm run plugin:generate`

Security Considerations
Additional Notes
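As a rough illustration of the `diff` mode setup in the agent flow above, the changed-file filter and its two stop conditions could be sketched in Python as follows. This is a minimal sketch, not the agent's implementation (the agent itself is prompt-driven): the function names are hypothetical, and the extension set is an assumption based on the flowchart's ".md .yml .json images etc.".

```python
from pathlib import PurePosixPath

# Assumed set: the flow names .md, .yml, .json and "images etc.";
# the image extensions listed here are illustrative guesses.
NON_ASSESSABLE_SUFFIXES = frozenset(
    {".md", ".yml", ".json", ".png", ".jpg", ".jpeg", ".gif", ".svg"}
)


def filter_assessable(changed_files):
    """Drop files whose extension marks them as non-assessable."""
    return [
        path
        for path in changed_files
        if PurePosixPath(path).suffix.lower() not in NON_ASSESSABLE_SUFFIXES
    ]


def diff_mode_outcome(changed_files):
    """Mirror the two stop conditions that follow `git diff --name-only`."""
    if not changed_files:
        return "stop: no changed files"
    if not filter_assessable(changed_files):
        return "stop: no assessable code files"
    return "continue"
```

For example, `diff_mode_outcome(["README.md", "ci.yml"])` reaches the "no assessable code files" stop, while a change set that includes `src/app.py` continues to the profiling step.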