docs: Add CI/CD integration guide (GitHub Actions, Azure DevOps) by spboyer · Pull Request #97 · microsoft/waza

spboyer · 2026-03-05T13:52:27Z

Closes #89

Adds comprehensive CI/CD integration guide to the GitHub Pages docs site at site/src/content/docs/guides/ci-cd.mdx covering:

GitHub Actions workflows
Azure DevOps pipelines
GitLab CI examples
Best practices for waza in CI

- Add decision entry (2026-03-05): Token Diff Distribution Strategy (Issue microsoft#81)
  - Approved: CLI-first architecture with optional thin wrapper action
  - Implementation: `waza tokens diff` command + `.github/actions/token-diff`
  - Rationale: Serves all users (GitHub, non-GitHub CI, azd) without vendor lock-in
- Add session log: 2026-03-05T00-26-rusty-token-diff-design.md
  - Outcome: CLI + wrapper recommendation
  - Key design choices documented
- Related Issue microsoft#81

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Merge 3 inbox decisions into decisions.md; delete inbox files
- Write orchestration log for 4 waves (triage, CI guide, PR gen, verification)
- Create session log documenting full pipeline activation across 7 agents
- Captured: PR reviews, issue triage, multi-platform CI/CD guide, 5 new feature PRs
- Status: 3 PRs approved, 1 rebased, 2 follow-ups in flight; no blockers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Expands the docs site’s CI/CD guidance into a multi-platform “CI/CD Integration” guide, aiming to show how to install and run waza in common pipeline systems (GitHub Actions, Azure DevOps, GitLab CI) and capture best practices/troubleshooting.

Changes:

Rewrites ci-cd.mdx into a comprehensive CI/CD integration guide with examples for GitHub Actions, Azure DevOps, and GitLab CI.
Updates the site lockfile (minor metadata changes).
Adds/updates multiple .squad/ session and orchestration logs documenting the parallel-work pipeline.

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 7 comments.

File	Description
site/src/content/docs/guides/ci-cd.mdx	Major expansion of CI/CD documentation (install methods, platform examples, best practices, troubleshooting).
site/package-lock.json	Lockfile metadata updates (adds `peer: true` flags to some entries).
.squad/orchestration-log/2026-03-05T00-37-pipeline-wave-4.md	New orchestration log entry for wave 4.
.squad/orchestration-log/2026-03-05T00-37-pipeline-wave-3.md	New orchestration log entry for wave 3.
.squad/orchestration-log/2026-03-05T00-37-pipeline-wave-2.md	New orchestration log entry for wave 2.
.squad/orchestration-log/2026-03-05T00-37-pipeline-wave-1.md	New orchestration log entry for wave 1.
.squad/log/2026-03-05T00-36-issue-assignment-pipeline.md	New session log capturing the assignment pipeline directive.
.squad/log/2026-03-05T00-36-full-pipeline-activation.md	New session log summarizing the full pipeline activation.
.squad/log/2026-03-05T00-26-rusty-token-diff-design.md	New session note documenting token-diff distribution strategy.
.squad/decisions.md	Appends multiple decisions and summaries related to the pipeline and docs work.
.squad/agents/rusty/history.md	Adds a new “Batch PR Review & Issue Triage” history entry (currently dated in the future).
.squad/agents/livingston/history.md	Adds a completion summary for Issue #89 and notes about doc structure/patterns.
.squad/agents/linus/history.md	Adds notes about Windows CRLF stability and a test-variant reminder.

Files not reviewed (1)

site/package-lock.json: Language not supported

Comments suppressed due to low confidence (2)

site/src/content/docs/guides/ci-cd.mdx:265

This Azure DevOps snippet assumes waza tokens diff --ref1/--ref2 and parses .exceeded from the JSON output. The implemented command is waza tokens compare [ref1] [ref2], and its JSON report nests budget info under summary.exceededCount (no top-level exceeded boolean). Update the command and jq query (or rely on --strict exit code) so the pipeline gates correctly.

- script: |
    waza tokens diff \
      --ref1 origin/main \
      --ref2 HEAD \
      --format json > token-diff.json
    
    EXCEEDED=$(jq '.exceeded' token-diff.json)
    if [ "$EXCEEDED" == "true" ]; then
      echo "##vso[task.logissue type=error;]Token budget exceeded"
      exit 1
    fi
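
As a rough illustration of the gating logic this comment asks for, the sketch below reads the nested `summary.exceededCount` field instead of a top-level `exceeded` boolean. The helper name `budgetExceeded` and the sample reports are hypothetical; the only thing taken from the review is that the count lives under `summary.exceededCount`, and treating it as a number is an assumption.

```javascript
// Hypothetical helper: decide whether a token-compare JSON report should fail the build.
// Assumes (per the review comment) that budget info is nested under summary.exceededCount.
function budgetExceeded(report) {
  const count = report?.summary?.exceededCount;
  // Fail closed: a missing or non-numeric count means the report shape is not what we expect.
  if (typeof count !== 'number') {
    throw new Error('report.summary.exceededCount is missing or not a number');
  }
  return count > 0;
}

// Made-up sample reports for illustration only (not real waza output).
const passing = JSON.parse('{"summary": {"exceededCount": 0}}');
const failing = JSON.parse('{"summary": {"exceededCount": 2}}');

console.log(budgetExceeded(passing)); // false
console.log(budgetExceeded(failing)); // true
```

Throwing on an unexpected shape is a deliberate choice here: a query against the old top-level `exceeded` field would otherwise silently evaluate to "not exceeded" and let the pipeline pass.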

site/src/content/docs/guides/ci-cd.mdx:6

Aside is imported here but never used in the document. Remove the unused import or add the intended <Aside ...> blocks so the MDX stays tidy and avoids unused-import warnings in tooling.

import { Tabs, TabItem, Aside } from '@astrojs/starlight/components';

Copilot · 2026-03-05T14:02:00Z

+Use `waza tokens diff` to track token usage across PRs and fail if budgets are exceeded:

 ```yaml
-strategy:
-  matrix:
-    model:
-      - gpt-4o
-      - claude-sonnet-4.6
-      - claude-opus-4
-  max-parallel: 3
-
-steps:
-  - name: Run evals for ${{ matrix.model }}
-    run: waza run --model "${{ matrix.model }}" -o "results-${{ matrix.model }}.json"
+- name: Check token budget
+  run: |
+    waza tokens diff origin/main HEAD \
+      --format table \
+      --strict


The CLI in this repo currently provides waza tokens compare [ref1] [ref2] (with --format and --strict), not waza tokens diff. Also, tokens compare takes refs as positional args (no --ref1/--ref2). Update these examples to match the actual command so they’re copy/pasteable.

This issue also appears on line 255 of the same file.

See below for a potential fix:

Use `waza tokens compare` to track token usage across PRs and fail if budgets are exceeded:

```yaml
- name: Check token budget
  run: |
    waza tokens compare origin/main HEAD \
```

Copilot · 2026-03-05T14:02:00Z

+
+# Windows (PowerShell)
+$ProgressPreference = 'SilentlyContinue'
+Invoke-WebRequest -Uri "https://github.com/microsoft/waza/releases/latest" -OutFile waza.exe


The Windows install example downloads https://github.com/microsoft/waza/releases/latest to waza.exe, which will save the HTML redirect page rather than the actual binary. Prefer the existing install.ps1 flow (as documented in site/src/content/docs/reference/releases.mdx) or link to a specific release asset.

Suggested change

Invoke-WebRequest -Uri "https://github.com/microsoft/waza/releases/latest" -OutFile waza.exe

Invoke-WebRequest -Uri "https://raw.githubusercontent.com/microsoft/waza/main/install.ps1" -OutFile install.ps1

./install.ps1

Copilot · 2026-03-05T14:02:01Z

+
+permissions:
+  contents: read
+  pull-requests: write


This workflow posts PR comments via github.rest.issues.createComment, which requires issues: write permission. Currently only pull-requests: write is granted; update the permissions block (or switch to an API that matches the granted scopes) so the comment step works on PRs.

Suggested change

pull-requests: write

pull-requests: write

issues: write

Copilot · 2026-03-05T14:02:01Z

+              issue_number: context.issue.number,
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              body: `### Eval Results: ${{ matrix.model }}\n\n✅ **Passed:** ${results.passed}\n❌ **Failed:** ${results.failed}`


The PR comment script reads the JSON output and expects results.passed/results.failed, but EvaluationOutcome uses summary.succeeded and summary.failed (and other fields under summary). Adjust the field names (or use waza run --format github-comment to generate a ready-to-post Markdown comment) so the posted counts aren’t undefined.

Suggested change

body: `### Eval Results: ${{ matrix.model }}\n\n✅ **Passed:** ${results.passed}\n❌ **Failed:** ${results.failed}`

body: `### Eval Results: ${{ matrix.model }}\n\n✅ **Passed:** ${results.summary.succeeded}\n❌ **Failed:** ${results.summary.failed}`
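
To make the field-name mismatch concrete, here is a minimal sketch of building the comment body from `summary.succeeded` and `summary.failed`. The function name `buildCommentBody` and the `n/a` fallback are assumptions for illustration; only the `summary.succeeded` / `summary.failed` field names come from the review comment above.

```javascript
// Hypothetical helper that could be called from an actions/github-script step.
// Assumes the result JSON keeps its counts under `summary`, as the review describes.
function buildCommentBody(model, results) {
  const summary = results?.summary ?? {};
  // Render "n/a" rather than "undefined" when a count is missing.
  const show = (value) => (Number.isInteger(value) ? String(value) : 'n/a');
  return [
    `### Eval Results: ${model}`,
    '',
    `✅ **Passed:** ${show(summary.succeeded)}`,
    `❌ **Failed:** ${show(summary.failed)}`,
  ].join('\n');
}

// Made-up result object for illustration only.
console.log(buildCommentBody('gpt-4o', { summary: { succeeded: 3, failed: 1 } }));
```

With the old top-level shape (`results.passed` / `results.failed`), this helper prints `n/a` for both counts, which makes the mismatch visible in the posted comment instead of showing `undefined`.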

Copilot · 2026-03-05T14:02:01Z


-## Troubleshooting
+```bash
+waza tokens check ./evals/my-skill/eval.yaml


waza tokens check validates token limits for markdown files (e.g., SKILL.md or skill directories). Passing an eval spec path like ./evals/my-skill/eval.yaml won’t do what this section suggests. Update the example to point at the skill content you want to budget (for example, the skill directory or SKILL.md).

Suggested change

waza tokens check ./evals/my-skill/eval.yaml

waza tokens check ./evals/my-skill/SKILL.md

Copilot · 2026-03-05T14:02:01Z

+```bash
+waza compare baseline-eval.yaml --model gpt-4o
+```


The waza compare command compares result JSON files (waza compare <result1.json> <result2.json> ...) and doesn’t take --model or an eval YAML path. This section’s example (waza compare baseline-eval.yaml --model ...) won’t work as written; update it to show running evaluations first (producing JSON outputs) and then comparing those outputs.

See below for a potential fix:

Then in CI, run both evaluations and compare their JSON result files:

```bash
# Run baseline model
waza run baseline-eval.yaml --output baseline-results.json

# Run same eval with a different model
waza run baseline-eval.yaml --model gpt-4o --output gpt-4o-results.json

# Compare the two runs
waza compare baseline-results.json gpt-4o-results.json
```

Copilot · 2026-03-05T14:02:02Z

 ---
 title: CI/CD Integration
-description: Automate evaluation runs in GitHub Actions.
+description: Integrate waza into your CI/CD pipeline — GitHub Actions, Azure DevOps, GitLab CI, and more.
 ---


This guide updates guides/ci-cd.mdx, but Issue #89 acceptance criteria references a new page ci-integration.mdx. If the canonical URL is now guides/ci-cd, please update the issue/PR text or add a redirect/alias so readers don’t end up with mismatched links and sidebar entries.

codecov-commenter · 2026-03-05T15:21:17Z

Codecov Report

❌ Patch coverage is 84.55882% with 21 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@3460e25). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
internal/scaffold/writer.go	84.37%	7 Missing and 3 partials ⚠️
cmd/waza/cmd_new.go	83.92%	5 Missing and 4 partials ⚠️
cmd/waza/cmd_init.go	86.66%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main      #97   +/-   ##
=======================================
  Coverage        ?   72.21%           
=======================================
  Files           ?      128           
  Lines           ?    14252           
  Branches        ?        0           
=======================================
  Hits            ?    10292           
  Misses          ?     3196           
  Partials        ?      764

Flag	Coverage Δ
go-implementation	`72.21% <84.55%> (?)`


)

* fix: add regression test and CHANGELOG for waza suggest deadlock fix

  Co-authored-by: spboyer <7681382+spboyer@users.noreply.github.com>

* fix: correct misspelling cancelled→canceled for lint

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: spboyer <7681382+spboyer@users.noreply.github.com>
Co-authored-by: Shayne Boyer <spboyer@live.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…soft#56)

* fix: make site base path configurable + remove unused workflow

  - Make astro.config.mjs read site/base from ASTRO_SITE and ASTRO_BASE env vars with GitHub Pages defaults. Azure SWA can now build with ASTRO_BASE=/ so images resolve correctly at root.
  - Remove unused squad-heartbeat.yml workflow.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: normalize base path and prevent // in nav links

  Address review feedback: normalize ASTRO_BASE to prevent double-slash URLs when base is '/'. Also fix Header.astro to strip trailing slash from BASE_URL before concatenating nav link paths.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: repair broken TestCopilotExecute_StartRespectsTimeout mock setup

  The test was mocking Start() expecting Execute() to call it, but Execute() calls CreateSession() directly; Start() is only called by Initialize(). Fixed to test Initialize() error propagation directly, which is the actual regression guard for the deadlock fix. This test failure was blocking CI on all open PRs.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address review feedback: rename test, use errors.New

  - Rename TestCopilotExecute_StartRespectsTimeout to TestCopilotExecute_InitializePropagatesStartError to match what the test actually asserts
  - Use errors.New instead of fmt.Errorf for constant string (staticcheck S1028)
  - Remove unused fmt import

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ft#48 (microsoft#63)

* feat: add FileWriter service and refactor waza init inventory microsoft#48

  - Create internal/scaffold/writer.go with FileWriter type that encapsulates the create-if-missing + skip-if-exists pattern
  - FileWriter returns structured Inventory with per-entry outcomes (created/skipped)
  - Inventory.Fprint() renders aligned table with emoji indicators: ➕ for created, ✅ (already exists) for skipped
  - Refactor cmd/waza/cmd_init.go to use FileWriter instead of inline write loop
  - Inventory is always visible (not gated behind --verbose)
  - Add 8 tests in writer_test.go covering: create-if-missing, skip-if-exists, mixed outcomes, parent directory creation, inventory output, relative paths, empty content handling, and CreatedCount

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: update Linus history with FileWriter work (microsoft#48)

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: refactor waza new to use shared FileWriter microsoft#58 (microsoft#66)

* feat: refactor waza new to use shared FileWriter microsoft#58

  Replace the inline write loop in cmd_new.go with the shared FileWriter from internal/scaffold/writer.go. Malformed SKILL.md detection still runs before FileWriter; the file is removed so FileWriter creates it fresh. Inventory now uses consistent ➕/✅ emoji indicators (always visible, not gated behind --verbose), matching the waza init behavior.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: update squad state for microsoft#58

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: remove .squad/ files from PR branch

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: remove .squad/ files from PR branch

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: improve error handling in FileWriter stat checks microsoft#48

  - Directory branch: explicitly handle IsNotExist vs other stat errors
  - File branch: detect directory-at-file-path type mismatch
  - Both branches: return errors on permission failures instead of masking
  - Add regression tests for type-mismatch error paths

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: improve error handling in FileWriter stat checks microsoft#48

  - Directory branch: explicitly handle IsNotExist vs other stat errors
  - File branch: detect directory-at-file-path type mismatch
  - Both branches: return errors on permission failures instead of masking
  - Add regression tests for type-mismatch error paths

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: improve error handling in FileWriter stat checks microsoft#48

  - Directory branch: explicitly handle IsNotExist vs other stat errors
  - Directory branch: error when path exists but is not a directory
  - File branch: detect directory-at-file-path type mismatch
  - Both branches: return errors on permission failures instead of masking
  - Add regression tests for type-mismatch error paths

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: gofmt writer_test.go

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Bumps [svgo](https://github.com/svg/svgo) from 4.0.0 to 4.0.1.
- [Release notes](https://github.com/svg/svgo/releases)
- [Commits](svg/svgo@v4.0.0...v4.0.1)

---
updated-dependencies:
- dependency-name: svgo
  dependency-version: 4.0.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

spboyer · 2026-03-05T15:52:14Z

Closing — rebuilding with clean branch (had leaked content from other PRs).

…t#159)

Closes microsoft#97

MockEngine now creates a temp workspace directory and sets `ExecutionResponse.WorkspaceDir`, enabling FileGrader and other workspace-dependent graders to work in test scenarios.

### Changes

- **`workspace.go`**: Extract `setupWorkspaceResources()` as a shared package-level helper with path-traversal protection
- **`mock.go`**: `Execute()` creates a temp dir, populates resources, sets `WorkspaceDir`; `Shutdown()` cleans it up
- **`copilot.go`**: `setupResources()` now delegates to the shared helper (no behaviour change)

### How it works

Same pattern as CopilotEngine: each `Execute()` call creates a fresh `waza-mock-*` temp directory, writes request resources into it, and sets `WorkspaceDir` on the response. `Shutdown()` removes it.

All existing tests pass.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

spboyer and others added 4 commits March 4, 2026 19:30

chore(squad): capture model directive + session log

fad8bf3

docs: add CI/CD integration guide microsoft#89

35e3d73

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

spboyer requested a review from chlowell as a code owner March 5, 2026 13:52

Copilot AI review requested due to automatic review settings March 5, 2026 13:52

spboyer requested a review from richardpark-msft as a code owner March 5, 2026 13:52

spboyer self-assigned this Mar 5, 2026

github-actions Bot enabled auto-merge (squash) March 5, 2026 13:52

Copilot started reviewing on behalf of spboyer March 5, 2026 13:53 View session

Copilot AI reviewed Mar 5, 2026

Copilot AI and others added 6 commits March 5, 2026 10:32

fix: Standardize emoji spacing in waza check display (microsoft#45)

0970399

spboyer force-pushed the squad/89-ci-docs-guide branch from 1bc0b8a to c92a810 Compare March 5, 2026 15:32

spboyer closed this Mar 5, 2026

auto-merge was automatically disabled March 5, 2026 15:52
Pull request was closed

spboyer mentioned this pull request Mar 5, 2026

docs: Add CI/CD integration guide (GitHub Actions, Azure DevOps) #100

Merged

spboyer wants to merge 10 commits into microsoft:main from spboyer:squad/89-ci-docs-guide


5 participants
