docs: Add CI/CD integration guide (GitHub Actions, Azure DevOps)#97

Closed
spboyer wants to merge 10 commits into
microsoft:mainfrom
spboyer:squad/89-ci-docs-guide

@spboyer spboyer commented Mar 5, 2026

Member

Closes #89

Adds a comprehensive CI/CD integration guide to the GitHub Pages docs site at site/src/content/docs/guides/ci-cd.mdx, covering:

  • GitHub Actions workflows
  • Azure DevOps pipelines
  • GitLab CI examples
  • Best practices for waza in CI

spboyer and others added 4 commits March 4, 2026 19:30
- Add decision entry (2026-03-05): Token Diff Distribution Strategy (Issue microsoft#81)
  - Approved: CLI-first architecture with optional thin wrapper action
  - Implementation: `waza tokens diff` command + `.github/actions/token-diff`
  - Rationale: Serves all users (GitHub, non-GitHub CI, azd) without vendor lock-in

- Add session log: 2026-03-05T00-26-rusty-token-diff-design.md
  - Outcome: CLI + wrapper recommendation
  - Key design choices documented
  - Related Issue microsoft#81

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Merge 3 inbox decisions into decisions.md; delete inbox files
- Write orchestration log for 4 waves (triage, CI guide, PR gen, verification)
- Create session log documenting full pipeline activation across 7 agents
- Captured: PR reviews, issue triage, multi-platform CI/CD guide, 5 new feature PRs
- Status: 3 PRs approved, 1 rebased, 2 follow-ups in flight; no blockers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@spboyer spboyer requested a review from chlowell as a code owner March 5, 2026 13:52
Copilot AI review requested due to automatic review settings March 5, 2026 13:52
@spboyer spboyer self-assigned this Mar 5, 2026
@github-actions github-actions Bot enabled auto-merge (squash) March 5, 2026 13:52

Copilot AI left a comment

Contributor

Pull request overview

Expands the docs site’s CI/CD guidance into a multi-platform “CI/CD Integration” guide, aiming to show how to install and run waza in common pipeline systems (GitHub Actions, Azure DevOps, GitLab CI) and capture best practices/troubleshooting.

Changes:

  • Rewrites ci-cd.mdx into a comprehensive CI/CD integration guide with examples for GitHub Actions, Azure DevOps, and GitLab CI.
  • Updates the site lockfile (minor metadata changes).
  • Adds/updates multiple .squad/ session and orchestration logs documenting the parallel-work pipeline.

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| site/src/content/docs/guides/ci-cd.mdx | Major expansion of CI/CD documentation (install methods, platform examples, best practices, troubleshooting). |
| site/package-lock.json | Lockfile metadata updates (adds peer: true flags to some entries). |
| .squad/orchestration-log/2026-03-05T00-37-pipeline-wave-4.md | New orchestration log entry for wave 4. |
| .squad/orchestration-log/2026-03-05T00-37-pipeline-wave-3.md | New orchestration log entry for wave 3. |
| .squad/orchestration-log/2026-03-05T00-37-pipeline-wave-2.md | New orchestration log entry for wave 2. |
| .squad/orchestration-log/2026-03-05T00-37-pipeline-wave-1.md | New orchestration log entry for wave 1. |
| .squad/log/2026-03-05T00-36-issue-assignment-pipeline.md | New session log capturing the assignment pipeline directive. |
| .squad/log/2026-03-05T00-36-full-pipeline-activation.md | New session log summarizing the full pipeline activation. |
| .squad/log/2026-03-05T00-26-rusty-token-diff-design.md | New session note documenting token-diff distribution strategy. |
| .squad/decisions.md | Appends multiple decisions and summaries related to the pipeline and docs work. |
| .squad/agents/rusty/history.md | Adds a new “Batch PR Review & Issue Triage” history entry (currently dated in the future). |
| .squad/agents/livingston/history.md | Adds a completion summary for Issue #89 and notes about doc structure/patterns. |
| .squad/agents/linus/history.md | Adds notes about Windows CRLF stability and a test-variant reminder. |
Files not reviewed (1)
  • site/package-lock.json: Language not supported
Comments suppressed due to low confidence (2)

site/src/content/docs/guides/ci-cd.mdx:265

  • This Azure DevOps snippet assumes waza tokens diff --ref1/--ref2 and parses .exceeded from the JSON output. The implemented command is waza tokens compare [ref1] [ref2], and its JSON report nests budget info under summary.exceededCount (no top-level exceeded boolean). Update the command and jq query (or rely on --strict exit code) so the pipeline gates correctly.
- script: |
    waza tokens diff \
      --ref1 origin/main \
      --ref2 HEAD \
      --format json > token-diff.json
    
    EXCEEDED=$(jq '.exceeded' token-diff.json)
    if [ "$EXCEEDED" == "true" ]; then
      echo "##vso[task.logissue type=error;]Token budget exceeded"
      exit 1
    fi

site/src/content/docs/guides/ci-cd.mdx:6

  • Aside is imported here but never used in the document. Remove the unused import or add the intended <Aside ...> blocks so the MDX stays tidy and avoids unused-import warnings in tooling.
import { Tabs, TabItem, Aside } from '@astrojs/starlight/components';
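
The first suppressed comment above notes that the JSON report nests budget information under `summary.exceededCount` rather than exposing a top-level `exceeded` boolean. A minimal sketch of gating on that field, written in JavaScript to match the guide's script steps. Only `summary.exceededCount` comes from the review comment; the rest of the report shape is an assumption, and `budgetExceeded` is a hypothetical helper name, not part of waza.

```javascript
// Hypothetical helper: decide whether a token report should fail the build.
// Only `summary.exceededCount` is taken from the review comment; any other
// detail of the report shape is assumed for illustration.
function budgetExceeded(report) {
  const count = report?.summary?.exceededCount;
  if (typeof count !== 'number') {
    // Fail loudly instead of silently passing when the field is missing.
    throw new Error('report.summary.exceededCount is missing or not a number');
  }
  return count > 0;
}

// Example with an illustrative (made-up) report object.
const sampleReport = { summary: { exceededCount: 2 } };
console.log(budgetExceeded(sampleReport)); // true
```

Relying on the `--strict` exit code, as the comment also suggests, avoids parsing the JSON at all.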

Comment on lines +129 to +136
Use `waza tokens diff` to track token usage across PRs and fail if budgets are exceeded:

```yaml
strategy:
  matrix:
    model:
      - gpt-4o
      - claude-sonnet-4.6
      - claude-opus-4
  max-parallel: 3

steps:
  - name: Run evals for ${{ matrix.model }}
    run: waza run --model "${{ matrix.model }}" -o "results-${{ matrix.model }}.json"
  - name: Check token budget
    run: |
      waza tokens diff origin/main HEAD \
        --format table \
        --strict
```

Copilot AI Mar 5, 2026

The CLI in this repo currently provides waza tokens compare [ref1] [ref2] (with --format and --strict), not waza tokens diff. Also, tokens compare takes refs as positional args (no --ref1/--ref2). Update these examples to match the actual command so they’re copy/pasteable.

This issue also appears on line 255 of the same file.

See below for a potential fix:

Use `waza tokens compare` to track token usage across PRs and fail if budgets are exceeded:

```yaml
- name: Check token budget
  run: |
    waza tokens compare origin/main HEAD \

# Windows (PowerShell)
$ProgressPreference = 'SilentlyContinue'
Invoke-WebRequest -Uri "https://github.com/microsoft/waza/releases/latest" -OutFile waza.exe

Copilot AI Mar 5, 2026

The Windows install example downloads https://github.com/microsoft/waza/releases/latest to waza.exe, which will save the HTML redirect page rather than the actual binary. Prefer the existing install.ps1 flow (as documented in site/src/content/docs/reference/releases.mdx) or link to a specific release asset.

Suggested change
- Invoke-WebRequest -Uri "https://github.com/microsoft/waza/releases/latest" -OutFile waza.exe
+ Invoke-WebRequest -Uri "https://raw.githubusercontent.com/microsoft/waza/main/install.ps1" -OutFile install.ps1
+ ./install.ps1

permissions:
  contents: read
  pull-requests: write

Copilot AI Mar 5, 2026

This workflow posts PR comments via github.rest.issues.createComment, which requires issues: write permission. Currently only pull-requests: write is granted; update the permissions block (or switch to an API that matches the granted scopes) so the comment step works on PRs.

Suggested change
- pull-requests: write
+ pull-requests: write
+ issues: write

issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: `### Eval Results: ${{ matrix.model }}\n\n✅ **Passed:** ${results.passed}\n❌ **Failed:** ${results.failed}`

Copilot AI Mar 5, 2026

The PR comment script reads the JSON output and expects results.passed/results.failed, but EvaluationOutcome uses summary.succeeded and summary.failed (and other fields under summary). Adjust the field names (or use waza run --format github-comment to generate a ready-to-post Markdown comment) so the posted counts aren’t undefined.

Suggested change
- body: `### Eval Results: ${{ matrix.model }}\n\n✅ **Passed:** ${results.passed}\n❌ **Failed:** ${results.failed}`
+ body: `### Eval Results: ${{ matrix.model }}\n\n✅ **Passed:** ${results.summary.succeeded}\n❌ **Failed:** ${results.summary.failed}`
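
The field names matter because a missing property interpolates as `undefined` in a template literal instead of raising an error. A minimal sketch of building the comment body defensively, assuming the result JSON exposes `summary.succeeded` and `summary.failed` as the comment states. `formatEvalComment` is a hypothetical helper, not part of waza or the workflow under review.

```javascript
// Hypothetical helper: build the Markdown body for a PR comment.
// Assumes results.summary.succeeded and results.summary.failed are numbers,
// per the review comment; throws rather than posting "undefined" counts.
function formatEvalComment(model, results) {
  const summary = results?.summary;
  if (
    !summary ||
    typeof summary.succeeded !== 'number' ||
    typeof summary.failed !== 'number'
  ) {
    throw new Error('results.summary.succeeded/failed not found');
  }
  return [
    `### Eval Results: ${model}`,
    '',
    `✅ **Passed:** ${summary.succeeded}`,
    `❌ **Failed:** ${summary.failed}`,
  ].join('\n');
}

// Illustrative input only; real result files contain more fields.
console.log(formatEvalComment('gpt-4o', { summary: { succeeded: 5, failed: 1 } }));
```

Passing an object that uses the old `passed`/`failed` names makes the helper throw, so the mismatch surfaces in the workflow log rather than in a posted comment.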

## Troubleshooting
```bash
waza tokens check ./evals/my-skill/eval.yaml

Copilot AI Mar 5, 2026

waza tokens check validates token limits for markdown files (e.g., SKILL.md or skill directories). Passing an eval spec path like ./evals/my-skill/eval.yaml won’t do what this section suggests. Update the example to point at the skill content you want to budget (for example, the skill directory or SKILL.md).

Suggested change
- waza tokens check ./evals/my-skill/eval.yaml
+ waza tokens check ./evals/my-skill/SKILL.md

Comment on lines +488 to +490
```bash
waza compare baseline-eval.yaml --model gpt-4o
```

Copilot AI Mar 5, 2026

The waza compare command compares result JSON files (waza compare <result1.json> <result2.json> ...) and doesn’t take --model or an eval YAML path. This section’s example (waza compare baseline-eval.yaml --model ...) won’t work as written; update it to show running evaluations first (producing JSON outputs) and then comparing those outputs.

See below for a potential fix:

Then in CI, run both evaluations and compare their JSON result files:

```bash
# Run baseline model
waza run baseline-eval.yaml --output baseline-results.json

# Run same eval with a different model
waza run baseline-eval.yaml --model gpt-4o --output gpt-4o-results.json

# Compare the two runs
waza compare baseline-results.json gpt-4o-results.json
```

Comment on lines 1 to 4
---
title: CI/CD Integration
description: Automate evaluation runs in GitHub Actions.
description: Integrate waza into your CI/CD pipeline — GitHub Actions, Azure DevOps, GitLab CI, and more.
---

Copilot AI Mar 5, 2026

This guide updates guides/ci-cd.mdx, but Issue #89 acceptance criteria references a new page ci-integration.mdx. If the canonical URL is now guides/ci-cd, please update the issue/PR text or add a redirect/alias so readers don’t end up with mismatched links and sidebar entries.

@codecov-commenter

codecov-commenter commented Mar 5, 2026

Codecov Report

❌ Patch coverage is 84.55882% with 21 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@3460e25). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/scaffold/writer.go | 84.37% | 7 Missing and 3 partials ⚠️ |
| cmd/waza/cmd_new.go | 83.92% | 5 Missing and 4 partials ⚠️ |
| cmd/waza/cmd_init.go | 86.66% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main      #97   +/-   ##
=======================================
  Coverage        ?   72.21%           
=======================================
  Files           ?      128           
  Lines           ?    14252           
  Branches        ?        0           
=======================================
  Hits            ?    10292           
  Misses          ?     3196           
  Partials        ?      764           
Flag Coverage Δ
go-implementation 72.21% <84.55%> (?)


Copilot AI and others added 6 commits March 5, 2026 10:32
)

* fix: add regression test and CHANGELOG for waza suggest deadlock fix

Co-authored-by: spboyer <7681382+spboyer@users.noreply.github.com>

* fix: correct misspelling cancelled→canceled for lint

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: spboyer <7681382+spboyer@users.noreply.github.com>
Co-authored-by: Shayne Boyer <spboyer@live.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…soft#56)

* fix: make site base path configurable + remove unused workflow

- Make astro.config.mjs read site/base from ASTRO_SITE and ASTRO_BASE
  env vars with GitHub Pages defaults. Azure SWA can now build with
  ASTRO_BASE=/ so images resolve correctly at root.
- Remove unused squad-heartbeat.yml workflow.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: normalize base path and prevent // in nav links

Address review feedback: normalize ASTRO_BASE to prevent double-slash
URLs when base is '/'. Also fix Header.astro to strip trailing slash
from BASE_URL before concatenating nav link paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
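
The double-slash problem described in this commit can be sketched as follows. This illustrates the idea only and is not the code in `astro.config.mjs` or `Header.astro`; `normalizeBase` and `navHref` are hypothetical names, and the `/docs` base in the example is a made-up value.

```javascript
// Normalize a base path so it always starts with "/" and never ends with
// one, except for the root itself, which is returned as "/".
function normalizeBase(base) {
  const trimmed = (base ?? '').trim().replace(/^\/+|\/+$/g, '');
  return trimmed === '' ? '/' : `/${trimmed}`;
}

// Join a base URL and a nav path without producing "//".
function navHref(baseUrl, path) {
  const base = baseUrl.replace(/\/+$/, ''); // strip trailing slash(es)
  const rest = path.replace(/^\/+/, '');    // strip leading slash(es)
  return `${base}/${rest}`;
}

console.log(normalizeBase('/'));                 // "/"
console.log(normalizeBase('docs/'));             // "/docs"
console.log(navHref('/', 'guides/ci-cd'));       // "/guides/ci-cd"
console.log(navHref('/docs/', '/guides/ci-cd')); // "/docs/guides/ci-cd"
```

Stripping the trailing slash before concatenation is what keeps a root base of `/` from yielding links such as `//guides/ci-cd`.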

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: repair broken TestCopilotExecute_StartRespectsTimeout mock setup

The test was mocking Start() expecting Execute() to call it, but
Execute() calls CreateSession() directly — Start() is only called
by Initialize(). Fixed to test Initialize() error propagation
directly, which is the actual regression guard for the deadlock fix.

This test failure was blocking CI on all open PRs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address review feedback — rename test, use errors.New

- Rename TestCopilotExecute_StartRespectsTimeout to
  TestCopilotExecute_InitializePropagatesStartError to match
  what the test actually asserts
- Use errors.New instead of fmt.Errorf for constant string (staticcheck S1028)
- Remove unused fmt import

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ft#48 (microsoft#63)

* feat: add FileWriter service and refactor waza init inventory microsoft#48

- Create internal/scaffold/writer.go with FileWriter type that encapsulates
  the create-if-missing + skip-if-exists pattern
- FileWriter returns structured Inventory with per-entry outcomes (created/skipped)
- Inventory.Fprint() renders aligned table with emoji indicators:
  ➕ for created, ✅ (already exists) for skipped
- Refactor cmd/waza/cmd_init.go to use FileWriter instead of inline write loop
- Inventory is always visible (not gated behind --verbose)
- Add 8 tests in writer_test.go covering: create-if-missing, skip-if-exists,
  mixed outcomes, parent directory creation, inventory output, relative paths,
  empty content handling, and CreatedCount

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
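
The create-if-missing and skip-if-exists pattern from this commit can be illustrated without touching a disk. The sketch below is JavaScript rather than the Go in `internal/scaffold/writer.go`, uses an in-memory `Map` as a stand-in for the filesystem, and every name in it (`writeAll`, `createdCount`, `formatInventory`) is hypothetical. The column layout is also an assumption; only the ➕ and ✅ (already exists) indicators come from the commit message.

```javascript
// In-memory stand-in for a filesystem: path -> content.
// Writes each entry only if its path is absent and records the outcome.
function writeAll(files, entries) {
  const inventory = [];
  for (const { path, content } of entries) {
    if (files.has(path)) {
      inventory.push({ path, outcome: 'skipped' });
    } else {
      files.set(path, content);
      inventory.push({ path, outcome: 'created' });
    }
  }
  return inventory;
}

// Count how many entries were actually created.
function createdCount(inventory) {
  return inventory.filter((e) => e.outcome === 'created').length;
}

// Render one aligned line per entry with an outcome indicator.
function formatInventory(inventory) {
  const width = Math.max(0, ...inventory.map((e) => e.path.length));
  return inventory
    .map((e) => {
      const mark = e.outcome === 'created' ? '➕' : '✅ (already exists)';
      return `${e.path.padEnd(width)}  ${mark}`;
    })
    .join('\n');
}

const files = new Map([['SKILL.md', '# existing']]);
const inventory = writeAll(files, [
  { path: 'SKILL.md', content: '# new' },
  { path: 'evals/eval.yaml', content: '' },
]);
console.log(formatInventory(inventory));
console.log(createdCount(inventory)); // 1
```

The existing `SKILL.md` keeps its original content, and the empty-content entry is still created, mirroring two of the cases the commit's tests are said to cover.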

* docs: update Linus history with FileWriter work (microsoft#48)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: refactor waza new to use shared FileWriter microsoft#58 (microsoft#66)

* feat: refactor waza new to use shared FileWriter microsoft#58

Replace the inline write loop in cmd_new.go with the shared FileWriter
from internal/scaffold/writer.go. Malformed SKILL.md detection still runs
before FileWriter — the file is removed so FileWriter creates it fresh.

Inventory now uses consistent ➕/✅ emoji indicators (always visible,
not gated behind --verbose), matching the waza init behavior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: update squad state for microsoft#58

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: remove .squad/ files from PR branch

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: remove .squad/ files from PR branch

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: improve error handling in FileWriter stat checks microsoft#48

- Directory branch: explicitly handle IsNotExist vs other stat errors
- File branch: detect directory-at-file-path type mismatch
- Both branches: return errors on permission failures instead of masking
- Add regression tests for type-mismatch error paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: improve error handling in FileWriter stat checks microsoft#48

- Directory branch: explicitly handle IsNotExist vs other stat errors
- File branch: detect directory-at-file-path type mismatch
- Both branches: return errors on permission failures instead of masking
- Add regression tests for type-mismatch error paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: improve error handling in FileWriter stat checks microsoft#48

- Directory branch: explicitly handle IsNotExist vs other stat errors
- Directory branch: error when path exists but is not a directory
- File branch: detect directory-at-file-path type mismatch
- Both branches: return errors on permission failures instead of masking
- Add regression tests for type-mismatch error paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: gofmt writer_test.go

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Bumps [svgo](https://github.com/svg/svgo) from 4.0.0 to 4.0.1.
- [Release notes](https://github.com/svg/svgo/releases)
- [Commits](svg/svgo@v4.0.0...v4.0.1)

---
updated-dependencies:
- dependency-name: svgo
  dependency-version: 4.0.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@spboyer spboyer force-pushed the squad/89-ci-docs-guide branch from 1bc0b8a to c92a810 Compare March 5, 2026 15:32
@spboyer

spboyer commented Mar 5, 2026

Member Author

Closing — rebuilding with clean branch (had leaked content from other PRs).

@spboyer spboyer closed this Mar 5, 2026
auto-merge was automatically disabled March 5, 2026 15:52

Pull request was closed

chlowell pushed a commit to chlowell/waza that referenced this pull request Mar 5, 2026
…t#159)

Closes microsoft#97

MockEngine now creates a temp workspace directory and sets
`ExecutionResponse.WorkspaceDir`, enabling FileGrader and other
workspace-dependent graders to work in test scenarios.

### Changes
- **`workspace.go`** — Extract `setupWorkspaceResources()` as a shared
package-level helper with path-traversal protection
- **`mock.go`** — `Execute()` creates a temp dir, populates resources,
sets `WorkspaceDir`; `Shutdown()` cleans it up
- **`copilot.go`** — `setupResources()` now delegates to the shared
helper (no behaviour change)

### How it works
Same pattern as CopilotEngine: each `Execute()` call creates a fresh
`waza-mock-*` temp directory, writes request resources into it, and sets
`WorkspaceDir` on the response. `Shutdown()` removes it.
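
Path-traversal protection for workspace resources generally means rejecting any resource path that would resolve outside the workspace directory. A minimal sketch of such a check, in JavaScript rather than Go and not taken from `workspace.go`. `safeResourcePath` is a hypothetical name, and the rules it enforces (no absolute paths, no `..` segment that climbs above the root) are assumptions about what a helper of this kind would do.

```javascript
// Hypothetical check: return a cleaned, workspace-relative path or throw.
// Simplified on purpose: it treats "/" and "\" as separators and does not
// consider symlinks.
function safeResourcePath(relPath) {
  if (/^([\/\\]|[A-Za-z]:)/.test(relPath)) {
    throw new Error(`absolute path not allowed: ${relPath}`);
  }
  const parts = [];
  for (const seg of relPath.split(/[\/\\]+/)) {
    if (seg === '' || seg === '.') continue;
    if (seg === '..') {
      // Climbing above the workspace root is the traversal we reject.
      if (parts.length === 0) {
        throw new Error(`path escapes workspace: ${relPath}`);
      }
      parts.pop();
    } else {
      parts.push(seg);
    }
  }
  if (parts.length === 0) throw new Error(`empty path: ${relPath}`);
  return parts.join('/');
}

console.log(safeResourcePath('src/./a/../main.go')); // "src/main.go"
```

A caller would join the returned value onto the temp directory only after this check passes.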

All existing tests pass.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

docs: Add CI/CD integration guide (GitHub Actions, Azure DevOps)

5 participants