Comparing changes
base repository: microsoft/waza
base: v0.27.0
head repository: microsoft/waza
compare: v0.28.0
- 8 commits
- 28 files changed
- 2 contributors
Commits on Apr 21, 2026
-
fix: CI integration test allows eval failures with mock executor (#210)
The integration test step runs `waza run` with the mock executor, which produces generic output that won't match output_contains expectations. This is expected: the test validates that waza completes without crashing, not that mock evals pass.

Root cause: PR #203 (v0.27.0) wired up evaluateExpectations(), which made output_contains checks actually execute. Before that, these fields were defined but never evaluated, so the integration test passed silently.

Exit code 1 (eval failures) is now allowed. Exit codes >1 (crashes, panics) still fail CI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
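The exit-code policy described above can be sketched as a small predicate. This is only an illustration of the rule in the commit message, not the actual CI step (which is not shown here); the function name `ciStepPasses` is hypothetical.

```go
package main

import "fmt"

// ciStepPasses reports whether the integration step should pass CI for a
// given `waza run` exit code, following the policy in the commit message.
// The function name is hypothetical; the real CI step is not shown here.
func ciStepPasses(exitCode int) bool {
	switch exitCode {
	case 0:
		return true // run completed and all evals passed
	case 1:
		return true // eval failures: expected with the mock executor
	default:
		return false // >1 means a crash or panic (other codes also fail: an assumption)
	}
}

func main() {
	for _, code := range []int{0, 1, 2} {
		fmt.Printf("exit %d -> pass=%v\n", code, ciStepPasses(code))
	}
}
```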
Commit 75b2538
-
fix: use valid Python expression in test fixture assertion (#197)
Replace placeholder assertion with 'len(output) > 0', which is valid Python syntax for the inline script grader eval_wrapper.py.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Commit f9c575c
-
docs: add Quick Start guide to documentation site (#205)
- Create focused 5-minute Quick Start page at site/src/content/docs/quick-start.mdx
- Add installation options (binary, from source, azd extension)
- Include authentication, first skill creation, minimal eval YAML
- Add Mermaid workflow diagram
- Include workflow steps: install → auth → create → write → run → view
- Place Quick Start as first item in sidebar navigation
- Update homepage to prominently link Quick Start guide
- Site builds successfully with 17 pages

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Commit 2c01028
-
fix: audit YAML validation and add TestCase unknown field test (#132) (#206)

Audit findings:
- All primary user config loaders (LoadBenchmarkSpec, LoadTestCase, ParseSpec, ProjectConfig.Load, suggest.validateEvalYAML, jsonrpc eval validate) already use decoder.KnownFields(true), i.e. strict parsing.
- Two yaml.Unmarshal calls in cmd_coverage.go and cmd_check.go are intentional partial parses (they read only a subset of fields); making them strict would break valid eval.yaml files.
- internal/generate, internal/skill, internal/validation use non-strict parsing by design (frontmatter extensibility, schema probing, generic any decode).

Changes:
- Add TestLoadTestCase_UnknownFieldRejected proving bogus fields are rejected by LoadTestCase's KnownFields(true) decoder.
- Remove broken TestLoadTestCase_FollowUpPrompts that referenced non-existent TestStimulus.FollowUps field (prevented package compilation on main).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
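To illustrate what "strict" decoding means in the audit above, here is a minimal sketch. Note the swap: it uses the standard library's `encoding/json` with `DisallowUnknownFields`, not the YAML decoder's `KnownFields(true)` that the commit refers to, so that the example runs without third-party dependencies. The `testCase` type and `decodeStrict` helper are hypothetical and much smaller than waza's real types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// testCase is a hypothetical, trimmed-down stand-in for an eval test case.
type testCase struct {
	Name   string `json:"name"`
	Prompt string `json:"prompt"`
}

// decodeStrict rejects any field that is not declared on testCase.
// DisallowUnknownFields plays the role that KnownFields(true) plays
// for the YAML decoder mentioned in the commit.
func decodeStrict(src string) (testCase, error) {
	var tc testCase
	dec := json.NewDecoder(strings.NewReader(src))
	dec.DisallowUnknownFields()
	err := dec.Decode(&tc)
	return tc, err
}

func main() {
	_, err := decodeStrict(`{"name":"t1","prompt":"hi"}`)
	fmt.Println("known fields only, error:", err)

	_, err = decodeStrict(`{"name":"t1","bogus_field":true}`)
	fmt.Println("unknown field, error:", err)
}
```

A non-strict decode of the second document would silently drop `bogus_field`, which is the behavior the partial parsers in cmd_coverage.go and cmd_check.go rely on.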
Commit 29149f6
-
feat: allow trigger tests to terminate early on skill invocation (#207)
* feat: allow trigger tests to terminate early on skill invocation #188

  Add CancelOnSkillInvocation flag to ExecutionRequest that cancels the execution context as soon as a SkillInvoked event is received. This allows trigger tests to return immediately once the target skill fires, instead of waiting for the agent to complete its full turn.

  Implementation:
  - Add onSkillInvoked callback to SessionEventsCollector
  - Wire up context cancellation in CopilotEngine.Execute when flag is set
  - Trigger runner sets CancelOnSkillInvocation=true on all test prompts
  - Context cancellation from skill invocation is treated as success

  Tests:
  - SessionEventsCollector callback fires on skill invocation
  - CopilotEngine cancels SendAndWait early when skill invoked
  - CopilotEngine completes normally when no skill fires (flag is safe)
  - Trigger runner sets the CancelOnSkillInvocation flag

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address lint and race detector issues

  - Rename cancelledForSkill to canceledForSkill (American spelling)
  - Fix 'cancelled' misspelling in comments
  - Add mutex to capturingEngine to prevent data race in trigger tests

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Commit 956beaa
-
feat: add `waza models` command to list available models (#208)

* docs: add Quick Start guide to documentation site

  - Create focused 5-minute Quick Start page at site/src/content/docs/quick-start.mdx
  - Add installation options (binary, from source, azd extension)
  - Include authentication, first skill creation, minimal eval YAML
  - Add Mermaid workflow diagram
  - Include workflow steps: install → auth → create → write → run → view
  - Place Quick Start as first item in sidebar navigation
  - Update homepage to prominently link Quick Start guide
  - Site builds successfully with 17 pages

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: add `waza models` command to list available models (#141)

  Add a new `waza models` command that queries the Copilot SDK for available models and displays them as a formatted table (or JSON with --json flag). The table shows model ID, name, vision support, and context window size.

  Changes:
  - Add ListModels to CopilotClient interface and copilotClientWrapper
  - Add ListModels method on CopilotEngine
  - Regenerate gomock mocks for both internal and cmd packages
  - Register `models` subcommand in root.go
  - Handle auth errors gracefully ("run copilot login first")
  - Add 7 tests covering table output, JSON, empty list, auth errors, backend errors, and token formatting
  - Update CLI reference docs (site/src/content/docs/reference/cli.mdx)

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: gofmt cmd_models.go and cmd_models_test.go

  Run gofmt to fix formatting issues flagged by CI.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address lint errors in models command

  Add //nolint:errcheck for fmt.Fprintln/Fprintf calls matching existing patterns in cmd_check.go. Run gofmt on both files.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
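As a rough idea of how a table-or-JSON listing like this can be rendered, here is a minimal sketch using only the standard library. Everything in it is an assumption for illustration: the `model` struct, its JSON field names, the column headers, and the token abbreviation rule in `formatTokens` are not taken from the waza source.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"text/tabwriter"
)

// model is a hypothetical shape. The commit only states that the table
// shows model ID, name, vision support, and context window size.
type model struct {
	ID            string `json:"id"`
	Name          string `json:"name"`
	Vision        bool   `json:"vision"`
	ContextWindow int    `json:"context_window"`
}

// formatTokens abbreviates a token count. The exact format is assumed.
func formatTokens(n int) string {
	switch {
	case n >= 1_000_000:
		return fmt.Sprintf("%.1fM", float64(n)/1_000_000)
	case n >= 1_000:
		return fmt.Sprintf("%dK", n/1_000)
	default:
		return fmt.Sprintf("%d", n)
	}
}

// renderModels returns either indented JSON or an aligned text table.
func renderModels(models []model, asJSON bool) (string, error) {
	if asJSON {
		b, err := json.MarshalIndent(models, "", "  ")
		if err != nil {
			return "", err
		}
		return string(b) + "\n", nil
	}
	var sb strings.Builder
	tw := tabwriter.NewWriter(&sb, 0, 4, 2, ' ', 0)
	fmt.Fprintln(tw, "ID\tNAME\tVISION\tCONTEXT")
	for _, m := range models {
		fmt.Fprintf(tw, "%s\t%s\t%t\t%s\n", m.ID, m.Name, m.Vision, formatTokens(m.ContextWindow))
	}
	if err := tw.Flush(); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	models := []model{{ID: "example-model", Name: "Example Model", Vision: true, ContextWindow: 128000}}
	out, _ := renderModels(models, false)
	fmt.Print(out)
}
```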
Commit 47166fa
-
feat: support pre-written follow-up prompts in eval YAML #189 (#209)
Add follow_up_prompts field to TestStimulus for multi-turn eval scenarios. Follow-up prompts reuse the same session and workspace, preserving conversation history and file changes across turns.

Changes:
- Add FollowUps []string to TestStimulus (yaml: follow_up_prompts)
- Add WorkspaceDir to ExecutionRequest for workspace reuse
- Update CopilotEngine to skip setupWorkspace when WorkspaceDir is set
- Update MockEngine to support workspace reuse and SessionID passthrough
- Add executeFollowUps() to orchestration runner with result aggregation
- Add 3 YAML parsing tests and 5 orchestration tests
- Update JSON schema, eval-yaml guide, and schema reference docs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
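The session and workspace reuse described above can be sketched as a simple loop. The types below (`request`, `response`, `engine`, `fakeEngine`) and the function `runWithFollowUps` are hypothetical simplifications; the commit's actual ExecutionRequest, MockEngine, and executeFollowUps() are not shown here and will differ.

```go
package main

import "fmt"

// request and response are hypothetical, simplified shapes.
type request struct {
	Prompt       string
	SessionID    string // empty on the first turn
	WorkspaceDir string // empty means "set up a fresh workspace"
}

type response struct {
	SessionID    string
	WorkspaceDir string
	Output       string
}

type engine interface {
	Execute(req request) (response, error)
}

// runWithFollowUps runs the initial prompt, then each follow-up in order,
// passing the first turn's session ID and workspace to every later turn.
func runWithFollowUps(e engine, prompt string, followUps []string) ([]response, error) {
	first, err := e.Execute(request{Prompt: prompt})
	if err != nil {
		return nil, err
	}
	results := []response{first}
	for i, p := range followUps {
		resp, err := e.Execute(request{
			Prompt:       p,
			SessionID:    first.SessionID,
			WorkspaceDir: first.WorkspaceDir,
		})
		if err != nil {
			return results, fmt.Errorf("follow-up %d: %w", i+1, err)
		}
		results = append(results, resp)
	}
	return results, nil
}

// fakeEngine counts how many times it had to set up a workspace.
type fakeEngine struct{ setups int }

func (f *fakeEngine) Execute(req request) (response, error) {
	dir := req.WorkspaceDir
	if dir == "" { // skip setup when a workspace is supplied
		f.setups++
		dir = fmt.Sprintf("/tmp/ws-%d", f.setups)
	}
	sid := req.SessionID
	if sid == "" {
		sid = "session-1"
	}
	return response{SessionID: sid, WorkspaceDir: dir, Output: "ok: " + req.Prompt}, nil
}

func main() {
	e := &fakeEngine{}
	results, err := runWithFollowUps(e, "create a file", []string{"now rename it"})
	fmt.Println(len(results), e.setups, err)
}
```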
Commit c37f404
-
chore: update CODEOWNERS to single owner (#211)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Commit b1acf61
You can try running this command locally to see the comparison on your machine:
git diff v0.27.0...v0.28.0