feat: add per-trial usage to results JSON #277
Merged
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds per-run usage information to the evaluation results JSON so token/request usage can be analyzed at the trial level while keeping the existing aggregate usage summary.
Changes:
- Add `usage` to each `tasks[].runs[]` entry in results JSON (`models.RunResult`).
- Populate per-run `usage` during post-shutdown usage finalization and re-aggregate `summary.usage` from per-run usage.
- Update docs to mention the new run-level `usage` block.
Show a summary per file
| File | Description |
|---|---|
| site/src/content/docs/reference/statistical-fields.mdx | Mentions new per-run usage alongside existing aggregate usage. |
| internal/models/outcome.go | Adds RunResult.Usage field to results JSON model. |
| internal/execution/usage.go | Populates RunResult.Usage from session usage and re-aggregates digest usage totals. |
| internal/execution/usage_test.go | Extends tests to validate per-run usage population and JSON serialization. |
Copilot's findings
- Files reviewed: 4/4 changed files
- Comments generated: 2
5d36a81 to 2d6414b
Closes #272
Summary
- Add a `usage` field to each `tasks[].runs[]` entry in results JSON.
- Keep the existing aggregate `summary.usage` block.

Validation
- `go test ./internal/execution ./internal/orchestration ./cmd/waza`
- `go test ./...`
- `cd site && npm ci && npm run build`

Documentation impact
- Update `site/src/content/docs/reference/statistical-fields.mdx` to document the new run-level usage block.