
Include mode in output filename to avoid overwriting results #17

Merged
aallan merged 2 commits into main from fix/output-filename-mode
Mar 30, 2026

Conversation

@aallan aallan commented Mar 30, 2026

Owner

full-spec and spec-from-nl were writing to the same JSONL file. Now:

  • vera-bench run --model X → X.jsonl
  • vera-bench run --model X --mode spec-from-nl → X-spec-from-nl.jsonl
  • vera-bench run --model X --language python → X-python.jsonl

Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Refined output filename generation to include mode information in results filenames for non-default configurations, improving file organisation and discoverability.

full-spec (default) produces {model}.jsonl, spec-from-nl produces
{model}-spec-from-nl.jsonl. Language suffix also included when not vera.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov-commenter commented Mar 30, 2026


Codecov Report

❌ Patch coverage is 0% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.27%. Comparing base (908b435) to head (60f60a0).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| vera_bench/cli.py | 0.00% | 6 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #17      +/-   ##
==========================================
- Coverage   65.56%   65.27%   -0.29%     
==========================================
  Files          10       10              
  Lines         909      913       +4     
==========================================
  Hits          596      596              
- Misses        313      317       +4     
| Flag | Coverage Δ |
| --- | --- |
| python | 65.27% <0.00%> (-0.29%) ⬇️ |
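
The percentages in the report above follow from the hit and line counts in the coverage diff. A minimal sketch of that arithmetic, assuming the percentages are truncated (not rounded) to two decimals; that is an inference from the numbers shown, not documented Codecov behaviour, and the helper name is hypothetical:

```python
def coverage_basis_points(hits: int, lines: int) -> int:
    """Coverage in hundredths of a percent, truncated (assumed behaviour)."""
    return hits * 10_000 // lines


# Counts taken from the coverage diff: hits are unchanged, lines grow by 4.
base = coverage_basis_points(596, 909)   # main
head = coverage_basis_points(596, 913)   # this PR

print(f"base  {base / 100:.2f}%")            # base  65.56%
print(f"head  {head / 100:.2f}%")            # head  65.27%
print(f"delta {(head - base) / 100:+.2f}%")  # delta -0.29%
```

Because the new lines are never executed by the test suite, the hit count stays at 596 while the denominator grows, which is why project coverage drops even though no existing test changed.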

coderabbitai Bot commented Mar 30, 2026


Warning

Rate limit exceeded

@aallan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 3 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 4 minutes and 3 seconds.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 43e8b075-9b4b-4356-8e9c-3085139282e5

📥 Commits

Reviewing files that changed from the base of the PR and between c0f642f and 60f60a0.

📒 Files selected for processing (1)
  • vera_bench/cli.py
📝 Walkthrough


Modified the output filename generation logic in the run command to conditionally append both language and mode parameters to the model name, rather than only appending language. The logic now builds a hyphen-joined list of parts and produces filenames like model-language-mode.jsonl depending on parameter values.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Filename Generation Logic (vera_bench/cli.py) | Changed output filename construction from conditional language suffix to conditional language and mode suffix. Builds parts list starting with model, appending language (when not "vera") and mode (when not "full-spec"), then joins with hyphens. |
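
The filename logic summarised above can be sketched as follows. This is an illustration under stated assumptions, not the code in vera_bench/cli.py: the function name `build_output_path` and its signature are hypothetical, while the defaults `"vera"` and `"full-spec"` and the hyphen join come from the walkthrough.

```python
from pathlib import Path


def build_output_path(output_dir: str, model: str,
                      language: str = "vera",
                      mode: str = "full-spec") -> Path:
    """Hypothetical helper mirroring the described parts-list logic."""
    parts = [model]
    if language != "vera":       # the default language adds no suffix
        parts.append(language)
    if mode != "full-spec":      # the default mode adds no suffix
        parts.append(mode)
    return Path(output_dir) / ("-".join(parts) + ".jsonl")


print(build_output_path("results", "X").name)                       # X.jsonl
print(build_output_path("results", "X", mode="spec-from-nl").name)  # X-spec-from-nl.jsonl
print(build_output_path("results", "X", language="python").name)    # X-python.jsonl
```

Leaving the defaults unsuffixed keeps filenames from earlier default runs valid, while any non-default combination gets its own file.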

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested labels

harness

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and clearly describes the primary change: including mode in the output filename to prevent overwriting results. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@vera_bench/cli.py`:
- Around line 143-145: The filename builder currently appends the CLI variable
mode into parts and thus into output_path even when mode is ignored for Python;
update the logic around parts/output_path so that when language == "python" (or
when the code path that warns about mode being ignored) you do not append mode
to parts — i.e., only append mode when it is actually honored (keep references
to the variables mode, parts, output_path, and output_dir to locate the change)
so filenames no longer reflect an ignored mode.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7278d546-3f5a-4ea3-8586-c2d8f826e63e

📥 Commits

Reviewing files that changed from the base of the PR and between 908b435 and c0f642f.

📒 Files selected for processing (1)
  • vera_bench/cli.py

Comment thread vera_bench/cli.py Outdated
Mode is ignored for Python, so it shouldn't appear in the filename.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
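
The follow-up described in the review comment and this commit message amounts to adding the mode suffix only when the mode is actually honored. A hedged sketch of that rule, assuming Python is the only language for which mode is ignored (the thread does not say whether others exist); `result_filename` is a hypothetical name, not a function from vera_bench/cli.py:

```python
def result_filename(model: str, language: str = "vera",
                    mode: str = "full-spec") -> str:
    """Hypothetical sketch: omit the mode suffix when mode is ignored."""
    parts = [model]
    if language != "vera":
        parts.append(language)
    # Assumption: mode has no effect on Python runs, so it must not
    # leak into the filename for them.
    mode_is_honored = language != "python"
    if mode_is_honored and mode != "full-spec":
        parts.append(mode)
    return "-".join(parts) + ".jsonl"


# A Python run no longer reflects an ignored --mode value.
print(result_filename("X", language="python", mode="spec-from-nl"))  # X-python.jsonl
print(result_filename("X", mode="spec-from-nl"))                     # X-spec-from-nl.jsonl
```

Assertions like these could also serve as a starting point for the unit tests that the 0% patch coverage in the Codecov report points to.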
@aallan aallan merged commit 6f03419 into main Mar 30, 2026
8 checks passed
@aallan aallan deleted the fix/output-filename-mode branch March 30, 2026 15:51

2 participants