fix: embedded CLI ignores SessionConfig.Model when user has model in settings.json or experiment flights #262

Description

@sebastienlevert

Summary

The embedded Copilot CLI ignores SessionConfig.Model when the user has a model preference in ~/.copilot/settings.json or is assigned to a Copilot experiment flight (e.g. copilot_cli_opus_1m_default_model). This causes evals to run on unintended models.

Impact

Suppose a user's settings.json contains:

{
  "model": "claude-opus-4.6-1m",
  "effortLevel": "high"
}

Every waza eval task then runs on Opus 4.6 1M with high reasoning effort, regardless of config.model: claude-sonnet-4.5 in eval.yaml. A task that completes in 30 seconds with Sonnet takes 15+ minutes with Opus, and the user gets no indication that the wrong model is in use.
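Until the underlying bug is fixed, a caller could at least surface the settings-level mismatch before starting an eval. The following is a minimal sketch, not part of waza or the SDK: `userSettings` and `settingsOverride` are hypothetical names, the file is assumed to be plain JSON, and only the two fields shown above are read. It cannot detect an experiment flight, since that assignment is made server-side.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// userSettings mirrors only the two settings.json fields shown in this report.
// (Hypothetical type; the real file may contain other fields.)
type userSettings struct {
	Model       string `json:"model"`
	EffortLevel string `json:"effortLevel"`
}

// settingsOverride parses raw settings JSON and reports the configured model
// and whether it is set to something other than the requested model.
func settingsOverride(raw []byte, requested string) (string, bool, error) {
	var s userSettings
	if err := json.Unmarshal(raw, &s); err != nil {
		return "", false, err
	}
	return s.Model, s.Model != "" && s.Model != requested, nil
}

func main() {
	const requested = "claude-sonnet-4.5" // model named in eval.yaml

	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot locate home directory:", err)
		return
	}
	raw, err := os.ReadFile(filepath.Join(home, ".copilot", "settings.json"))
	if err != nil {
		// No readable settings file, so there is no settings-level override to warn about.
		return
	}
	model, conflict, err := settingsOverride(raw, requested)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot parse settings.json:", err)
		return
	}
	if conflict {
		fmt.Fprintf(os.Stderr, "warning: settings.json sets model %q, eval requested %q\n", model, requested)
	}
}
```

A warning like this only addresses the "no indication" part of the problem; it does not change which model the CLI actually uses.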

Root Cause

The embedded CLI (v1.0.46 from SDK v1.0.0-beta.4) resolves its default model at process startup from:

  1. Copilot experiment flights (server-assigned per-account)
  2. ~/.copilot/settings.json model field

The resolved value is recorded as config_model in the CLI's telemetry before any SDK session is created. When waza later creates a session via SessionConfig{Model: "claude-sonnet-4.5"}, the CLI's startup-level model takes precedence and the session-level value is silently ignored.
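The precedence described here can be modeled as a first-non-empty lookup. This is an illustrative sketch only, not the CLI's actual code: `modelSources`, `resolveObserved`, and `resolveExpected` are hypothetical names. The relative order of flights and settings.json is an assumption, as is placing SessionConfig.Model ahead of a CLI argument in the expected order.

```go
package main

import "fmt"

// modelSources is a hypothetical grouping of every place a model ID can come
// from. An empty string means "not set".
type modelSources struct {
	CLIArg        string // --model passed on the CLI command line
	Flight        string // default assigned by a Copilot experiment flight
	Settings      string // "model" field in ~/.copilot/settings.json
	SessionConfig string // SessionConfig.Model supplied by the SDK caller
}

// firstNonEmpty returns the first non-empty candidate, or "" if all are empty.
func firstNonEmpty(candidates ...string) string {
	for _, c := range candidates {
		if c != "" {
			return c
		}
	}
	return ""
}

// resolveObserved models the reported behavior: startup-level sources outrank
// SessionConfig.Model. (Flight-before-settings order is an assumption.)
func resolveObserved(s modelSources) string {
	return firstNonEmpty(s.CLIArg, s.Flight, s.Settings, s.SessionConfig)
}

// resolveExpected models the requested behavior: SessionConfig.Model is
// authoritative for the session. (Its rank relative to CLIArg is an assumption.)
func resolveExpected(s modelSources) string {
	return firstNonEmpty(s.SessionConfig, s.CLIArg, s.Flight, s.Settings)
}

func main() {
	s := modelSources{
		Settings:      "claude-opus-4.6-1m",
		SessionConfig: "claude-sonnet-4.5",
	}
	fmt.Println("observed:", resolveObserved(s)) // observed: claude-opus-4.6-1m
	fmt.Println("expected:", resolveExpected(s)) // expected: claude-sonnet-4.5
}
```

Setting `CLIArg` in this model makes both functions agree whenever it matches the session's model, which is why a `--model` argument masks the problem.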

Workaround

Passing --model <model> via ClientOptions.CLIArgs sets the model at the CLI-argument level, which takes precedence over both settings.json and experiment flights:

copilotOptions := &copilot.ClientOptions{
    // A --model CLI argument outranks settings.json and experiment flights.
    CLIArgs: []string{"--model", defaultModelID},
    // ...
}

Expected Behavior

SessionConfig.Model should be the authoritative model for that session, overriding any user-level defaults from settings.json or experiment flights. The current behavior makes eval results non-reproducible across users with different settings.

Environment

  • copilot-sdk/go v1.0.0-beta.4
  • Embedded CLI v1.0.46
  • Windows 11, Copilot CLI 1.0.51-2
