Add CLI evaluation and testing framework for LLM and human usability #7608

Description

@spboyer

Problem

We have no visibility into how GitHub Copilot CLI interacts with azd. No existing tests measure LLM interactions, command discoverability, or human usability patterns.

Proposal

Add a comprehensive evaluation and testing framework at cli/azd/test/eval/ covering:

  • LLM eval (how well an AI agent uses azd)
  • Non-LLM unit tests (how well azd surfaces information for human and AI consumption)

See PR #7202 for implementation.
