Add CLI evaluation and testing framework for LLM and human usability #7608

## Problem

We have no visibility into how GitHub Copilot CLI interacts with `azd`, and no test coverage that measures LLM interactions, command discoverability, or human usability patterns.

## Proposal

Add a comprehensive evaluation and testing framework at `cli/azd/test/eval/` covering:
- LLM evals (how well an AI agent uses `azd`)
- Non-LLM unit tests (how well `azd` surfaces information for human and AI consumption)

See PR #7202 for implementation.
