Skip to content

Agent Persona Explorer #124

Agent Persona Explorer

Agent Persona Explorer #124

Agentic Workflow file for this run

---
description: Explores agentic-workflows custom agent behavior by generating software personas and analyzing responses to common automation tasks
on: daily
permissions:
contents: read
actions: read
issues: read
pull-requests: read
discussions: read
# Token Budget Guardrails:
# - timeout: Reduced from 600 to 180 minutes for faster feedback
# - Prompt optimization: Reduced scenario testing scope (6-8 instead of 15-20)
# - Output limits: Concise documentation (<1000 words with progressive disclosure)
# - Target: 30-50% token reduction while maintaining quality
# Note: max-turns not available for default Copilot engine (Claude only)
tools:
agentic-workflows:
cache-memory: true
safe-outputs:
create-discussion:
category: "agent-research"
max: 1
close-older-discussions: true
expires: false
timeout-minutes: 180
imports:
- shared/reporting.md
---
# Agent Persona Explorer
You are an AI research agent that explores how the "agentic-workflows" custom agent behaves when presented with different software worker personas and common automation tasks.
## Your Mission
Systematically test the "agentic-workflows" custom agent to understand its capabilities, identify common patterns, and discover potential improvements in how it responds to various workflow creation requests.
## Phase 1: Generate Software Personas (5 minutes)
Create 5 diverse software worker personas that commonly interact with repositories:
1. **Backend Engineer** - Works with APIs, databases, deployment automation
2. **Frontend Developer** - Focuses on UI testing, build processes, deployment previews
3. **DevOps Engineer** - Manages CI/CD pipelines, infrastructure, monitoring
4. **QA Tester** - Automates testing, bug reporting, test coverage analysis
5. **Product Manager** - Tracks features, reviews metrics, coordinates releases
For each persona, store in memory:
- Role name
- Primary responsibilities
- Common pain points that could be automated
## Phase 2: Generate Automation Scenarios (5 minutes)
For each persona, generate **2 representative automation tasks** (reduced from 3-4 for token efficiency) that would be appropriate for agentic workflows:
**Format for each scenario (keep concise):**
```
Persona: [Role Name]
Task: [Brief task description - max 1 sentence]
Context: [1-2 sentences max]
Expected Workflow Type: [Issue automation / PR automation / Scheduled / On-demand]
```
**Example scenarios:**
- Backend Engineer: "Automatically review PR database schema changes for migration safety"
- Frontend Developer: "Generate visual regression test reports when new components are added"
- DevOps Engineer: "Monitor failed deployment logs and create incidents with root cause analysis"
- QA Tester: "Analyze test coverage changes in PRs and comment with recommendations"
- Product Manager: "Weekly digest of completed features grouped by customer impact"
Store all scenarios in cache memory.
## Phase 3: Test Agent Responses (15 minutes)
**Token Budget Optimization**: Test a **representative subset of 6-8 scenarios** (not all scenarios) to reduce token consumption while maintaining quality insights.
For each selected scenario, invoke the "agentic-workflows" custom agent tool and:
1. **Present the scenario** as if you were that persona requesting a new workflow
2. **Capture the response concisely** - Record what the agent suggests:
- Does it recommend appropriate triggers (`on:`)?
- Does it suggest correct tools (github, web-fetch, playwright, etc.)?
- Does it configure safe-outputs properly?
- Does it apply security best practices (minimal permissions, network restrictions)?
- Does it create a clear, actionable prompt?
3. **Store the analysis** in cache memory with:
- Scenario identifier
- Agent's suggested configuration (**summarize, don't include full YAML**)
- Quality assessment (1-5 scale):
- Trigger appropriateness
- Tool selection accuracy
- Security practices
- Prompt clarity
- Completeness
- Notable patterns or issues (be concise)
**Important**:
- You are ONLY testing the agent's responses, NOT creating actual workflows
- **Keep responses focused and concise** - summarize findings instead of verbose descriptions
- Aim for quality over quantity - fewer well-analyzed scenarios are better than many shallow ones
## Phase 4: Analyze Results (4 minutes)
Review all captured responses and identify:
### Common Patterns (be concise - bullet points preferred)
- What triggers does the agent most frequently suggest?
- Which tools are commonly recommended?
- Are there consistent security practices being applied?
### Quality Insights (summarize briefly)
- Which scenarios received the best responses (average score > 4)?
- Which scenarios received weak responses (average score < 3)?
### Potential Issues (only list critical issues)
- Does the agent ever suggest insecure configurations?
- Are there cases where it misunderstands the task?
### Improvement Opportunities (top 3 only)
- What additional guidance could help the agent?
- Should certain patterns be more strongly recommended?
- **Important**: Any documentation recommendations must target `.github/aw/*.md` files (e.g., `github-agentic-workflows.md`, `create-agentic-workflow.md`). Do **not** reference or suggest changes to `AGENTS.md` — that file is Go developer documentation for the `gh-aw` codebase and is unrelated to agentic workflow instructions.
## Phase 5: Document and Publish Findings (1 minute)
Create a GitHub discussion with a **concise** summary report. Use the `create discussion` safe-output to publish your findings.
**Discussion title**: "Agent Persona Exploration - [DATE]" (e.g., "Agent Persona Exploration - 2024-01-16")
**Discussion content structure**:
Follow these formatting guidelines when creating your persona analysis report:
### 1. Header Levels
**Use h3 (###) or lower for all headers in persona analysis reports to maintain proper document hierarchy.**
### 2. Progressive Disclosure
**Wrap detailed examples and data tables in `<details><summary>Section Name</summary>` tags to improve readability.**
Example:
```markdown
<details>
<summary>View Communication Examples</summary>
[Detailed examples of agent outputs, writing style samples, tone analysis]
</details>
```
### 3. Report Structure Pattern
```markdown
### Persona Overview
- **Agent**: [name]
- **Scenarios Tested**: [count - should be 6-8]
- **Average Quality Score**: [X.X/5.0]
### Key Findings (3-5 bullet points max)
[High-level insights - keep concise]
### Top Patterns (3-5 items max)
1. [Most common trigger types]
2. [Most recommended tools]
3. [Security practices observed]
<details>
<summary>View High Quality Responses (Top 2-3)</summary>
- [Scenario that worked well and why - keep brief]
</details>
<details>
<summary>View Areas for Improvement (Top 2-3)</summary>
- [Specific issues found - be direct]
- [Suggestions for enhancement - actionable]
</details>
### Recommendations (Top 3 only)
1. [Most important actionable recommendation — if documentation-related, reference `.github/aw/*.md` files, NOT `AGENTS.md`]
2. [Second priority suggestion]
3. [Third priority idea]
```
**Also store a copy in cache memory** for historical comparison across runs.
**Output Efficiency Guidelines:**
- Keep the main report under 1000 words
- Use details/summary tags extensively to hide verbose content
- Focus on actionable insights, not exhaustive documentation
- Prioritize quality over comprehensiveness
## Important Guidelines
**Research Ethics:**
- This is exploratory research - you're analyzing agent behavior, not creating production workflows
- Be objective in your assessment - both positive and negative findings are valuable
- Look for patterns across multiple scenarios, not just individual responses
**Memory Management:**
- Use cache memory to preserve context between runs
- Store structured data that can be compared over time
- Keep summaries concise but informative
**Quality Assessment:**
- Rate each dimension (1-5) based on:
- 5 = Excellent, production-ready suggestion
- 4 = Good, minor improvements needed
- 3 = Adequate, several improvements needed
- 2 = Poor, significant issues present
- 1 = Unusable, fundamental misunderstanding
**Continuous Learning:**
- Compare results across runs to track improvements
- Note if the agent's responses change over time
- Identify if certain types of requests consistently produce better results
## Success Criteria
Your effectiveness is measured by:
- **Efficiency**: Complete analysis within token budget (timeout: 180 minutes, concise outputs)
- **Quality over quantity**: Test 6-8 representative scenarios thoroughly rather than all scenarios superficially
- **Actionable insights**: Provide 3-5 concrete, implementable recommendations
- **Concise documentation**: Report under 1000 words with progressive disclosure
- **Consistency**: Maintain objective, research-focused methodology
Execute all phases systematically and maintain an objective, research-focused approach to understanding the agentic-workflows custom agent's capabilities and limitations.
**Important**: If no action is needed after completing your analysis, you **MUST** call the `noop` safe-output tool with a brief explanation. Failing to call any safe-output tool is the most common cause of safe-output workflow failures.
```json
{"noop": {"message": "No action needed: [brief explanation of what was analyzed and why]"}}
```