---
description: Smoke test workflow that validates Codex engine functionality by reviewing recent PRs twice daily
on:
  schedule: every 12h
  workflow_dispatch:
  pull_request:
    types: [labeled]
    names: [smoke]
  reaction: hooray
  status-comment: true
permissions:
  contents: read
  issues: read
  pull-requests: read
name: Smoke Codex
engine: codex
strict: false
imports:
  - shared/gh.md
  - shared/reporting.md
  - shared/mcp/serena-go.md
  - shared/observability-otlp.md
network:
  allowed:
    - defaults
    - github
    - playwright
tools:
  cache-memory: true
  github:
  playwright:
  edit:
  bash:
    - "*"
  web-fetch:
runtimes:
  go:
    version: "1.25"
safe-outputs:
  allowed-domains: [default-safe-outputs]
  add-comment:
    hide-older-comments: true
    max: 2
  create-issue:
    expires: 2h
    close-older-issues: true
    close-older-key: smoke-codex
    labels: [automation, testing]
  add-labels:
    allowed: [smoke-codex]
  remove-labels:
    allowed: [smoke]
  unassign-from-user:
    allowed: [githubactionagent]
    max: 1
  hide-comment:
  messages:
    footer: "> 🔮 *The oracle has spoken through [{workflow_name}]({run_url})*{effective_tokens_suffix}{history_link}"
    run-started: "🔮 The ancient spirits stir... [{workflow_name}]({run_url}) awakens to divine this {event_type}..."
    run-success: "✨ The prophecy is fulfilled... [{workflow_name}]({run_url}) has completed its mystical journey. The stars align. 🌟"
    run-failure: "🌑 The shadows whisper... [{workflow_name}]({run_url}) {status}. The oracle requires further meditation..."
  actions:
    add-smoked-label:
      uses: actions-ecosystem/action-add-labels@v1.1.3
      description: Add the 'smoked' label to the current pull request
      env:
        GITHUB_TOKEN: ${{ github.token }}
timeout-minutes: 15
checkout:
  fetch-depth: 2
  current: true
---

Smoke Test: Codex Engine Validation

CRITICAL EFFICIENCY REQUIREMENTS:

  • Keep ALL outputs extremely short and concise. Use single-line responses.
  • NO verbose explanations or unnecessary context.
  • Minimize file reading - only read what is absolutely necessary for the task.
  • Use targeted, specific queries - avoid broad searches or large data retrievals.

Test Requirements

  1. GitHub MCP Testing: Use GitHub MCP tools to fetch details of exactly 2 merged pull requests from ${{ github.repository }} (title and number only, no descriptions)
  2. Serena MCP Testing:
    • Use the Serena MCP server tool activate_project to initialize the workspace at ${{ github.workspace }} and verify it succeeds (do NOT use bash to run go commands)
    • After initialization, use the find_symbol tool to search for symbols and verify that at least 3 symbols are found in the results
  3. Playwright Testing: Use the playwright tools to navigate to https://github.com and verify the page title contains "GitHub" (do NOT try to install playwright - use the provided MCP tools)
  4. Web Fetch Testing: Use the web-fetch MCP tool to fetch https://github.com and verify the response contains "GitHub" (do NOT use bash or playwright for this test - use the web-fetch MCP tool directly)
  5. File Writing Testing: Create a test file /tmp/gh-aw/agent/smoke-test-codex-${{ github.run_id }}.txt with content "Smoke test passed for Codex at $(date)" (create the directory if it doesn't exist)
  6. Bash Tool Testing: Execute bash commands to verify file creation was successful (use cat to read the file back)
  7. Build gh-aw: Run GOCACHE=/tmp/go-cache GOMODCACHE=/tmp/go-mod make build to verify the agent can successfully build the gh-aw project (both caches must be set to /tmp because the default cache locations are not writable). If the command fails, mark this test as ❌ and report the failure.
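As a rough illustration of what tests 5 and 6 amount to, the following is a minimal bash sketch, not the agent's actual procedure. It assumes the standard `GITHUB_RUN_ID` environment variable stands in for `${{ github.run_id }}`; the `local` fallback and the variable names `OUT_DIR`, `OUT_FILE`, `CONTENT`, and `RESULT` are hypothetical and exist only so the snippet can run outside Actions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for ${{ github.run_id }}; the "local" fallback is an assumption
# so the sketch can be run outside GitHub Actions.
RUN_ID="${GITHUB_RUN_ID:-local}"
OUT_DIR="/tmp/gh-aw/agent"
OUT_FILE="${OUT_DIR}/smoke-test-codex-${RUN_ID}.txt"

# Test 5: create the directory if it doesn't exist, then write the file.
mkdir -p "$OUT_DIR"
echo "Smoke test passed for Codex at $(date)" > "$OUT_FILE"

# Test 6: read the file back with cat and check the expected prefix.
CONTENT="$(cat "$OUT_FILE")"
case "$CONTENT" in
  "Smoke test passed for Codex at "*) RESULT="PASS" ;;
  *) RESULT="FAIL" ;;
esac
echo "file write/read-back: ${RESULT}"
```

Checking only the fixed prefix avoids depending on the exact `date` output, which varies by locale and time.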

Output

ALWAYS create an issue with a summary of the smoke test run:

  • Title: "Smoke Test: Codex - ${{ github.run_id }}"
  • Body should include:
    • Test results (✅ or ❌ for each test)
    • Overall status: PASS or FAIL
    • Run URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
    • Timestamp
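The fields above could be assembled roughly as follows. This is a hedged sketch only: the issue itself is created through the create-issue safe output, not by this script. It assumes the default Actions environment variables `GITHUB_SERVER_URL`, `GITHUB_REPOSITORY`, and `GITHUB_RUN_ID` mirror the expressions used in the text; the fallback values and the `RESULTS` array are hypothetical placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Default Actions variables; the fallbacks are placeholders for local runs.
SERVER_URL="${GITHUB_SERVER_URL:-https://github.com}"
REPOSITORY="${GITHUB_REPOSITORY:-owner/repo}"
RUN_ID="${GITHUB_RUN_ID:-0}"

# Hypothetical per-test outcomes, one entry per test ("pass" or "fail").
RESULTS=("pass" "pass" "fail")

# Overall status is FAIL as soon as any single test did not pass.
OVERALL="PASS"
for r in "${RESULTS[@]}"; do
  if [ "$r" != "pass" ]; then
    OVERALL="FAIL"
  fi
done

TITLE="Smoke Test: Codex - ${RUN_ID}"
RUN_URL="${SERVER_URL}/${REPOSITORY}/actions/runs/${RUN_ID}"
TIMESTAMP="$(date -u +%Y-%m-%dT%H:%M:%SZ)"

printf '%s\n' "$TITLE" "Overall status: ${OVERALL}" "Run URL: ${RUN_URL}" "Timestamp: ${TIMESTAMP}"
```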

Only if this workflow was triggered by a pull_request event: additionally use the add_comment safe-output tool to add a very brief comment (5-10 lines at most) to the triggering pull request, specifying item_number: ${{ github.event.pull_request.number }} (use this exact number; do NOT search GitHub for a PR). The comment should include:

  • PR titles only (no descriptions)
  • ✅ or ❌ for each test result
  • Overall status: PASS or FAIL

If all tests pass and this workflow was triggered by a pull_request event:

  • Use the add_labels safe-output tool to add the label smoke-codex to the pull request (use item_number: ${{ github.event.pull_request.number }})
  • Use the remove_labels safe-output tool to remove the label smoke from the pull request (use item_number: ${{ github.event.pull_request.number }})
  • Use the unassign_from_user safe-output tool to unassign the user githubactionagent from the pull request (this is a fictitious user used for testing; use item_number: ${{ github.event.pull_request.number }})
  • Use the add_smoked_label safe-output action tool to add the label smoked to the pull request (call it with {"labels": "smoked", "number": "${{ github.event.pull_request.number }}"})

Important: If no action is needed after completing your analysis, you MUST call the noop safe-output tool with a brief explanation. Failing to call any safe-output tool is the most common cause of safe-output workflow failures.

{"noop": {"message": "No action needed: [brief explanation of what was analyzed and why]"}}