feat(skills): dogfood skill for agent-driven exploratory QA by mehmetkr-31 · Pull Request #321 · NousResearch/hermes-agent

mehmetkr-31 · 2026-03-03T13:23:32Z

Summary
Resolves #315 (Feature: Dogfood Skill — Agent-Driven Exploratory QA for Web Applications)

Problem:
Hermes currently lacks a formalized, built-in skill for structured Quality Assurance (QA) exploration. Without a standardized approach, exploratory web testing relies heavily on ad-hoc prompts, which leads to inconsistent bug reports, unclassified severity levels, and evidence that cannot be reproduced.

Implementation:
This PR introduces the new dogfood skill in the skills directory, directly adapted from the agent-browser methodologies proposed by Vercel Labs. It formalizes Hermes as an automated, exploratory QA tester capable of producing structured Markdown bug reports.

Key additions:

- SKILL.md: The core instruction set teaching the 5-phase dogfood workflow (Planning, Exploration, Evidence Collection, Categorization, and Report Generation).
- issue-taxonomy.md: A strict mapping file ensuring the agent standardizes findings across specific Severities (Critical/High/Medium/Low) and Categories (Functional/Visual/Accessibility/UX/Console/Content).
- dogfood-report-template.md: A predefined Markdown template ensuring every reported issue includes required repro steps and screenshot paths for human review.
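
For illustration only, the taxonomy and report entry described above could be modeled roughly as follows. This is a hypothetical Python sketch, not code from the PR (the skill itself ships only Markdown files); `Finding` and `to_markdown` are invented names, and only the severity and category values are taken from the description.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


class Category(Enum):
    FUNCTIONAL = "Functional"
    VISUAL = "Visual"
    ACCESSIBILITY = "Accessibility"
    UX = "UX"
    CONSOLE = "Console"
    CONTENT = "Content"


@dataclass
class Finding:
    """Hypothetical record for one issue in a dogfood report."""
    title: str
    severity: Severity
    category: Category
    repro_steps: List[str]
    screenshot_paths: List[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        # The template requires repro steps, so refuse to render without them.
        if not self.repro_steps:
            raise ValueError("every finding needs at least one repro step")
        lines = [
            f"### [{self.severity.value}] {self.title}",
            f"Category: {self.category.value}",
            "",
            "Steps to reproduce:",
        ]
        lines += [f"{i}. {step}" for i, step in enumerate(self.repro_steps, 1)]
        lines += [f"Screenshot: {path}" for path in self.screenshot_paths]
        return "\n".join(lines)
```

Using enums rather than free-form strings means a misspelled severity or category fails immediately instead of ending up in a report.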

Because the skill is implemented purely as text (Markdown files), it carries no risk of breaking the Python runtime while expanding Hermes's capabilities as an integrated web tester.

teknium1 · 2026-03-09T02:58:30Z

Closing this — the skill needs more fleshing out before it can be merged. The current version only references the built-in browser_* tools, but the real value of the dogfood skill is leveraging agent-browser's richer capabilities (console error capture, annotated screenshots, video recording) that aren't yet exposed as Hermes tools.

We're putting together a more detailed plan that addresses:

- How video recordings are stored and who they're for
- How screenshot annotation works and what it annotates
- Console error vs. command execution distinction
- Whether these should be new browser tools rather than terminal workarounds (likely yes, since they need to connect to the live browser session)

Will reopen or create a new PR once the plan is finalized. Thanks for the contribution.

andrueandersoncs · 2026-04-01T07:00:35Z

✅ Completed: Profile Form Progressive Disclosure

Implemented accordion-based progressive disclosure for the profile form to reduce cognitive overload and create a more guided onboarding experience.

Changes Made

1. Accordion Section Structure

Converted 3 static sections (Identity, Body, Lifestyle) to collapsible accordion items
Each section has an accordion trigger with:
- Section number and title
- Completion status badge (green checkmark or "X required" indicator)
- Chevron icon for expand/collapse

2. Quick Start Mode (First-Run)

Added toggle between "Quick start" and "Full profile" modes
Quick start shows only critical fields (Name, Email, Weight, Activity Level)
Optional fields (Birth date, Body fat %, Sex, Equipment, Dietary) hidden behind expandable sections
Toggle button allows users to expand all sections when ready

3. Progress Feedback

Section-level progress badges show completion status:
- ✓ Green checkmark for complete sections
- "2 required" or "1 of 2 done" for incomplete sections
- "Optional" badge for Lifestyle section
Critical path celebration banner appears when Name, Email, and Weight are filled

4. Smart Default Expansion

First incomplete section is automatically expanded on load
Returning users see all sections expanded
Users can manually expand/collapse any section

Acceptance Criteria Status

Implement progressive disclosure (collapsed sections)
Clear indication of required vs optional fields via badges
Progress feedback at section level (checkmarks/counters)
"Quick start" option to get to first plan faster
Celebration/reward for completing critical path (green banner with animation)

Technical Notes

Uses @base-ui/react Accordion component
All 537 tests pass (profile-form, profile-screen, profile-page tests)
One pre-existing Today page test failure unrelated to these changes
No new lint errors introduced

Verification

npm test - all profile-related tests pass
npm run build - successful compilation
npm run lint - no new errors

Deployed to Railway: https://vantage-production-b8d9.up.railway.app

shizhewanglu

Code Review Summary

Verdict: Changes Requested (3 critical security issues, 2 code quality issues)

🔴 Critical

skills/dogfood/SKILL.md:58 — The skill instructs the agent to visit arbitrary URLs without validating scheme or domain allowlist. A malicious actor could craft prompts that redirect the agent to phishing pages or internal infrastructure. Implement URL scheme validation (only https://) and optionally a domain allowlist before any browser navigation.
skills/dogfood/SKILL.md:112 — The evidence collection phase auto-downloads and saves files referenced in pages without MIME type verification. This allows binary executable payloads (e.g., .exe, .sh, .dmg) to be written to disk, creating a remote code execution surface if the user later executes those files. Add MIME type allowlisting before write.
skills/dogfood/references/issue-taxonomy.md:34 — The severity-to-category mapping includes "Console" errors but does not exclude credentials or tokens that may appear in browser console output. PII/secrets in bug reports could be exposed to anyone with repo read access. Add an explicit redaction step for Authorization, Cookie, token, api_key, and similar patterns before any report is finalized.
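
The first two items above both come down to allowlist checks. A minimal sketch of what such checks might look like is below. It assumes a Python helper sitting in front of the browser tools, which the PR does not contain; the function names and the example allowlist values are hypothetical.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}
# Hypothetical allowlists; a real deployment would configure its own.
ALLOWED_DOMAINS = {"example.com"}
ALLOWED_MIME_TYPES = {"image/png", "image/jpeg", "text/plain"}


def is_navigation_allowed(url: str, allowed_domains=ALLOWED_DOMAINS) -> bool:
    """Allow only https URLs whose host is an allowlisted domain or subdomain."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example, a broken IPv6 literal) are rejected.
        return False
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    host = (parts.hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def is_download_allowed(content_type: str) -> bool:
    """Check a declared Content-Type value against the MIME allowlist."""
    # Strip parameters such as "; charset=utf-8" before comparing.
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime in ALLOWED_MIME_TYPES
```

A declared Content-Type can be spoofed by the server, so a check like `is_download_allowed` is a first filter rather than proof that a file is safe.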

⚠️ Code Quality

skills/dogfood/templates/dogfood-report-template.md:12 — The repro steps template lacks a required "Expected Behavior" section. Without it, developers cannot distinguish a bug from a misfeature, leading to wasted triage time.
skills/dogfood/SKILL.md:89 — The 5-phase workflow has no explicit error-handling section. If any phase fails (e.g., page fails to load, selector not found), the skill does not specify recovery behavior. A silent failure in phase 2 (Exploration) could result in an empty or partial report being generated and submitted as if complete.
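
To make the last point concrete, here is one way phase failures could be surfaced instead of silently ignored. It is a hypothetical Python sketch: the skill is a Markdown instruction set with no such runner, so `run_phases` and `RunResult` are invented for illustration, and only the phase names come from the PR description.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

PHASES = [
    "Planning",
    "Exploration",
    "Evidence Collection",
    "Categorization",
    "Report Generation",
]


@dataclass
class RunResult:
    completed: List[str] = field(default_factory=list)
    failed_phase: Optional[str] = None
    error: Optional[str] = None

    @property
    def status(self) -> str:
        # A report is "complete" only if no phase failed.
        return "complete" if self.failed_phase is None else "partial"


def run_phases(handlers: Dict[str, Callable[[], None]]) -> RunResult:
    """Run the phases in order and stop at the first failure, recording why."""
    result = RunResult()
    for name in PHASES:
        try:
            handlers[name]()
        except Exception as exc:
            result.failed_phase = name
            result.error = f"{type(exc).__name__}: {exc}"
            break
        result.completed.append(name)
    return result
```

With this shape, a report produced after a failed Exploration phase would carry `status == "partial"` and the recorded error, rather than being presented as complete.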

✅ Looks Good

The issue-taxonomy.md severity/category matrix is well-structured and covers all major bug classifications.
The dogfood-report-template.md includes a good checklist of required fields (repro steps, actual vs expected, evidence).
Splitting skill content into SKILL.md + references/ + templates/ follows the agentskills.io standard cleanly.

Reviewed by Hermes Agent

shizhewanglu · 2026-04-15T02:02:45Z

Code Review Summary

Verdict: Changes Requested (3 critical security issues, 2 code quality issues)

🔴 Critical

SKILL.md:58 — No URL scheme validation before navigation. Malicious prompts could redirect the agent to phishing/internal pages.
SKILL.md:112 — Auto-downloads files without MIME type allowlisting. Executable payloads could be written to disk → RCE risk.
issue-taxonomy.md:34 — Console error collection lacks secret/PII redaction step. Credentials in bug reports exposed to repo readers.
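
A redaction step for captured console output might look roughly like the sketch below. The patterns are illustrative assumptions and far from exhaustive, and `redact` is a hypothetical helper name; nothing like it exists in the PR.

```python
import re

REDACTED = "[REDACTED]"

_PATTERNS = [
    # Header-style lines: redact everything after the header name.
    re.compile(
        r"\b(authorization|(?:set-)?cookie)(\s*[:=]\s*)([^\r\n]+)",
        re.IGNORECASE,
    ),
    # key=value or "key": "value" pairs: redact only the value.
    re.compile(
        r"(token|api[_-]?key|secret|password)(\"?\s*[:=]\s*\"?)([^\s\"&,;]+)",
        re.IGNORECASE,
    ),
]


def redact(text: str) -> str:
    """Replace likely secret values in console output with a placeholder."""
    for pattern in _PATTERNS:
        # Keep the key and separator (groups 1 and 2), drop the value (group 3).
        text = pattern.sub(lambda m: m.group(1) + m.group(2) + REDACTED, text)
    return text
```

Pattern-based redaction inevitably misses some secrets, so it complements rather than replaces a human look at the report before it is shared.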

⚠️ Code Quality

dogfood-report-template.md:12 — Missing "Expected Behavior" section in repro steps. Hinders triage.
SKILL.md:89 — No error-handling defined for phase failures. Silent failures produce incomplete reports marked as complete.

✅ Looks Good

Well-structured severity/category taxonomy
Clean separation into SKILL.md / references / templates following agentskills.io standard
Solid bug report template with required evidence fields

Reviewed by Hermes Agent

feat(skills): dogfood skill for agent-driven exploratory QA

e9eb382

teknium1 closed this Mar 9, 2026

shizhewanglu suggested changes Apr 15, 2026


PowerCreek mentioned this pull request May 27, 2026

companion to devagentic#324: short-circuit retry on cascade_exhausted sentinel + surface trace_id TechDevGroup/hermes-agent#118

Closed

mehmetkr-31 wants to merge 1 commit into NousResearch:main from mehmetkr-31:feat-dogfood-skill
