Summary
The one-way-door keyword safety net misclassifies "rotate ... password" as a reversible (two-way) question. The sibling revoke and reset credential patterns both include password, but the rotate pattern omits it, so the most common phrasing of a credential-rotation op ("rotate the database password") slips through the destructive-keyword check.
scripts/one-way-doors.ts is the documented SECONDARY safety layer for AskUserQuestion calls that fire without a registry id (ad-hoc {skill}-{slug} ids). Its header is explicit about which direction is dangerous:
A false negative could mean auto-approving a destructive operation.
The credential patterns are written to be parallel (the inline comment says they "allow filler words ('the', 'my') between verb and noun"), so this is an omission, not an intentional scope choice.
Current behavior on main (3bef43b / v1.55.0.0)
scripts/one-way-doors.ts:
// Credentials / auth — allow filler words ("the", "my") between verb and noun
/\brevoke\s+[\w\s]*\b(apikey|token|credential|accesskey|password)\b/i, // has password
/\breset\s+[\w\s]*\b(apikey|token|password|credential)\b/i, // has password
/\brotate\s+[\w\s]*\b(apikey|token|secret|credential|accesskey)\b/i, // MISSING password
Reproduced directly against the committed module:
$ bun -e "import {classifyQuestion} from './scripts/one-way-doors.ts';
for (const s of ['Rotate the database password?','rotate password for prod','Revoke the password','Reset the password','rotate the API key'])
console.log((classifyQuestion({summary:s}).oneWay?'ONE-WAY':'two-way ')+' '+s)"
two-way Rotate the database password?
two-way rotate password for prod
ONE-WAY Revoke the password
ONE-WAY Reset the password
ONE-WAY rotate the API key
"Rotate the database password" should be the clearest one-way credential op, yet it is the only one of the three verbs that fails to catch password.
Expected behavior
classifyQuestion/isOneWayDoor should treat "rotate ... password" as one-way (reason: 'keyword'), in parity with revoke ... password and reset ... password.
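The gap can also be seen without the module, by running the three quoted patterns in isolation. This is a minimal sketch, not the module's code: the regexes are copied from the snippet in this issue, `matchesCredentialPattern` is a hypothetical helper, and any normalization the real classifier may apply before matching is not reproduced here.

```typescript
// Patterns copied from the snippet quoted in this issue.
const revoke = /\brevoke\s+[\w\s]*\b(apikey|token|credential|accesskey|password)\b/i;
const reset = /\breset\s+[\w\s]*\b(apikey|token|password|credential)\b/i;
const rotate = /\brotate\s+[\w\s]*\b(apikey|token|secret|credential|accesskey)\b/i;

// Hypothetical helper: true when any of the three credential patterns matches.
function matchesCredentialPattern(text: string): boolean {
  return [revoke, reset, rotate].some((re) => re.test(text));
}

const samples: string[] = [
  "Rotate the database password?", // not matched today; expected to match
  "Revoke the password",           // matched
  "Reset the password",            // matched
  "rotate the token",              // matched
];

for (const s of samples) {
  console.log(`${matchesCredentialPattern(s) ? "match   " : "no match"} ${s}`);
}
```

Only the rotate phrasing with password falls through, which is the parity gap described above.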
Impact
classifyQuestion/isOneWayDoor is an exported, unit-tested API (test/plan-tune.test.ts covers the keyword net, including 'rotate the API key'). For an ad-hoc destructive question that the keyword net is supposed to catch, a false negative here lets a credential-rotation question be treated as suppressible by a permissive tuning preference (never-ask / ask-only-for-one-way) instead of always asking. This is exactly the failure mode the module's threat model calls out as the dangerous direction.
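To make the failure mode concrete, here is a rough sketch of how a one-way flag might gate suppression. Everything in it is an assumption for illustration: `shouldAsk`, the `Preference` type, and the `always-ask` value are hypothetical, and only the names never-ask and ask-only-for-one-way come from this issue. The real tuning logic may be structured differently.

```typescript
// Hypothetical preference values; "always-ask" is invented for this sketch.
type Preference = "always-ask" | "ask-only-for-one-way" | "never-ask";

// Hypothetical gate: one-way questions are always asked, two-way questions
// can be suppressed by a permissive preference.
function shouldAsk(pref: Preference, oneWay: boolean): boolean {
  if (oneWay) return true;      // destructive: never suppressible
  return pref === "always-ask"; // reversible: asked only if not suppressed
}

// A false negative from the keyword net means oneWay arrives as false,
// so a permissive preference suppresses the question.
console.log(shouldAsk("never-ask", false));            // misclassified rotate-password question
console.log(shouldAsk("never-ask", true));             // same question, classified correctly
console.log(shouldAsk("ask-only-for-one-way", false)); // also suppressed when misclassified
```

Under these assumptions, the misclassification alone is enough to turn an always-ask question into a suppressed one.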
Duplicate search performed
PRs (open + closed): one-way-doors, DESTRUCTIVE_PATTERNS, rotate password, classifyQuestion, one-way door rotate — no PR touches this pattern.
Issues: one-way door, rotate password, DESTRUCTIVE, classifyQuestion, destructive keyword. The only DESTRUCTIVE hits are about /careful, /guard, and /sync-gbrain (#1734 "/sync-gbrain can trigger gbrain's destructive auto-recovery, wiping user repos"; #1060 "/careful hook triggers false positives on patterns inside commit messages"; #1091 "[Proposal] (safety): enhance /careful and /guard with structured execution judgment"), unrelated to this classifier.
scripts/one-way-doors.ts is unchanged since v1.0.0.0 (#1039); the behavior above is on main (3bef43b / v1.55.0.0).
Candidate fix
Add password to the rotate alternation so the three credential verbs are at parity, plus a regression test in the existing one-way-doors classifier block.
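The original diff is not reproduced here. As a stand-in, a minimal sketch of the proposed change could look like the following, assuming the rotate pattern is exactly as quoted above. `rotateFixed`, `failingCases`, and the negative sample "rotate the log files" are hypothetical names and data chosen for illustration.

```typescript
// Proposed rotate pattern: the quoted one with `password` added to the alternation.
const rotateFixed =
  /\brotate\s+[\w\s]*\b(apikey|token|secret|credential|accesskey|password)\b/i;

// Phrasings from the repro that should now be caught, plus one already-covered noun.
const shouldMatch: string[] = [
  "Rotate the database password?",
  "rotate password for prod",
  "rotate the token",
];

// Hypothetical non-credential phrasing that should stay unmatched.
const shouldNotMatch: string[] = ["rotate the log files"];

// Returns every sample whose match result differs from the expectation.
function failingCases(pattern: RegExp): string[] {
  const missed = shouldMatch.filter((s) => !pattern.test(s));
  const overMatched = shouldNotMatch.filter((s) => pattern.test(s));
  return [...missed, ...overMatched];
}

// An empty list means the pattern behaves as intended on these samples.
console.log(failingCases(rotateFixed));
```

In the repository itself the regression case would presumably go through classifyQuestion in test/plan-tune.test.ts rather than testing the regex directly; this sketch only checks the pattern change.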