Commit 46c8864

revert(qa-lab): remove scenario github traceability metadata
1 parent 23c5808 commit 46c8864

9 files changed

Lines changed: 2 additions & 87 deletions

File tree

CHANGELOG.md

Lines changed: 0 additions & 1 deletion
@@ -12,7 +12,6 @@ Docs: https://docs.openclaw.ai
 - QA-Lab: add curated mock JSONL replay fixtures and first-drift reporting for runtime-parity audits. (#80323, refs #80176) Thanks @100yenadmin.
 - QA-Lab: include the optional 100-turn runtime parity soak in release-soak artifacts so long-run Codex/Pi transcript drift stays visible outside the default gate. (#80395) Thanks @100yenadmin.
 - QA-Lab: add a personal-agent failure recovery scenario that checks honest partial status, retry boundaries, and local recovery artifacts. (#83872) Thanks @iFiras-Max1.
-- QA-Lab: add GitHub issue evidence metadata to audited runtime scenarios so parity and tool-fixture coverage links back to the source threads.
 - QA-Lab: include an opt-in `update.run` package self-upgrade sentinel for destructive latest-package recovery checks.
 - Tests/perf: isolate doctor core health check unit coverage from real skills/workspace discovery so `doctor-core-checks` no longer dominates unit perf while keeping one real skills-readiness smoke. (#84493) Thanks @frankekn.

extensions/qa-lab/src/scenario-catalog.test.ts

Lines changed: 0 additions & 25 deletions
@@ -108,39 +108,14 @@ describe("qa scenario catalog", () => {
     const soak = readQaScenarioById("runtime-soak-100-turn");

     expect(firstHour.runtimeParityTier).toBe("standard");
-    expect(firstHour.evidence?.github).toContain(
-      "https://github.com/openclaw/openclaw/issues/80364",
-    );
     expect(readQaScenarioExecutionConfig(firstHour.id)).toMatchObject({
       runtimeParityComparison: "outcome-only",
       turnCount: 20,
     });
     expect(soak.runtimeParityTier).toBe("soak");
-    expect(soak.evidence?.github).toContain(
-      "https://github.com/openclaw/openclaw/issues/80395",
-    );
     expect(readQaScenarioExecutionConfig(soak.id)).toMatchObject({ turnCount: 100 });
   });

-  it("loads audited GitHub evidence metadata from scenario markdown", () => {
-    const pack = readQaScenarioPack();
-    const scenariosWithEvidence = pack.scenarios.filter(
-      (scenario) => (scenario.evidence?.github?.length ?? 0) > 0,
-    );
-    const evidenceUrls = scenariosWithEvidence.flatMap(
-      (scenario) => scenario.evidence?.github ?? [],
-    );
-
-    expect(scenariosWithEvidence.map((scenario) => scenario.id)).toContain(
-      "codex-pi-shaped-read-vocabulary",
-    );
-    expect(evidenceUrls).toContain("https://github.com/openclaw/openclaw/pull/80323");
-    expect(evidenceUrls).toContain("https://github.com/openclaw/openclaw/issues/80312");
-    for (const url of evidenceUrls) {
-      expect(url).toMatch(/^https:\/\/github\.com\/openclaw\/openclaw\/(?:issues|pull)\/\d+$/);
-    }
-  });
-
   it("loads runtime tool fixture metadata for standard and optional lanes", () => {
     const applyPatch = readQaScenarioById("runtime-tool-apply-patch");
     const messageTool = readQaScenarioById("runtime-tool-message-tool");

extensions/qa-lab/src/scenario-catalog.ts

Lines changed: 0 additions & 36 deletions
@@ -93,41 +93,6 @@ const qaScenarioGatewayRuntimeSchema = z.object({
   forwardHostHome: z.boolean().optional(),
 });

-function isOpenClawGitHubIssueOrPullUrl(value: string): boolean {
-  try {
-    const parsed = new URL(value);
-    return (
-      parsed.hostname === "github.com" &&
-      /^\/openclaw\/openclaw\/(?:issues|pull)\/[1-9]\d*$/.test(parsed.pathname)
-    );
-  } catch {
-    return false;
-  }
-}
-
-const qaScenarioEvidenceGithubUrlSchema = z
-  .string()
-  .trim()
-  .url()
-  .refine(isOpenClawGitHubIssueOrPullUrl, {
-    message: "evidence.github entries must be openclaw/openclaw issue or PR URLs",
-  });
-
-const qaScenarioEvidenceSchema = z
-  .object({
-    github: z.array(qaScenarioEvidenceGithubUrlSchema).min(1).optional(),
-  })
-  .superRefine((evidence, ctx) => {
-    if (evidence.github?.length) {
-      return;
-    }
-    ctx.addIssue({
-      code: z.ZodIssueCode.custom,
-      path: ["github"],
-      message: "evidence.github must include at least one URL",
-    });
-  });
-
 export const QA_RUNTIME_PARITY_TIERS = ["standard", "optional", "live-only", "soak"] as const;
 const qaRuntimeParityTierSchema = z.enum(QA_RUNTIME_PARITY_TIERS);

@@ -216,7 +181,6 @@ const qaSeedScenarioSchema = z.object({
   category: z.string().trim().min(1).optional(),
   runtimeParityTier: qaRuntimeParityTierSchema.optional(),
   coverage: qaScenarioCoverageSchema.optional(),
-  evidence: qaScenarioEvidenceSchema.optional(),
   surfaces: z.array(z.string().trim().min(1)).min(1).optional(),
   risk: z.enum(["low", "medium", "high"]).optional(),
   capabilities: z.array(z.string().trim().min(1)).optional(),
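
To see what the removed `isOpenClawGitHubIssueOrPullUrl` check accepted, the function can be run on its own, without zod. The sketch below copies the function body from the deleted lines; the `samples` array and its inputs are illustrative assumptions, not repository fixtures. Taken alone, the function inspects only the hostname and path, so it appears to ignore the scheme and any query string, whereas the regex in the deleted test was anchored on `https://`.

```typescript
// Standalone copy of the validator deleted by this commit (no zod needed).
function isOpenClawGitHubIssueOrPullUrl(value: string): boolean {
  try {
    const parsed = new URL(value);
    return (
      parsed.hostname === "github.com" &&
      // Issue or PR number must not start with 0.
      /^\/openclaw\/openclaw\/(?:issues|pull)\/[1-9]\d*$/.test(parsed.pathname)
    );
  } catch {
    // new URL() throws on strings that are not absolute URLs.
    return false;
  }
}

// Hypothetical sample inputs, chosen only to exercise each branch.
const samples: string[] = [
  "https://github.com/openclaw/openclaw/issues/80364", // accepted
  "https://github.com/openclaw/openclaw/pull/80323", // accepted
  "https://github.com/openclaw/openclaw/issues/0", // rejected: leading zero
  "https://github.com/other/repo/issues/1", // rejected: wrong repository
  "not a url", // rejected: URL parsing throws
];

for (const sample of samples) {
  console.log(sample, isOpenClawGitHubIssueOrPullUrl(sample));
}
```

In the deleted schema this predicate ran after `.trim().url()`, so the zod chain would already have rejected non-URL strings before the `refine` step.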

qa/scenarios/index.md

Lines changed: 2 additions & 5 deletions
@@ -5,8 +5,8 @@ Single source of truth for repo-backed QA suite bootstrap data.

 - `index.md` defines pack-level bootstrap data
 - each nested `*.md` scenario defines one runnable test via `qa-scenario` + `qa-flow`
-- scenario markdown may also define coverage IDs, evidence links, category metadata,
-  required plugins, lane filters, runtime parity tiers, and gateway config patching
+- scenario markdown may also define coverage IDs, category metadata, required plugins,
+  lane filters, runtime parity tiers, and gateway config patching

 - kickoff mission
 - QA operator identity
@@ -20,9 +20,6 @@ Coverage tracking:
 - prefer reusing an existing feature ID over minting a scenario-shaped ID
 - avoid copying the scenario title into coverage IDs
 - use `pnpm openclaw qa coverage` to render the current inventory
-- use `evidence.github` for full `https://github.com/openclaw/openclaw/issues/<n>` or
-  `https://github.com/openclaw/openclaw/pull/<n>` links when a scenario directly protects
-  a reported regression, RFC, or accepted PR behavior
 - use `runtimeParityTier` for runtime-pair gate membership: `standard`,
   `optional`, `live-only`, or `soak`
 - treat the old `coverage: ["id"]` / `coverage: - id` list shape as invalid
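
The four `runtimeParityTier` values listed in this hunk match the `QA_RUNTIME_PARITY_TIERS` constant that stays in `scenario-catalog.ts`. As a minimal zod-free sketch of checking a frontmatter value against that list: the constant is copied from the diff, while `QaRuntimeParityTier` and `isQaRuntimeParityTier` are hypothetical names introduced here for illustration and are not part of the repository.

```typescript
// Copied from scenario-catalog.ts as shown in this commit's context lines.
const QA_RUNTIME_PARITY_TIERS = ["standard", "optional", "live-only", "soak"] as const;

// Hypothetical type alias: the union of the four tier literals.
type QaRuntimeParityTier = (typeof QA_RUNTIME_PARITY_TIERS)[number];

// Hypothetical type guard: narrows an unknown frontmatter value to a tier.
function isQaRuntimeParityTier(value: unknown): value is QaRuntimeParityTier {
  return (
    typeof value === "string" &&
    (QA_RUNTIME_PARITY_TIERS as readonly string[]).indexOf(value) !== -1
  );
}

console.log(isQaRuntimeParityTier("soak"));
console.log(isQaRuntimeParityTier("nightly"));
```

The repository itself validates this field with `z.enum(QA_RUNTIME_PARITY_TIERS)`; the guard above only approximates that check for readers without zod at hand.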

qa/scenarios/runtime/codex-pi-shaped-read-vocabulary.md

Lines changed: 0 additions & 4 deletions
@@ -11,10 +11,6 @@ coverage:
   secondary:
     - runtime.prompt-compatibility
     - tools.fs.read
-evidence:
-  github:
-    - https://github.com/openclaw/openclaw/pull/80323
-    - https://github.com/openclaw/openclaw/issues/81734
 objective: Verify Codex-mode agents can satisfy legacy Pi-shaped "Read tool" wording through the native Codex workspace-read capability instead of stopping because duplicate OpenClaw dynamic read is intentionally filtered.
 successCriteria:
   - Agent reads the seeded workspace file and replies with the exact marker line.

qa/scenarios/runtime/first-hour-20-turn.md

Lines changed: 0 additions & 5 deletions
@@ -10,11 +10,6 @@ coverage:
     - runtime.first-hour-20
   secondary:
     - runtime.long-context
-evidence:
-  github:
-    - https://github.com/openclaw/openclaw/issues/80171
-    - https://github.com/openclaw/openclaw/issues/80337
-    - https://github.com/openclaw/openclaw/issues/80364
 objective: Verify both runtimes preserve a same-session conversation across the required 20-turn maintainer gate.
 successCriteria:
   - The same QA session accepts 20 sequential user turns.

qa/scenarios/runtime/soak-100-turn.md

Lines changed: 0 additions & 5 deletions
@@ -10,11 +10,6 @@ coverage:
     - runtime.soak-100
   secondary:
     - runtime.long-context
-evidence:
-  github:
-    - https://github.com/openclaw/openclaw/issues/80171
-    - https://github.com/openclaw/openclaw/issues/80338
-    - https://github.com/openclaw/openclaw/issues/80395
 objective: Provide an optional long-run soak that can be scheduled or run in Testbox without entering the maintainer default gate.
 successCriteria:
   - The same QA session accepts 100 sequential user turns.

qa/scenarios/runtime/tools/apply-patch.md

Lines changed: 0 additions & 3 deletions
@@ -8,9 +8,6 @@ runtimeParityTier: standard
 coverage:
   primary:
     - tools.apply-patch
-evidence:
-  github:
-    - https://github.com/openclaw/openclaw/issues/80320
 objective: Verify apply_patch behavior is tracked across Pi and Codex while Codex owns patching natively.
 successCriteria:
   - Pi may expose OpenClaw apply_patch while Codex app-server mode may omit duplicate OpenClaw dynamic apply_patch.

qa/scenarios/runtime/tools/fs-read.md

Lines changed: 0 additions & 3 deletions
@@ -8,9 +8,6 @@ runtimeParityTier: standard
 coverage:
   primary:
     - tools.fs.read
-evidence:
-  github:
-    - https://github.com/openclaw/openclaw/issues/80312
 objective: Verify file read behavior is tracked across Pi and Codex while Codex owns read natively.
 successCriteria:
   - Pi may expose OpenClaw read while Codex app-server mode may omit duplicate OpenClaw dynamic read.
