Skip to content

fix: stabilize Matrix tool progress QA#78179

Merged
Patrick-Erichsen merged 12 commits intomainfrom
pe/matrix-tool-progress-flake
May 6, 2026
Merged

fix: stabilize Matrix tool progress QA#78179
Patrick-Erichsen merged 12 commits intomainfrom
pe/matrix-tool-progress-flake

Conversation

@Patrick-Erichsen
Copy link
Copy Markdown
Contributor

@Patrick-Erichsen Patrick-Erichsen commented May 6, 2026

Summary:

  • Stabilize Matrix tool-progress-error QA preview matching for shortened/backtick progress previews.
  • Stabilize Matrix target=both approvals by retrying transient native approval DM preparation and send paths.
  • Make the target=both QA assertion follow the actual active strict Matrix driver/SUT DM room after direct-room repair, while still requiring channel and DM metadata.

Real behavior proof

Verification:

  • pnpm docs:list
  • pnpm install
  • pnpm exec oxfmt --check --threads=1 extensions/qa-matrix/src/runners/contract/scenario-runtime-approval.ts extensions/qa-matrix/src/runners/contract/scenarios.test.ts extensions/matrix/src/approval-handler.runtime.ts extensions/matrix/src/approval-handler.runtime.test.ts
  • pnpm test extensions/qa-matrix/src/runners/contract/scenarios.test.ts extensions/matrix/src/approval-handler.runtime.test.ts
  • pnpm check:test-types
  • QA-Lab - All Lanes x3 passed before the clean rebase; current-head QA-Lab x2 is running as follow-up proof.

@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented May 6, 2026

Codex review: needs changes before merge.

Summary
The PR updates Matrix approval delivery retries and Matrix QA scenario matching/waits for tool-progress, target=both approvals, image generation, and E2EE trust propagation, with regression tests.

Reproducibility: yes. source-level: current main can reject shortened bare-backtick Matrix tool-progress-error previews because that scenario lacks the generic preview-line fallback. I did not run the live Matrix lane locally in this read-only review.

Real behavior proof
Sufficient (linked_artifact): The PR body links completed QA-Lab live runs showing Matrix transport/media/e2ee and companion live lanes passing after the runtime fix; the latest-head proof check is green, though a manually dispatched current-head QA-Lab follow-up is still running.

Next step before merge
A narrow automated repair can add the missing changelog entry without touching runtime or test logic; maintainer review remains after that.

Security
Cleared: No concrete security or supply-chain issue found; the diff is limited to Matrix approval runtime, Matrix/QA runtime, and tests without dependency, workflow, secret, package-resolution, install, or release-path changes.

Review findings

  • [P3] Add the required changelog entry — extensions/matrix/src/approval-handler.runtime.ts:183
Review details

Best possible solution:

Add the required Matrix changelog entry, let current-head QA-Lab and maintainer handling finish, then land if checks stay green.

Do we have a high-confidence way to reproduce the issue?

Yes, source-level: current main can reject shortened bare-backtick Matrix tool-progress-error previews because that scenario lacks the generic preview-line fallback. I did not run the live Matrix lane locally in this read-only review.

Is this the best way to solve the issue?

Mostly yes: the PR keeps the matcher, approval wait, and retry changes narrow to Matrix/QA behavior, and the earlier prepareTarget contract issue is fixed. The remaining safer path is a small changelog-only repair plus maintainer review before merge.

Full review comments:

  • [P3] Add the required changelog entry — extensions/matrix/src/approval-handler.runtime.ts:183
    This PR changes production Matrix approval delivery by retrying direct-room repair and send paths, but it does not touch the root or Matrix changelog. Repo policy requires a one-line user-facing fix entry before merge.
    Confidence: 0.86

Overall correctness: patch is correct
Overall confidence: 0.84

Acceptance criteria:

  • git diff --check
  • pnpm docs:list

What I checked:

  • Current main source-level reproduction: Current main only accepts bullet-style generic Matrix tool-progress preview lines, and the Matrix tool-progress-error scenario does not enable the generic-line fallback, so shortened bare-backtick progress previews can be rejected before the final edit arrives. (extensions/qa-matrix/src/runners/contract/scenario-runtime-room.ts:694, 627b0073f2f7)
  • PR implementation direction: The PR head accepts bare-backtick Matrix progress preview lines and enables allowGenericProgressLine for matrix-room-tool-progress-error. (extensions/qa-matrix/src/runners/contract/scenario-runtime-room.ts:694, d74e73f70679)
  • Runtime approval retry change: The PR adds retryMatrixApprovalDelivery and routes Matrix approval DM repair and send paths through it, which is production Matrix approval behavior rather than test-only code. (extensions/matrix/src/approval-handler.runtime.ts:183, d74e73f70679)
  • Previous review finding fixed: The prepareTarget regression test now includes view: buildExecApprovalView(), satisfying the approval runtime contract called out by the earlier ClawSweeper review. (extensions/matrix/src/approval-handler.runtime.test.ts:388, d74e73f70679)
  • Missing changelog: The live PR file list contains no root or Matrix changelog file even though production Matrix approval delivery behavior changes. (d74e73f70679)
  • CI and proof state: Latest-head CI, check-test-types, check-docs, CodeQL Critical Quality, and Real behavior proof checks are successful; linked QA-Lab runs at 0345cbc passed Matrix transport/media/e2ee lanes, while current-head manually dispatched QA-Lab follow-up run 25419191833 still has Matrix transport in progress and run 25419191835 is pending. (d74e73f70679)

Likely related people:

  • vincentkoc: Recent main-branch commits adjusted Matrix tool-progress matching and target=both approval behavior in the same QA runner/test area this PR changes. (role: recent maintainer; confidence: high; commits: 8e79392dccf4, 36bcf88ffc3a, fcfb6500da10; files: extensions/qa-matrix/src/runners/contract/scenario-runtime-room.ts, extensions/qa-matrix/src/runners/contract/scenario-runtime-approval.ts, extensions/qa-matrix/src/runners/contract/scenarios.test.ts)
  • gumadeiras: History shows substantial Matrix QA contract, streaming tool-progress, approval metadata, and native approval lifecycle work that this PR builds on. (role: introduced behavior; confidence: high; commits: 988447ca240c, b9fd13e8d730, ae616777f3b1; files: extensions/qa-matrix/src/runners/contract/scenario-runtime-room.ts, extensions/qa-matrix/src/runners/contract/scenario-runtime-approval.ts, extensions/qa-matrix/src/runners/contract/scenarios.test.ts)
  • steipete: Recent history includes Matrix QA profile/runtime work and a bundled Matrix approval-handler refactor near this surface. (role: adjacent maintainer; confidence: medium; commits: 05eda57b3c72, 6987132aed84, 855912b25cb8; files: extensions/matrix/src/approval-handler.runtime.ts, extensions/qa-matrix/src/runners/contract/scenario-runtime-room.ts, extensions/qa-matrix/src/runners/contract/scenarios.test.ts)

Remaining risk / open question:

  • The PR has the protected maintainer label, so merge/cleanup still needs explicit maintainer handling.
  • The current-head manually dispatched QA-Lab follow-up has not fully completed yet, although completed live proof exists for the pre-rebase head and latest-head CI/proof checks are green.
  • A required user-facing changelog entry is missing for the production Matrix approval delivery retry.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 627b0073f2f7.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channel: matrix Channel integration: matrix extensions: qa-lab maintainer Maintainer-authored PR proof: sufficient ClawSweeper judged the real behavior proof convincing. size: L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant