[Task] Clean up release-gate E2E test debt from v2026.5.11 #529

@Astro-Han

Goal

Clean up the E2E test debt found during the v2026.5.11 release gate so future release checks distinguish real regressions from stale test assumptions.

Scope

In scope:

  • Update prompt-mention coverage so it searches files that exist in the active test workspace instead of assuming the repo checkout is the active project.
  • Fix session-composer-dock tests that use invalid question seed data or hit nested-scroll/streaming boundaries instead of the intended user contract.
  • Fix the release-notes zh setup test so LANGUAGE_KEY is available inside the browser addInitScript context.
  • Update or retire stale session header-menu tests now that sidebar rename/delete coverage owns the current user path.
  • Update the home model-chip width assertion if the intrinsic chip width is the intended design.
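
The LANGUAGE_KEY item above comes down to how Playwright ships an init script to the page: a function passed to page.addInitScript is serialized and evaluated in the browser, so constants from the spec file's scope are not visible there unless they are passed as the second argument. The sketch below simulates that boundary in plain TypeScript without launching a browser. runSerialized, FakeStorage, and the key value are hypothetical stand-ins, not code from the repo.

```typescript
// Stand-in for browser storage; the real test would touch the page's own storage.
type FakeStorage = Map<string, string>;

// Simulates the serialization boundary: only the function's source text crosses
// over, and it is rebuilt in a fresh scope that cannot see the caller's variables.
function runSerialized(
  fn: (...args: any[]) => unknown,
  storage: FakeStorage,
  arg?: unknown,
): void {
  const rebuilt = new Function(
    "storage",
    "arg",
    `return (${fn.toString()})(storage, arg);`,
  );
  rebuilt(storage, arg);
}

function demo(): { broken: string; fixed: string | undefined } {
  // Hypothetical value, for illustration only; the real constant lives in the app.
  const LANGUAGE_KEY = "example.language";

  // Broken shape: the callback closes over LANGUAGE_KEY, which does not exist
  // on the other side of the boundary.
  let broken = "no error";
  try {
    runSerialized(
      (storage: FakeStorage) => storage.set(LANGUAGE_KEY, "zh"),
      new Map(),
    );
  } catch (err) {
    broken = (err as Error).name;
  }

  // Fixed shape: the key travels as an explicit argument.
  const storage: FakeStorage = new Map();
  runSerialized(
    (s: FakeStorage, key: string) => s.set(key, "zh"),
    storage,
    LANGUAGE_KEY,
  );

  return { broken, fixed: storage.get(LANGUAGE_KEY) };
}

console.log(demo());
```

In the spec itself the fix would likely take the shape page.addInitScript((key) => localStorage.setItem(key, "zh"), LANGUAGE_KEY), assuming the language preference is read from localStorage. The actual storage call and value format should follow whatever the app reads at startup.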

Out of scope:

  • Do not change product behavior unless a test cleanup proves a real regression.
  • Do not fold this into the v2026.5.11 release chore.
  • Do not broaden into a full E2E suite rewrite.

Relevant files or context

Likely files:

  • packages/app/e2e/prompt/prompt-mention.spec.ts
  • packages/app/e2e/session/session-composer-dock.spec.ts
  • packages/app/e2e/release-notes/release-notes-toast.spec.ts
  • packages/app/e2e/session/session.spec.ts
  • packages/app/e2e/app/home.spec.ts

Release-gate findings:

  • prompt-mention: the failing test searched for packages/app/package.json, but the browser was pointed at /Users/yuhan/PawWork; the smaller active-project @mention smoke in prompt-slash-open.spec.ts passed.
  • session-composer-dock: one failure hit nested scroll / ongoing rendering behavior rather than a stable long-session user path; another used a one-option question seed rejected by the schema.
  • release-notes: the zh test referenced LANGUAGE_KEY inside browser context without passing it into addInitScript.
  • session.spec: header-menu rename/archive/delete paths are stale after the current sidebar IA; sidebar rename smoke passes.
  • home.spec: the model chip now uses intrinsic sizing instead of the old fixed 176px assertion.
  • home.spec: the root route smoke still expects session-new-home, but / currently renders the no-project empty state when no project is seeded. Triage whether this is fixture setup debt, intended IA drift, or a real regression before changing product behavior.
  • Windows advisory non-E2E note: unit-windows-opencode-session timed out in "cancel records MessageAbortedError on interrupted process" during validation of PR #532 (test: replace hash scroll source snapshot). unit-windows-app passed, so this did not block #528 ([Task] Replace use-session-hash-scroll source snapshot with behavior test), but it should be tracked with release-gate debt if it recurs.
  • Windows advisory opencode workspace-adaptor flake: packages/opencode/test/plugin/workspace-adaptor.test.ts:458 (plugin.workspace > cold-start Workspace.get retries sync after owner bootstrap for remote adaptors) failed on Windows after PR #538 (feat(ui): render release notes toast with markdown and visible scroll; 608068369). Recent windows-advisory history showed 6 failures in 10 runs across unrelated commits, so treat it as timing-sensitive advisory debt rather than a #538 regression; stabilize retries/timeouts or skip on Windows in a scoped follow-up.

Verification

  • Run each updated targeted E2E locally with --workers=1.
  • Confirm at least one positive @mention file suggestion path uses files created inside the active test project.
  • Confirm the composer-dock tests use valid question data and test a stable user-visible scroll/dock contract.
  • Confirm release-notes zh test fails before the setup fix and passes after.
  • Confirm no product copy or runtime behavior changes unless a real regression is found and documented.
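
For the @mention check above, one way to avoid depending on the repo checkout is to have the test seed its own project directory and derive the expected suggestions from what it created. The following is a minimal sketch using only Node's standard library. seedWorkspace, listFiles, mentionCandidates, and the file names are hypothetical, and mentionCandidates is a plain substring filter used to compute expectations, not the app's actual mention search.

```typescript
import { mkdirSync, mkdtempSync, readdirSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join, relative, sep } from "node:path";

// Creates a throwaway project directory containing the given (empty) files.
function seedWorkspace(files: string[]): string {
  const root = mkdtempSync(join(tmpdir(), "mention-e2e-"));
  for (const file of files) {
    const full = join(root, file);
    mkdirSync(dirname(full), { recursive: true });
    writeFileSync(full, "");
  }
  return root;
}

// Lists workspace-relative file paths, with forward slashes on every platform.
function listFiles(root: string, dir: string = root): string[] {
  const found: string[] = [];
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    if (entry.isDirectory()) found.push(...listFiles(root, full));
    else found.push(relative(root, full).split(sep).join("/"));
  }
  return found;
}

// Expected suggestions for a query, computed only from files the test created.
function mentionCandidates(root: string, query: string): string[] {
  return listFiles(root)
    .filter((path) => path.includes(query))
    .sort();
}

// Removes the throwaway project once the test is done.
function removeWorkspace(root: string): void {
  rmSync(root, { recursive: true, force: true });
}
```

A spec could call seedWorkspace(["src/notes.md", "package.json"]), point the browser at the returned directory, type the mention query, and assert that the suggestion list contains the paths returned by mentionCandidates. A query for packages/app/package.json would then correctly yield no expected hits, because that file was never part of the active project.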

Execution mode

Agent can implement directly after v2026.5.11 release is complete.

Metadata

Assignees

No one assigned

    Labels

    P3 (Low priority), task (Narrow execution, audit, spike, migration, tracking, or upstream follow-up work)
