Skip to content

fix(config): wait for background installs on dispose#3

Closed
Astro-Han wants to merge 12 commits into
devfrom
fix/config-scoped-deps
Closed

fix(config): wait for background installs on dispose#3
Astro-Han wants to merge 12 commits into
devfrom
fix/config-scoped-deps

Conversation

@Astro-Han

Copy link
Copy Markdown
Owner

Summary

  • wait for background config dependency installs during config instance teardown
  • cover the Windows config-install:win32 lock leak with a regression test that models disposal plus a follow-up install
  • keep the fix scoped to config lifecycle cleanup, without broad refactors

Root Cause

On Windows, config dependency installs share the global flock key config-install:win32. We started those installs in the background during config loading, but instance disposal did not wait for them to settle. A previous instance could therefore keep holding the lock after teardown, and later tests would time out waiting on the leaked lock.

Testing

  • bun test ./test/config/config.test.ts
  • bun test ./test/tool/registry.test.ts
  • bun test ./test/snapshot/snapshot.test.ts -t "snapshot state isolation between projects"
  • bun test ./test/project/worktree.test.ts

@Astro-Han

Copy link
Copy Markdown
Owner Author

Updated this PR with two follow-up commits:

  • ff8ec43be fix(config): wait for background installs on dispose
  • a201d0a91 fix(config): harden scoped dependency installs

What changed in this update:

  • Reworked config-scoped dependency installs so waiting on the Windows global config lock is interruptible, while the actual install phase remains protected once the lock is acquired.
  • Fixed the disposeAll() path so background config installs waiting on the lock do not hang instance teardown.
  • Restored real failure propagation from Config.waitForDependencies() instead of reducing install failures to warning logs only.
  • Stopped swallowing lock release errors.
  • Aligned the tool.registry caller with the new error semantics.
  • Tightened config tests by isolating OPENCODE_CONFIG_DIR cases from shared global config state.
  • Added and strengthened Windows regressions for:
    • disposing while a background config install is waiting on the win32 lock
    • aborting while waiting on the win32 lock

Re-verified locally after the latest changes:

  • bun run typecheck
  • bun test ./test/config/config.test.ts
  • bun test ./test/tool/registry.test.ts
  • bun test ./test/provider/provider.test.ts -t "plugin config providers persist after instance dispose"

Also ran a fresh-eyes review loop on the final diff. The last review came back with no P0/P1 findings.

@Astro-Han

Copy link
Copy Markdown
Owner Author

Quick status update on this PR.

The scope has narrowed quite a bit. The recent commits have all been in the same area: config-scoped dependency installs, tool discovery timing, and Windows lock cleanup. The latest commit adds a retry in Flock.release() for transient lock-dir removal errors on Windows, plus a regression test for that path. Before pushing, I re-ran bun turbo typecheck and bun turbo test:ci locally on macOS, and both passed.

The current blocker in the latest CI run is on Ubuntu. It fails in packages/opencode/test/tool/registry.test.ts, in the case does not wait for unrelated global config installs before importing local tools with bare imports. That test is specifically checking that local tool discovery does not get blocked by an unrelated global install. In CI it timed out waiting for the local tool id late, so that expectation is still not being met there.

Windows is still running as I post this. bun install --frozen-lockfile and typecheck have completed, and the test step is in progress. Once that job finishes, I will know whether we are looking at the same wait-path on both platforms or one Ubuntu-specific issue plus something separate on Windows.

Assuming Windows does not reveal a different failure first, the next fix should stay narrow: adjust the registry/config wait boundary so unrelated global installs do not stall local tool loading.

@Astro-Han

Copy link
Copy Markdown
Owner Author

Update on the last four commits:

  • 0f7c21c only changes CI and the registry timing expectation. I disabled the Windows Bun cache in .github/workflows/ci.yml and tightened the related registry test setup after the runner kept behaving differently from Ubuntu.
  • 4426727 fixes a Windows lock handoff race in config-scoped dependency installs. This is the commit that hardens the flock path and adds focused regression coverage for that handoff.
  • 1c800b7 moves one layer up and stabilizes Windows cleanup around config, worktree, and local tool loading. That commit adds regression tests for transient cleanup failures and for config-scoped local tool installs.
  • eafddd5 closes the remaining tui.test.ts flake. The test tmpdir now waits for background TuiConfig installs before cleanup, and the one test in that file that also calls Config.get() now waits for Config.waitForDependencies() as well.

So the recent work is not four unrelated patches. It narrows down the Windows failures to one class of problems: background installs, lock handoff, and temp-dir cleanup happening in the wrong order on the Windows runner.

Latest local verification on this branch is green:

  • bun turbo typecheck
  • bun turbo test:ci

Most recent local test:ci result: 1932 pass / 11 skip / 1 todo / 0 fail.

@Astro-Han Astro-Han force-pushed the fix/config-scoped-deps branch from b1c3a18 to 79e9842 Compare April 17, 2026 05:57
@Astro-Han

Copy link
Copy Markdown
Owner Author

关闭,改走拆分路径:

原始意图(Instance dispose 等待 background install)是有实质的,但 12 个 commit 滚雪球叠了三类无关改动:

  1. flock/worktree EBUSY 重试 — 全是 synthetic mock 测试,缺真实 CI 日志佐证
  2. flock 'handoff microtask race' — 理论构造,测试靠模块重写注入钩子
  3. Windows 跳 bun cache — 无对照数据

CI 连红 15+ 小时,最近的 revert→restore 类型错误说明已经在改不理解的代码。

后续拆:

  • fix(config): interrupt background installs on dispose ← 保留 fiber 化
  • fix(worktree): retry Windows transient rm failures ← 先附真实 EBUSY CI log 再提
  • 其余不再单提

PR #6 已经证明同一 scope 可以做到 200 行聚焦修复,按那个粒度推进。

@Astro-Han Astro-Han closed this Apr 17, 2026
@Astro-Han Astro-Han deleted the fix/config-scoped-deps branch April 17, 2026 16:39
Astro-Han added a commit that referenced this pull request May 13, 2026
…d frame

AstroHan W1 second retest flagged 3 structural regressions on the W1
leaves; this commit fixes the trio in one atomic change because all
three are CSS-only.

1. **Text selection regression (#1)** — The app shell sets `select-none`
   on the timeline body (layout.tsx L2320) and opts back in for input,
   textarea, and `[contenteditable]` only. The legacy carve-out at
   message-part.css L710 named `[data-component="user-message"]
   [data-slot="user-message-text"]` and `[data-component="text-part"]`
   so legacy bubble + assistant prose stayed selectable; the W1 leaves
   replaced both data-components without an equivalent carve-out, so
   bubble text and agent prose stopped responding to drag-select.
   Reinstated by adding `user-select: text` + `cursor: text` to
   `[data-slot="bubble-text"]`, `[data-slot="agent-prose"]`, and
   `[data-slot="agent-reasoning"]`.

2. **Trow chev direction (#3)** — The W1 icon is `chevron-down` (rest
   reads "click to expand"), but the open-state rotation was 90°,
   which pointed the glyph sideways. Per AstroHan's flag the open
   chev should read "click to fold up" — 180° rotates `chevron-down`
   to chevron-up, restoring the disclosure idiom.

3. **Trow expanded nested frame (#4)** — DESIGN.md L417 scopes the
   transparent + 1px `--border-weaker` + radius-sm body frame to each
   per-tool result body, not the trow group wrapper. The W1 trow-block
   wrapped the inner sub-rows in its own frame, which composed with
   the legacy per-tool chrome and produced a visible double border.
   Dropped the wrapper's border + padding so it only owns the vertical
   rhythm between sub-rows; the per-tool body keeps its own frame.

Tests still pass: `session-turn-user-bubble`, `session-turn-agent-round`,
and `session-turn-trow-block.reducer` suites are CSS-grep + reducer
based and do not assert any of the changed declarations.
Astro-Han added a commit that referenced this pull request May 13, 2026
… WebFetch parity

AstroHan's fourth W1 retest (msg=ac13481a) flagged three trow visuals.

**#1 — Missing expand/collapse animation.** The header comment on
`session-turn-trow-block.css` claimed `session-turn.css` owned the
`::details-content` rule, but it was never actually wired anywhere.
Add the rule (mirror of preview `message-flow.html` L13-23): height +
opacity transition gated by `prefers-reduced-motion: no-preference`,
with `transition-behavior: allow-discrete` covering the discrete
`content-visibility` step. The `:root { interpolate-size:
allow-keywords }` declaration in utilities.css is what lets
`height: 0` ↔ `height: auto` interpolate; the rule silently no-ops to
a snap without it. Inner Collapsible.Content gets a matching scoped
animation — Solid's `collapsible.css` ships its keyframes commented
out (L92-101), so without the override the per-tool body would snap
open. The scoped selector keeps the file-tree / settings Collapsibles
on their default behaviour.

**#2 — Tighter expanded body.** AstroHan reported the expanded trow
body felt too loose vs preview. `.mf-tlist` in
`docs/design/preview/message-flow.html` L199 locks gap at 6px between
sub-tool rows; trow-body now matches (was 8px).

**#3 — WebFetch layout parity.** WebFetch is the only tool that ships
a right-aligned `[data-component="tool-action"]` arrow inside its
trigger row (`message-part-tools-basic.tsx` L142). The other tools
use the plain `trigger={{ title, subtitle }}` object form, so they
already flow left without the action chrome. Suppress the action icon
only inside the trow result body — standalone WebFetch outside the
trow keeps the affordance.
Astro-Han added a commit that referenced this pull request May 25, 2026
…resize sync (#887)

Three connected fixes for the right-panel titlebar seam, reported post-release as a follow-up to #880. All three only become visible together: once the seam aligns, the toggle slide and the drag-resize lag become readable.

**Fix 1 — macOS seam alignment (8px)**
`padding-inline: 8px` → `padding-inline-start: 8px` on the macOS `[data-component="titlebar-shell"]` rule (`packages/app/src/index.css`). The symmetric 8px padding pushed the right rail 8px inboard of the viewport while the `<aside>` directly below reached the actual edge — `border-l`s ended up 8px apart on macOS only. Windows already correct.

**Fix 2 — Stable toggle position across open/close**
`#pawwork-titlebar-right` (`packages/app/src/components/titlebar.tsx`) changed from in-flow flex sibling of the tabs slot to absolute-positioned at the rail's top-right corner (`right-2` macOS / `right-0` Windows). PR #880's flex layout slid the toggle by `--right-panel-width` on a 240ms transition; that motion read as jarring once fix #1 aligned the seam.

**Fix 3 — Drag-resize seam sync**
New effect in `SessionSidePanel` mirrors `props.size.active()` to a `data-resizing-right-panel` attribute on `<desktop-shell>`; matching CSS rule in `index.css` disables the shell's transition while the attribute is present. During drag the body's inline `width` snapped per-pointermove while the shell's `--right-panel-width` kept its 240ms transition — the two sides of the seam moved at different speeds. Attribute-bridge avoids plumbing session-scope sizing state into the global layout context.

**Hardening from code review**
- P1 (toggle/+ collision risk): added `padding-right: 44px` reserve on `#pawwork-titlebar-tabs` when the rail is active (toggle 30 + `right-2` 8 + gap 4 + 2px buffer for Tabs.List `px-1`) so the no-collision contract no longer depends on Kobalte's Tabs.List rendering at content-width.
- P2 (attribute leak): effect now unconditionally `removeAttribute` first then re-evaluates, plus `onCleanup` clear. Closes leaks via `isDesktop()` flip, component unmount, and stale residue.
- P3 (test quality): tightened tolerance to 1px, replaced `waitForTimeout` with `expect.poll`, added `+` click test and a collision-guard that runtime-injects `width: 100% !important` on Tabs.List.
- Bot-review follow-up: x-axis seam test now polls for steady state (panel open/close runs a 240ms transition that the drag-only gate doesn't suppress); toggle test normalizes `aria-expanded` to false before sampling closed state.

**Regression tests** (`packages/app/e2e/session/titlebar-right-rail-contract.spec.ts`, 8 total)
- `border-l aligns horizontally with right-panel body` — pre-fix #1: `Received: 8`; post: passes
- `toggle keeps the same x-position across open/close` — pre-fix #2: `Received: 229.25`; post: <1px
- `+ add-tab button stays clickable and does not close the panel`
- `+ stays clickable even if Tabs.List is forced to fill the slot (collision guard)`
- `data-resizing-right-panel attribute does not leak when viewport drops below the desktop breakpoint mid-resize state`
- Plus 3 pre-existing tabs-rail contract tests

**Verification**
- `bun run test:e2e:local titlebar-right-rail-contract.spec.ts` → 8/8 pass
- `bun run snap right-panel-titlebar` → pass, no visual regression
- `bun run lint` → clean
- `bun --cwd packages/app run typecheck` → clean
- Manual Electron (macOS): toggle stable across open/close, drag-resize seam moves as one continuous line
- CI: 27/27 green

**Follow-up tracked separately**
- #889 — `session-side-panel.tsx` is 604 lines and growing; refactor scheduled in a dedicated PR.

**Risk notes**
- Fix #1 macOS-only; fixes #2 and #3 affect both platforms.
- Closed-state toggle inset changes from 16px to 8px on macOS, matching native toolbar conventions.
- Open-state 240ms toggle slide is removed; body width still animates as before.
- No `.github/workflows/` touched.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant