fix(locks): reclaim stale session locks across reboot #30525
liuxiaopai-ai wants to merge 1 commit into openclaw:main
Nice approach using boot ID to solve the PID reuse problem! This is a classic issue with file-based locking that I've run into before. A few thoughts on the implementation:

1. Cross-platform boot ID sourcing

One mitigation might be to combine boot_id with a container-specific identifier when available:

    const containerId = process.env.HOSTNAME || process.env.CONTAINER_ID;
    const effectiveBootId = containerId ? `${bootId}:${containerId}` : bootId;

2. macOS behavior

3. Lock file format versioning

    interface LockPayload {
      version: 2; // bumped from implicit v1
      pid: number;
      createdAt: number;
      bootId?: string;
    }

4. Edge case: boot_id changes mid-session

5. Testing suggestion

Overall LGTM - the boot_id approach is much more robust than PID+age alone. The edge cases I mentioned are mostly theoretical and the fallback behavior handles them gracefully.
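To make the container-aware boot ID suggestion concrete, here is a minimal sketch. It assumes the boot ID comes from the standard Linux file /proc/sys/kernel/random/boot_id; the helper names readBootId and composeEffectiveBootId are hypothetical and not taken from the PR. On platforms where the file is missing, the sketch returns undefined so callers can fall back to the existing PID + age check.

```typescript
import { readFileSync } from "node:fs";

// Hypothetical helper (not from the PR): read the Linux boot ID.
// Returns undefined when the file is missing or unreadable, e.g. on macOS.
function readBootId(
  path: string = "/proc/sys/kernel/random/boot_id",
): string | undefined {
  try {
    const raw = readFileSync(path, "utf8").trim();
    return raw.length > 0 ? raw : undefined;
  } catch {
    return undefined;
  }
}

// Hypothetical helper: append a container identifier when one is available,
// following the HOSTNAME / CONTAINER_ID idea from the review comment.
function composeEffectiveBootId(
  bootId: string | undefined,
  env: Record<string, string | undefined> = process.env,
): string | undefined {
  if (!bootId) return undefined;
  const containerId = env.HOSTNAME || env.CONTAINER_ID;
  return containerId ? `${bootId}:${containerId}` : bootId;
}
```

Passing the environment as a parameter keeps the composition step easy to test without touching process.env.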
Greptile Summary
Adds Linux boot ID tracking to session lock files to solve the PID reuse problem after hard reboots. Lock payloads now include bootId.
The change is well-scoped and doesn't modify scheduler/session routing logic. Test coverage includes both acquisition and cleanup flows, verifying that mismatched boot IDs trigger reclamation while matching boot IDs preserve the lock.
Confidence Score: 5/5
Last reviewed commit: d01660f
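The staleness decision described in this summary could look roughly like the sketch below. It is not the PR's code: the function inspectLock, the context shape, and the reason strings "dead-pid" and "too-old" are assumptions made for illustration; only the boot-id-mismatch reason and the pid, createdAt, and bootId fields come from the PR text. The liveness probe uses process.kill(pid, 0), which sends no signal and only checks whether the process exists.

```typescript
interface LockPayload {
  pid: number;
  createdAt: number; // epoch milliseconds
  bootId?: string; // absent in lock files written before this change
}

// Only "boot-id-mismatch" is named in the PR; the others are illustrative.
type StaleReason = "boot-id-mismatch" | "dead-pid" | "too-old";

interface InspectContext {
  currentBootId?: string;
  isPidAlive: (pid: number) => boolean;
  now: number;
  maxAgeMs: number;
}

// Signal 0 performs the existence check without delivering a signal.
// EPERM means the process exists but belongs to another user.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    return (err as { code?: string }).code === "EPERM";
  }
}

// Returns a reason when the lock is stale, or null when it looks active.
function inspectLock(
  payload: LockPayload,
  ctx: InspectContext,
): StaleReason | null {
  // A differing boot ID means the holder cannot have survived the reboot,
  // even if its PID has since been reused by an unrelated process.
  if (
    payload.bootId &&
    ctx.currentBootId &&
    payload.bootId !== ctx.currentBootId
  ) {
    return "boot-id-mismatch";
  }
  if (!ctx.isPidAlive(payload.pid)) return "dead-pid";
  if (ctx.now - payload.createdAt > ctx.maxAgeMs) return "too-old";
  return null;
}
```

Checking the boot ID first means a reused PID with a fresh timestamp is still reclaimed, while lock files without a bootId fall through to the older PID + age checks.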
Thanks for the PR! Multiple PRs address stale lock recovery. Keeping #29118 as the earliest and most scoped submission. Closing this one to reduce noise. This is an AI-assisted triage review. If we got this wrong, feel free to reopen or start a new PR; happy to revisit.
Summary
Describe the problem and fix in 2–5 bullets:
- Stale-lock detection relied on pid liveness + age, so after a hard reboot a reused PID could make a stale lock look active.
- Users could hit "session file locked" even though no real OpenClaw process held the lock.
- Lock payloads now include bootId; lock inspection marks locks stale on boot-id-mismatch; stale cleanup and lock acquisition reclaim those locks.
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
Security Impact (required)
- (Yes/No) No
- (Yes/No) No
- (Yes/No) No
- (Yes/No) No
- (Yes/No) No
- If Yes, explain risk + mitigation:
Repro + Verification
Environment
Steps
- Create a lock file with a pid, fresh createdAt, and mismatched bootId.
- Call acquireSessionWriteLock for the same session path.
Expected
Actual
Evidence
Attach at least one:
pnpm vitest src/agents/session-write-lock.test.ts src/commands/doctor-session-locks.test.ts
✓ src/agents/session-write-lock.test.ts (17 tests)
✓ src/commands/doctor-session-locks.test.ts (2 tests)
Human Verification (required)
What you personally verified (not just CI), and how:
Compatibility / Migration
- (Yes/No) Yes
- (Yes/No) No
- (Yes/No) No
Failure Recovery (if this breaks)
- Commit to revert: d01660fb13.
- Files to restore: src/agents/session-write-lock.ts, src/agents/session-write-lock.test.ts, CHANGELOG.md.
Risks and Mitigations
List only real risks for this PR. Add/remove entries as needed. If none, write None.