
fix(core): clean up stale socket files before listening #34236

Merged
AgentEnder merged 1 commit into nrwl:master from brettburley:fix/stale-socket-cleanup
Feb 10, 2026

Conversation

@brettburley
Contributor

Current Behavior

When running Nx tasks in CI environments (e.g., Buildkite) where the host's /tmp is mounted to containers, intermittent EADDRINUSE errors occur in PseudoIPCServer.init(). This happens because:

  1. PseudoIPCServer doesn't clean up its Unix socket file before calling listen()
  2. ForkedProcessTaskRunner.createPseudoTerminal() instantiates PseudoTerminal directly instead of using the createPseudoTerminal() helper, bypassing shutdown callback registration

When a new container starts with the same PID as a previous run (PID recycling), it generates the same socket path and hits EADDRINUSE because the stale socket file still exists.
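The collision described above can be illustrated with a minimal sketch. The naming scheme below (`pseudo-ipc-<pid>.sock`) and the function `socketPathForPid` are hypothetical; the actual path format Nx uses is not shown in this PR. The only assumption taken from the description is that the path depends on the PID and lives in a directory shared across container runs.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical naming scheme: the real path format used by Nx is not shown in this PR.
function socketPathForPid(pid: number, tmpDir: string = os.tmpdir()): string {
  return path.join(tmpDir, `pseudo-ipc-${pid}.sock`);
}

// Two separate container runs that both happen to be assigned PID 42.
const sharedTmp = "/tmp"; // host directory mounted into every container
const firstRun = socketPathForPid(42, sharedTmp);
const secondRun = socketPathForPid(42, sharedTmp);

// Same PID + same mounted directory => same path, so a socket file left behind
// by the first run is still in the way when the second run calls listen().
console.log(firstRun === secondRun); // true
```

Because the directory outlives the container, nothing removes the first run's file unless the process itself does so on shutdown or the next process does so on startup.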

Expected Behavior

No EADDRINUSE errors should occur. The PseudoIPCServer should defensively remove any stale socket file before attempting to listen, similar to how the daemon server handles this.
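A minimal sketch of the "remove before listen" approach, assuming a POSIX system and Node's `fs` and `net` modules. The names `removeStaleSocketFile` and `listenOnCleanSocket` are hypothetical and are not the helpers added by this PR, whose implementation is not shown here.

```typescript
import * as fs from "node:fs";
import * as net from "node:net";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper; the PR's actual cleanupSocketFile() is not shown here.
// Returns true if a leftover file was removed, false if there was nothing to remove.
function removeStaleSocketFile(socketPath: string): boolean {
  try {
    fs.unlinkSync(socketPath);
    return true;
  } catch (err) {
    // A missing file is the normal case; any other error is rethrown.
    if ((err as NodeJS.ErrnoException).code === "ENOENT") return false;
    throw err;
  }
}

// Remove any leftover file, then bind. Resolves once the server is listening.
function listenOnCleanSocket(socketPath: string): Promise<net.Server> {
  removeStaleSocketFile(socketPath);
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once("error", reject);
    server.listen(socketPath, () => resolve(server));
  });
}

// Demonstrate the helper with an ordinary file standing in for a stale socket.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "stale-socket-demo-"));
const demoPath = path.join(demoDir, "demo.sock");
fs.writeFileSync(demoPath, "");
const removed = removeStaleSocketFile(demoPath); // true: the leftover file is gone
const removedAgain = removeStaleSocketFile(demoPath); // false: nothing left to remove
```

Unlinking unconditionally assumes that no live process still owns the path. That holds for the PID-recycling scenario in this PR, but a more cautious variant might first try to connect to the socket.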

Related Issue(s)

Fixes #34233

@brettburley brettburley requested a review from a team as a code owner January 27, 2026 19:19
@brettburley brettburley requested a review from Cammisuli January 27, 2026 19:19
@netlify

netlify Bot commented Jan 27, 2026

Deploy Preview for nx-docs ready!

Name Link
🔨 Latest commit f426083
🔍 Latest deploy log https://app.netlify.com/projects/nx-docs/deploys/698639b0d0717900084b9a28
😎 Deploy Preview https://deploy-preview-34236--nx-docs.netlify.app

@vercel

vercel Bot commented Jan 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
nx-dev Ready Ready Preview Jan 27, 2026 7:25pm


Member

@AgentEnder AgentEnder left a comment


This looks good to me, going to let @FrozenPandaz weigh in before merging

@nx-cloud
Contributor

nx-cloud Bot commented Feb 6, 2026

View your CI Pipeline Execution ↗ for commit 0b09a2c

| Command | Status | Duration | Result |
| --- | --- | --- | --- |
| nx affected --targets=lint,test,test-kt,build,e... | ✅ Succeeded | 48m 39s | View ↗ |
| nx run-many -t check-imports check-lock-files c... | ✅ Succeeded | 3m 2s | View ↗ |
| nx-cloud record -- nx-cloud conformance:check | ✅ Succeeded | 10s | View ↗ |
| nx-cloud record -- nx format:check | ✅ Succeeded | 2s | View ↗ |
| nx-cloud record -- nx sync:check | ✅ Succeeded | <1s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-02-09 16:06:05 UTC

Contributor

@nx-cloud nx-cloud Bot left a comment


Important

At least one additional CI pipeline execution has run since the conclusion below was written and it may no longer be applicable.

Nx Cloud has identified a possible root cause for your failed CI:

Our analysis shows both failures are environment-related and not caused by the PR's socket cleanup implementation. The e2e-workspace-create failure is a pre-existing vitest worker IPC issue (9.89% flaky, confirmed in master branch), while the e2e-gradle failure is an external Java toolchain registry timeout. Neither correlates with the Unix socket cleanup changes in this PR.

No code changes were suggested for this issue.

You can trigger a rerun by pushing an empty commit:

git commit --allow-empty -m "chore: trigger rerun"
git push

View detailed reasoning on Nx Cloud ↗


🎓 Learn more about Self-Healing CI on nx.dev

@brettburley brettburley force-pushed the fix/stale-socket-cleanup branch from 12add2a to f426083 Compare February 6, 2026 18:57
@netlify

netlify Bot commented Feb 6, 2026

👷 Deploy request for nx-dev pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit 0b09a2c

@brettburley
Contributor Author

Pushed to resolve merge conflicts.

Handles PID recycling in containers where a previous process
with the same PID left behind a socket file, causing EADDRINUSE errors.

- Added cleanupSocketFile() helper to remove stale socket files
- Updated PseudoIPCServer.init() to clean up before listening
- Refactored ForkedProcessTaskRunner to use createPseudoTerminal helper
- Added unit tests for socket cleanup behavior

Fixes nrwl#34233
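
The refactor to the `createPseudoTerminal` helper matters because, per the description, direct instantiation bypasses shutdown callback registration. The sketch below shows that pattern in isolation. `DemoTerminal`, `createDemoTerminal`, and `runShutdownCallbacks` are hypothetical stand-ins and not the Nx classes.

```typescript
type ShutdownCallback = () => void;

// Hypothetical stand-ins, not the Nx implementation.
const shutdownCallbacks: ShutdownCallback[] = [];

class DemoTerminal {
  closed = false;

  shutdown(): void {
    // In a real terminal this is where the IPC server would be closed
    // and its socket file removed.
    this.closed = true;
  }
}

// Factory: constructs the terminal and registers its cleanup in one step.
function createDemoTerminal(): DemoTerminal {
  const terminal = new DemoTerminal();
  shutdownCallbacks.push(() => terminal.shutdown());
  return terminal;
}

// Called once when the process is about to exit.
function runShutdownCallbacks(): void {
  for (const callback of shutdownCallbacks) callback();
}

const viaFactory = createDemoTerminal();
const direct = new DemoTerminal(); // bypasses registration

runShutdownCallbacks();
console.log(viaFactory.closed, direct.closed); // true false
```

The directly constructed instance is never cleaned up, which is how a socket file can be left behind for the next run to trip over.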
@AgentEnder AgentEnder force-pushed the fix/stale-socket-cleanup branch from f426083 to 0b09a2c Compare February 7, 2026 21:02
@AgentEnder AgentEnder merged commit f5a7ea1 into nrwl:master Feb 10, 2026
20 checks passed
@brettburley brettburley deleted the fix/stale-socket-cleanup branch February 10, 2026 18:54
FrozenPandaz pushed a commit that referenced this pull request Feb 13, 2026

(cherry picked from commit f5a7ea1)
@github-actions
Contributor

This pull request has already been merged/closed. If you experience issues related to these changes, please open a new issue referencing this pull request.

@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Feb 16, 2026

Development

Successfully merging this pull request may close these issues.

Intermittent EADDRINUSE errors in PseudoIPCServer when used in buildkite

2 participants