Fix macOS crash when accept() returns zero-length address #2389

Merged
danlapid merged 2 commits into v2 from dlapid/fixup_macos on Aug 20, 2025

danlapid commented Aug 17, 2025

Collaborator

As reported in cloudflare/workerd#4623

On macOS, accept() can return a valid file descriptor with addrlen == 0 when a connection is aborted during the accept (e.g., during "happy eyeballs" dual-stack connection attempts). This was causing workerd to crash with:

Fatal uncaught kj::Exception: kj/async-io.c++:3126:
failed: expected addrlen >= sizeof(addr->sa_family) [0 >= 1]

This fix:

  1. Checks for addrlen == 0 after accept() in async-io-unix.c++ and treats it as an aborted connection that should be retried. (With the given stress test code, I could reproduce the error before this change and could no longer reproduce it after.)
  2. Adds EINVAL to the list of acceptable errors when setting TCP_NODELAY on macOS, as it can be returned for non-TCP sockets. (I was consistently hitting this in local dev while trying to reproduce the crash.)
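
For illustration only, here is a minimal POSIX sketch of the two behaviours listed above. It is not the code in this PR or in KJ: every name in it (`AcceptOutcome`, `classifyAccept`, `acceptSkippingAborted`, `isIgnorableNoDelayError`, `tryEnableNoDelay`) is hypothetical. Closing the zero-address descriptor before retrying, and tolerating `EOPNOTSUPP` and `ENOPROTOOPT` alongside `EINVAL`, are assumptions rather than details taken from the change.

```cpp
#include <cerrno>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical names throughout; none of these come from KJ.
enum class AcceptOutcome { Accepted, RetryAborted, WouldBlock, Failed };

// Decide what to do with one accept() result. `err` is the errno value
// captured right after the call and is only consulted when fd < 0.
inline AcceptOutcome classifyAccept(int fd, socklen_t addrlen, int err) {
  if (fd >= 0) {
    // A valid descriptor with a zero-length peer address is the case described
    // above: treat it as a connection that was aborted during the accept.
    return addrlen == 0 ? AcceptOutcome::RetryAborted : AcceptOutcome::Accepted;
  }
  if (err == EINTR || err == ECONNABORTED) return AcceptOutcome::RetryAborted;
  if (err == EAGAIN || err == EWOULDBLOCK) return AcceptOutcome::WouldBlock;
  return AcceptOutcome::Failed;
}

// Accept one connection, skipping aborted ones. Returns the new descriptor,
// or -1 with errno set when nothing usable is available.
inline int acceptSkippingAborted(int listenFd) {
  for (;;) {
    sockaddr_storage addr{};
    socklen_t addrlen = sizeof(addr);
    int fd = ::accept(listenFd, reinterpret_cast<sockaddr*>(&addr), &addrlen);
    int err = fd < 0 ? errno : 0;
    switch (classifyAccept(fd, addrlen, err)) {
      case AcceptOutcome::Accepted:
        return fd;
      case AcceptOutcome::RetryAborted:
        if (fd >= 0) ::close(fd);  // assumption: discard the dead socket
        continue;                  // retry the accept() call
      case AcceptOutcome::WouldBlock:
      case AcceptOutcome::Failed:
        errno = err;
        return -1;
    }
  }
}

// Errors from setting TCP_NODELAY that this sketch treats as harmless.
// EINVAL is the one named in this PR; the other two are assumptions.
inline bool isIgnorableNoDelayError(int err) {
  return err == EINVAL || err == EOPNOTSUPP || err == ENOPROTOOPT;
}

// Try to enable TCP_NODELAY. Returns true on success or on an ignorable
// error (for example, a non-TCP socket), false on any other failure.
inline bool tryEnableNoDelay(int fd) {
  int one = 1;
  if (::setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) == 0) {
    return true;
  }
  return isIgnorableNoDelayError(errno);
}
```

Keeping the decision in a pure function (`classifyAccept`) means the zero-length-address branch can be checked without having to win the race against a real aborted connection.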

Fixes cloudflare/workerd#4623

🤖 Generated with Claude Code

I tried to write a test for this but could not, since it appears to be a race condition.
I was at least able to reproduce the issue locally and confirm that this change fixes it.

@danlapid danlapid requested a review from kentonv August 17, 2025 22:47
Comment thread c++/src/kj/async-io-unix.c++ Outdated
kentonv commented Aug 18, 2025

Member

Let's use #2365, it's a more polished fix and includes a test. It just needs a couple comments addressed on the test.

@danlapid danlapid force-pushed the dlapid/fixup_macos branch from a441886 to 395da92 Compare August 18, 2025 14:28
@danlapid

Collaborator Author

Pushed a commit that uses the exact code of https://github.com/capnproto/capnproto/pull/2365/files. There are no code changes on top of it yet, so the review comments are not yet addressed.

@danlapid danlapid force-pushed the dlapid/fixup_macos branch from 395da92 to 87ba999 Compare August 18, 2025 14:45
Comment thread c++/src/kj/async-io-test.c++
On macOS, accept() can return a valid file descriptor with addrlen=0 when a
connection is aborted during the accept (e.g., during "happy eyeballs" dual-stack
connection attempts). This is a bug in XNU (the macOS kernel).

This commit:
1. Adds special handling for zero-length address returns on macOS
2. Adds platform-specific checks for socket connection errors
3. Implements more robust error detection in socket accept logic
4. Includes comprehensive tests for aborted socket connections

The changes improve socket connection handling, particularly on macOS, by gracefully
managing scenarios where connections are aborted before being fully accepted.

Based on PR #2365 by Aaron O'Mullan
@danlapid danlapid force-pushed the dlapid/fixup_macos branch from 87ba999 to 85c8f90 Compare August 18, 2025 17:36
@danlapid danlapid requested a review from kentonv August 20, 2025 10:00

Development

Successfully merging this pull request may close these issues.

macOS: workerd crashes with dual-stack "happy eyeballs" HTTP requests

3 participants