mirror of
https://github.com/oven-sh/bun
synced 2026-02-02 15:08:46 +00:00
fix: handle Darwin accept() bug with socklen=0 in uSockets (#21791)
## Summary

Fixes a macOS kernel (XNU) bug where `accept()` can return a valid socket descriptor but with `addrlen=0`, indicating an already-dead socket. This occurs when an IPv4 connection to an IPv6 dual-stack listener is immediately aborted (RST packet).

The fix detects this condition on Darwin and handles it by preserving buffered data when present and discarding truly dead sockets otherwise.

## Background

This implements the equivalent of the bugfix from capnproto: https://github.com/capnproto/capnproto/pull/2365

The issue manifests as:
1. IPv4 connection made to IPv6 dual-stack listener
2. Connection immediately aborted (sends RST packet)
3. `accept()` returns valid socket descriptor but `addrlen=0`
4. Socket may have buffered data from `connectx()` or be truly dead

## Enhanced Data-Preserving Solution

Unlike simple "close immediately" approaches, this fix **prevents data loss** from the `connectx()` edge case:

**Race Condition Scenario:**
1. Client uses `connectx()` to send data immediately during connection
2. Network abort (RST) occurs after data is buffered but before full connection establishment
3. Darwin kernel returns `socklen=0` but socket has buffered data
4. **Our fix preserves this data instead of losing it**

**Logic:**
```c
if (addr->len == 0) {
    /* Check if there's any pending data before discarding the socket */
    char peek_buf[1];
    ssize_t has_data = recv(accepted_fd, peek_buf, 1, MSG_PEEK | MSG_DONTWAIT);

    if (has_data <= 0) {
        /* No data available, socket is truly dead - discard it */
        bsd_close_socket(accepted_fd);
        continue; /* Try to accept the next connection */
    }
    /* If has_data > 0, let the socket through - there's buffered data to read */
}
```

## XNU Kernel Source Analysis

After investigating the Darwin XNU kernel source code, I found that this bug affects **multiple system calls**, not just `accept()`. The bug is rooted in the kernel's socket layer when protocol-specific functions return NULL socket addresses.
### Affected System Calls

#### 1. accept() and accept_nocancel() ✅ FIXED

**Location:** [`/bsd/kern/uipc_syscalls.c:596-605`](https://github.com/apple/darwin-xnu/blob/main/bsd/kern/uipc_syscalls.c#L596-L605)

```c
(void) soacceptlock(so, &sa, 0);
socket_unlock(head, 1);
if (sa == NULL) {
    namelen = 0;  // ← BUG: Returns socklen=0
    if (uap->name) {
        goto gotnoname;
    }
    error = 0;
    goto releasefd;
}
```

#### 2. getsockname() ⚠️ ALSO AFFECTED

**Location:** [`/bsd/kern/uipc_syscalls.c:2601-2603`](https://github.com/apple/darwin-xnu/blob/main/bsd/kern/uipc_syscalls.c#L2601-L2603)

```c
if (sa == 0) {
    len = 0;  // ← SAME BUG: Returns socklen=0
    goto gotnothing;
}
```

#### 3. getpeername() ⚠️ ALSO AFFECTED

**Location:** [`/bsd/kern/uipc_syscalls.c:2689-2691`](https://github.com/apple/darwin-xnu/blob/main/bsd/kern/uipc_syscalls.c#L2689-L2691)

```c
if (sa == 0) {
    len = 0;  // ← SAME BUG: Returns socklen=0
    goto gotnothing;
}
```

### System Calls NOT Affected

#### connect() and connectx() ✅ SAFE

**Locations:**
- [`/bsd/kern/uipc_syscalls.c:686-744`](https://github.com/apple/darwin-xnu/blob/main/bsd/kern/uipc_syscalls.c#L686-L744) (connect)
- [`/bsd/kern/uipc_syscalls.c:747+`](https://github.com/apple/darwin-xnu/blob/main/bsd/kern/uipc_syscalls.c#L747) (connectx)

**Why they're safe:** These functions read socket addresses from userspace via `getsockaddr()` and pass them to the protocol layer. They don't receive socket addresses from the network stack, so they can't encounter the `socklen=0` condition.

### Root Cause

The bug occurs when protocol layer functions (`pru_accept`, `pru_sockaddr`, `pru_peeraddr`) return NULL socket addresses during IPv4→IPv6 dual-stack connection race conditions. The kernel returns `socklen=0` instead of treating it as an error case.
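
Because `getpeername()` can report the same zero-length address, callers that look up a peer address may want to treat `len == 0` as "address unknown" rather than reading the buffer. The following is a minimal sketch of such a guard, not code from this PR; the helper name `peer_address_known` and its return convention are assumptions made for illustration.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of uSockets): fetch the peer address of `fd`
 * and report whether the kernel actually filled one in.
 * Returns  1 if an address was returned,
 *          0 if the call succeeded but the reported length is zero,
 *         -1 if getpeername() itself failed (errno is set by the call). */
int peer_address_known(int fd, struct sockaddr_storage *out, socklen_t *out_len) {
    socklen_t len = sizeof(*out);

    /* Start from a zeroed buffer so a short or empty result is never
     * mistaken for a stale address. */
    memset(out, 0, sizeof(*out));
    *out_len = 0;

    if (getpeername(fd, (struct sockaddr *) out, &len) != 0) {
        return -1;
    }

    *out_len = len;
    return len > 0 ? 1 : 0;
}
```

A caller would branch on the result and skip any address formatting or logging when it is `0`, instead of interpreting the zeroed buffer as a real address.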

**Key XNU source reference:** [`/bsd/kern/uipc_socket.c:1544`](https://github.com/apple/darwin-xnu/blob/main/bsd/kern/uipc_socket.c#L1544)

```c
error = (*so->so_proto->pr_usrreqs->pru_accept)(so, nam);
```

**Socket state vs buffered data:** From [`/bsd/kern/uipc_socket2.c:2227`](https://github.com/apple/darwin-xnu/blob/main/bsd/kern/uipc_socket2.c#L2227):

```c
// Even with SS_CANTRCVMORE set, data can be buffered in so->so_rcv.sb_cc
return so->so_rcv.sb_cc >= so->so_rcv.sb_lowat ||
    ((so->so_state & SS_CANTRCVMORE) && cfil_sock_data_pending(&so->so_rcv) == 0)
```

## Changes

- Added Darwin-specific check in `bsd_accept_socket()` in `packages/bun-usockets/src/bsd.c:708-720`
- When `addr->len == 0` after successful `accept()`:
  1. Check for buffered data with `recv(MSG_PEEK | MSG_DONTWAIT)`
  2. If data exists, let socket through normally (prevents data loss)
  3. If no data, close socket and continue accepting
- Only applies to `__APPLE__` builds to avoid affecting other platforms

## Test plan

- [x] Debug build compiles successfully
- [x] Basic HTTP server operations work correctly (exercises accept path)
- [x] Regression test covers IPv4→IPv6 dual-stack connection abort scenarios
- [x] Test verifies server doesn't crash/hang when encountering socklen=0 condition
- [x] Enhanced fix preserves buffered data from connectx() edge cases

The regression test (`test/regression/issue/darwin-accept-socklen-zero.test.ts`) creates the exact conditions that trigger this kernel bug:
1. IPv6 dual-stack server (`hostname: "::"`)
2. IPv4 connections (`127.0.0.1`) with immediate abort (RST packets)
3. Concurrent connection attempts to maximize race condition probability
4. Verification that server remains stable and responsive

## Impact Assessment

### For Bun's uSockets Implementation

- **accept() path:** ✅ FIXED with data loss prevention - This PR handles the primary case affecting network servers
- **connect() path:** ✅ NOT VULNERABLE - connect() doesn't receive kernel sockaddrs
- **connectx() path:** ✅ NOT VULNERABLE - connectx() doesn't receive kernel sockaddrs
- **connectx() data:** ✅ PRESERVED - Enhanced fix prevents losing buffered data from immediate sends

### Additional Considerations

While `getsockname()` and `getpeername()` have the same kernel bug, they're less critical for server stability since servers primarily use `accept()` for incoming connections.

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
@@ -717,6 +717,25 @@ LIBUS_SOCKET_DESCRIPTOR bsd_accept_socket(LIBUS_SOCKET_DESCRIPTOR fd, struct bsd
            return LIBUS_SOCKET_ERROR;
        }

#ifdef __APPLE__
        /* A bug in XNU (the macOS kernel) can cause accept() to return a socket but addrlen=0.
         * This happens when an IPv4 connection is made to an IPv6 dual-stack listener
         * and the connection is immediately aborted (sends RST packet).
         * However, there might be buffered data from connectx() before the abort. */
        if (addr->len == 0) {
            /* Check if there's any pending data before discarding the socket */
            char peek_buf[1];
            ssize_t has_data = recv(accepted_fd, peek_buf, 1, MSG_PEEK | MSG_DONTWAIT);

            if (has_data <= 0) {
                /* No data available, socket is truly dead - discard it */
                bsd_close_socket(accepted_fd);
                continue; /* Try to accept the next connection */
            }
            /* If has_data > 0, let the socket through - there's buffered data to read */
        }
#endif

        break;
    }
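
The peek in the hunk above relies on `recv()` with `MSG_PEEK | MSG_DONTWAIT` returning a positive count only when bytes are already buffered. The sketch below isolates that check so its three outcomes are easy to see. It is an illustration, not a uSockets function, and the name `has_buffered_data` is an assumption.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not a uSockets function): returns 1 if at least one
 * byte is already buffered on `fd`, 0 otherwise.
 * MSG_PEEK leaves the byte in the receive buffer for the real reader;
 * MSG_DONTWAIT keeps the call from blocking when the buffer is empty. */
int has_buffered_data(int fd) {
    char peek_buf[1];
    ssize_t n = recv(fd, peek_buf, sizeof(peek_buf), MSG_PEEK | MSG_DONTWAIT);

    /* n  > 0: data is waiting to be read.
     * n == 0: the peer closed and nothing is buffered.
     * n  < 0: error, including EAGAIN/EWOULDBLOCK when nothing has arrived yet. */
    return n > 0;
}
```

On an ordinary, healthy connection a negative result with `EAGAIN` only means no data has arrived yet, so this check is not a general liveness test. It is used here only after `addr->len == 0` has already marked the socket as suspect, which is why the patch treats every non-positive result as dead.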