Commit Graph

25 Commits

Author SHA1 Message Date
Claude Bot
6273180892 fix(autokill): continue all passes even when enumeration fails
Remove early returns when children.len == 0 that were aborting the
entire autokill sequence. Now treat empty slices as "nothing to do this
pass" and continue to the next pass, allowing later passes to retry
enumeration and send SIGSTOP/SIGKILL as intended.

This fixes the case where enumeration failures (which return empty
slices) would prevent subsequent passes from running, potentially
leaving processes alive.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-07 05:53:32 +00:00
Claude Bot
88e1f84317 fix(autokill): use direct Linux syscalls for musl compatibility
Use std.os.linux.kill() directly on Linux instead of std.c.kill() to
bypass potential musl libc issues where processes were not being killed
properly during the three-pass autokill sequence.

Tests were failing on musl/Alpine showing processes still alive after
autokill. Direct syscalls avoid any libc wrapper issues.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-07 05:46:40 +00:00
Claude Bot
a8b8294cb4 docs: Document three-pass kill strategy rationale
Added comprehensive documentation explaining the three-pass approach:

Why Three Passes:
1. SIGTERM first: Allows graceful cleanup (close files, flush buffers)
   - Most processes (60-80%) exit here
   - Early bailout makes this faster in practice

2. SIGSTOP second: Freeze survivors to prevent reparenting races
   - Maintains the race-safety from original two-pass design
   - Fresh enumeration catches any children spawned after SIGTERM

3. SIGKILL third: Force termination of frozen processes
   - Ensures nothing survives
   - Cannot be caught or ignored

The three-pass strategy is superior to the original two-pass (SIGSTOP→SIGKILL):
- More graceful: Allows cleanup handlers
- Better performance: Early bailout when processes respect SIGTERM
- Still safe: SIGSTOP prevents races
- More thorough: Fresh enumeration between each pass

All 13 tests pass. Strategy was explicitly requested during implementation.
2025-10-07 05:32:35 +00:00
Claude Bot
dc728ebeea fix: Use try-then-fallback approach for musl/Alpine compatibility
Changed from hardcoded musl detection to graceful fallback:
- Try /proc/{pid}/task/{tid}/children first (fast path)
- If it fails for ANY reason, fall back to /proc scanning
- Works on all systems: glibc, musl, old kernels, Alpine, etc.

Why this is better:
- No assumptions about musl behavior
- Automatically handles kernel differences
- /proc/{pid}/task/{tid}/children works on Alpine if available
- Falls back gracefully if not
- Simpler code without platform-specific branches

This should fix the musl test failures by letting the system tell us
what works rather than guessing based on libc implementation.
2025-10-07 04:33:53 +00:00
Claude Bot
b511d93fe2 perf: Optimize three-pass strategy with fresh enumeration and early bailout
Improvements:
1. Reduced delay from 10ms to 500 microseconds (20x faster)
   - Still enough time for processes to handle SIGTERM
   - Much faster exit in the common case

2. Fresh child enumeration per pass
   - Don't keep stale PIDs around between passes
   - Each pass gets current state of process tree
   - Scoped blocks ensure immediate cleanup of allocations

3. Early bailout optimization
   - Return immediately if no children found in Pass 1
   - Skip Pass 2 if all children exited from SIGTERM
   - Skip Pass 3 if all children exited from SIGSTOP

Benefits:
- Faster: 500us delay instead of 10ms
- More accurate: Fresh child list each pass
- More efficient: Skip unnecessary passes when children exit early
- Cleaner memory: Scoped blocks free resources immediately
2025-10-07 00:24:34 +00:00
Claude Bot
58d2abc593 feat: Implement three-pass kill strategy for graceful cleanup
Changed from two-pass to three-pass strategy:

Pass 1: SIGTERM - Give processes a chance to handle graceful cleanup
  - Allows signal handlers to run
  - 10ms delay to process cleanup work

Pass 2: SIGSTOP - Freeze remaining processes to prevent reparenting
  - Stops processes that didn't exit from SIGTERM
  - Prevents race conditions during tree traversal

Pass 3: SIGKILL - Force termination of any remaining processes
  - Ensures all processes are killed
  - Cannot be caught or ignored

Benefits:
- More graceful: Processes get a chance to clean up (close files, flush buffers)
- More reliable: SIGSTOP prevents reparenting races
- Still thorough: SIGKILL ensures nothing escapes

The enum-based approach (.sigterm, .sigstop, .sigkill) is clearer than the
previous boolean 'stop_only' parameter.
2025-10-07 00:15:18 +00:00
Claude Bot
fe61519b49 fix: Continue killing process even when child enumeration fails
In killProcessTreeRecursive, if getChildPids fails (e.g., transient /proc read
failures, race conditions), we were returning early without sending any signals
to that PID, allowing that branch of the process tree to survive.

Now we treat enumeration failures as "no children" and continue to send
SIGSTOP/SIGTERM/SIGKILL to the process itself. This ensures we always attempt
to kill the process even if we can't enumerate its children, making autokill
more reliable under adverse conditions.
2025-10-06 17:16:31 +00:00
Claude Bot
21a1a4bfcd fix: Consistent error handling - fall back to /proc scan on read failures
Previously, if /proc/{pid}/task/{tid}/children could be opened but not read
(e.g., unreadable or >4096 bytes), we returned an empty list instead of
falling back to the /proc scan. This was inconsistent with file-open failures
which properly fell back.

Now all read failures trigger the same fallback path, ensuring we always
attempt to find children even when the fast path fails.
2025-10-06 17:02:36 +00:00
Claude Bot
3a809203a3 fix: Use std.posix.SIG for truly portable signal constants
CRITICAL FIX: bun.SignalCode enum has hardcoded Linux signal values, which
breaks on macOS where signals have different numbers:
- Linux: SIGSTOP=19, SIGTERM=15, SIGKILL=9
- macOS: SIGSTOP=17, SIGTERM=15, SIGKILL=9

Changed from:
- @intFromEnum(bun.SignalCode.SIGSTOP) // Always 19
To:
- std.posix.SIG.STOP // 17 on macOS, 19 on Linux

std.posix.SIG provides platform-specific signal constants that automatically
use the correct values for each OS via the Zig standard library's platform
detection. This ensures:
1. SIGSTOP actually freezes processes on macOS (not SIGTSTP which is catchable)
2. The two-pass freeze-then-kill strategy works correctly on all platforms
3. No platform-specific conditional code needed
2025-10-06 16:47:32 +00:00
Claude Bot
09e62f6877 fix: Use portable signal constants instead of hardcoded numbers
Replaced hardcoded signal numbers (19, 15, 9) with bun.SignalCode enum:
- SIGSTOP (was 19): Correct on all platforms, avoiding Linux-specific assumptions
- SIGTERM (was 15): Consistent across platforms
- SIGKILL (was 9): Guaranteed termination signal

This fixes a critical bug on macOS where signal 19 is SIGTSTP (stoppable via
signal handler) instead of SIGSTOP, which would break the freeze-then-kill
two-pass strategy. Using bun.SignalCode ensures correct signal values across
all supported platforms.
2025-10-06 16:33:03 +00:00
Claude Bot
58c3b6bcd9 fix: Improve autokill robustness and error handling
1. Allow pass 2 to continue even if pass 1 fails: Changed early returns to
   return empty slices so that if child enumeration fails in pass 1, we still
   attempt termination in pass 2. This makes the two-pass strategy more robust.

2. Better allocation failure handling: When allocPrint fails in getChildPids,
   fall back to /proc scanning instead of returning an empty list. This
   distinguishes between "no children" and "allocation failed".

3. Document macOS buffer limitation: Added comment explaining the 2048 child
   limit on macOS and why it's acceptable for autokill's use case.

These changes ensure autokill attempts termination even under adverse
conditions like memory pressure or /proc read errors.
2025-10-06 16:17:09 +00:00
Claude Bot
6f72ad90ae fix: Implement two-pass SIGSTOP/SIGKILL strategy
The code now properly implements the two-pass strategy:
1. Pass 1: Send SIGSTOP to freeze the entire process tree and minimize reparenting races
2. Pass 2: Send SIGTERM followed by SIGKILL to terminate all frozen processes

Each pass uses its own visited set (seen_stop, seen_kill) to track processed PIDs.
This prevents race conditions where child processes might reparent to init during cleanup.
2025-10-06 15:58:03 +00:00
Claude Bot
a1458ca9d8 fix: Address critical autokill review issues
- Only call autokill on main thread to avoid killing children during worker teardown
- Remove process group kill (-pid) that was killing Bun itself before shutdown
- Add Windows platform guards to skip Unix-specific autokill tests

Fixes:
1. VirtualMachine.onExit now checks isMainThread() before calling killAllChildProcesses
2. autokill.zig no longer uses kill(-pid) which killed Bun's own process
3. test/cli/autokill.test.ts now skips on Windows with describe.skipIf(isWindows)
2025-10-06 15:40:05 +00:00
autofix-ci[bot]
3f6ca7271a [autofix.ci] apply automated fixes 2025-10-06 15:10:27 +00:00
Claude Bot
8f2b66242a Merge main into claude/autokill-flag
Resolved conflicts:
- src/cli/Arguments.zig: Added both --autokill and --user-agent flags, merged CA store logic
- src/sys.zig: Imported both autokill and PosixStat modules
- cmake/sources/ZigSources.txt: Removed as deleted in main
2025-10-06 15:01:06 +00:00
robobun
46e7a3b3c5 Implement birthtime support on Linux using statx syscall (#23209)
## Summary

- Adds birthtime (file creation time) support on Linux using the `statx`
syscall
- Stores birthtime in architecture-specific unused fields of the kernel
Stat struct (x86_64 and aarch64)
- Falls back to traditional `stat` on kernels < 4.11 that don't support
`statx`
- Includes comprehensive tests validating birthtime behavior

Fixes #6585

## Implementation Details

**src/sys.zig:**
- Added `StatxField` enum for field selection
- Implemented `statxImpl()`, `fstatx()`, `statx()`, and `lstatx()`
functions
- Stores birthtime in unused padding fields (architecture-specific for
x86_64 and aarch64)
- Graceful fallback to traditional stat if statx is not supported

**src/bun.js/node/node_fs.zig:**
- Updated `stat()`, `fstat()`, and `lstat()` to use statx functions on
Linux

**src/bun.js/node/Stat.zig:**
- Added `getBirthtime()` helper to extract birthtime from
architecture-specific storage

**test/js/node/fs/fs-birthtime-linux.test.ts:**
- Tests non-zero birthtime values
- Verifies birthtime immutability across file modifications
- Validates consistency across stat/lstat/fstat
- Tests BigInt stats with nanosecond precision
- Verifies birthtime ordering relative to other timestamps

## Test Plan

- [x] Run `bun bd test test/js/node/fs/fs-birthtime-linux.test.ts` - all
5 tests pass
- [x] Compare behavior with Node.js - identical behavior
- [x] Compare with system Bun - system Bun returns epoch, new
implementation returns real birthtime

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-04 04:57:29 -07:00
Michael H
ba20670da3 implement pnpm migration (#22262)
### What does this PR do?

fixes #7157, fixes #14662

migrates pnpm-workspace.yaml data to package.json & converts
pnpm-lock.yml to bun.lock

---

### How did you verify your code works?

manually, tests and real world examples

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com>
2025-09-27 00:45:29 -07:00
taylor.fish
437e15bae5 Replace catch bun.outOfMemory() with safer alternatives (#22141)
Replace `catch bun.outOfMemory()`, which can accidentally catch
non-OOM-related errors, with either `bun.handleOom` or a manual `catch
|err| switch (err)`.

(For internal tracking: fixes STAB-1070)

---------

Co-authored-by: Dylan Conway <dylan.conway567@gmail.com>
2025-08-26 12:50:25 -07:00
Claude Bot
84bcb5fe4b Fix autokill implementation for musl compatibility
Improve the autokill process killing mechanism to be more reliable on musl systems:

1. Use process group signals first (kill(-pid, signal)) to catch processes
   that may not be properly detected via proc filesystem
2. Apply both SIGTERM and SIGKILL with timing to ensure termination
3. Process the kill tree depth-first to avoid race conditions
4. Add fallback mechanisms for different libc implementations

This should resolve the failing tests where child processes were not being
properly terminated on musl systems.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-24 08:14:23 +00:00
Claude Bot
f8f971d3e6 optimize: Remove redundant getpid() calls from macOS autokill code
Completed the optimization to call getpid() only once by:
- Updated getChildPids() to accept current_pid parameter
- Updated getChildPidsFallback() to accept current_pid parameter
- Removed redundant getpid() call in macOS-specific code path
- Verified proper libproc function usage (proc_listpids with PROC_PPID_ONLY)

All cross-platform builds pass including macOS. The macOS implementation
continues to use the proper libproc APIs as intended.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-21 05:06:54 +00:00
Claude Bot
8b83b72883 fix: Implement proper SIGSTOP/SIGKILL sequence for autokill
Fixed the autokill implementation to use the correct two-pass approach:
1. First pass: SIGSTOP all processes in the tree to freeze them
2. Second pass: SIGKILL all processes to actually terminate them

This prevents race conditions where killing a parent first could cause
children to be reparented to init and become harder to track.

Also optimized to only call getpid() once and pass the current PID
down through the recursive calls instead of calling it repeatedly.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-21 04:53:17 +00:00
Claude Bot
48afc83936 fix: Make autokill functionality work on Linux and improve tests
This commit fixes the autokill flag implementation and tests to work
properly on Linux systems:

## Changes Made

### autokill.zig
- Fixed deprecated `std.mem.tokenize` usage by replacing with `std.mem.tokenizeAny`
- Improved `killProcessTreeRecursive` to kill parent first to prevent race conditions
- Added better error handling to prevent infinite loops
- Added validation for pid <= 0 to avoid invalid process operations

### autokill.test.ts
- Completely rewrote tests using proper test harness patterns
- Used `Bun.spawn` with `await using` pattern and `bunEnv` for consistency
- Created focused, reliable tests that don't have timing issues
- Added comprehensive test coverage:
  - Basic autokill flag functionality
  - Child process killing verification
  - Comparison with non-autokill behavior
  - Nested process handling

## Testing
All tests now pass reliably on Linux. The autokill functionality works
correctly for both direct children and nested process trees.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-21 04:42:49 +00:00
Jarred Sumner
f56e220148 feat: Add --autokill flag to recursively kill child processes on exit
This adds a new --autokill CLI flag that ensures all child processes spawned
by Bun are recursively terminated when the parent process exits. This prevents
orphaned processes from continuing to run after Bun exits.

## Implementation

The implementation is in Zig (src/sys/autokill.zig) and uses platform-specific
APIs to recursively walk and kill the process tree:

**macOS**: Uses `proc_listpids()` with `PROC_PPID_ONLY` flag
- Initially tried `proc_listchildpids()` but it returned malformed data (only 1
  byte even when children existed)
- `proc_listpids()` with parent PID filtering is more reliable

**Linux**: Tries `/proc/{pid}/task/{tid}/children` first (O(1) lookup)
- Falls back to scanning `/proc` for older kernels without this interface
- Much more efficient than scanning all processes

**Windows**: No-op, as Job Objects already handle child process cleanup

The autokill is triggered in both:
1. `Global.exit()` - catches normal exits
2. `VirtualMachine.onExit()` - catches early exits before cleanup hooks

## Known Limitations

1. **Race conditions**: Processes may spawn new children while we're iterating.
   Should consider using SIGSTOP to freeze the process tree before killing.

2. **Linux namespaces**: Should investigate using PID namespaces with
   CLONE_NEWPID for automatic kernel-level cleanup.

3. **Signal handling**: Currently sends SIGKILL immediately. Should consider
   SIGTERM first for graceful shutdown.

4. **Platform support**: Only macOS and Linux implemented. Windows relies on
   existing Job Object behavior.

## Tests

Added comprehensive test suite covering:
- Single and multiple child processes
- Nested process trees (grandchildren)
- Abnormal exits (crashes)
- Verification that processes aren't killed without the flag
- Platform-specific process types

All tests passing on macOS.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 04:55:19 -07:00
Alistair Smith
be5c69df79 fix: main is not readonly in @types/node (#21612)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-04 13:07:42 -07:00
Jarred Sumner
4494353abf Split up some of sys.zig into more files (#21603) 2025-08-04 07:02:06 -07:00