This PR consolidates all Docker-based test services into a centralized docker-compose
setup, eliminating per-test container spawning that was causing CI flakiness and
increased costs.
## Problem
- Tests were spawning individual Docker containers for each test suite
- Container startup race conditions caused intermittent test failures
- Docker resource exhaustion ("all predefined address pools have been fully subnetted")
- Redundant container creation increased CI runtime and costs
- No proper health checking led to "Connection refused" errors
## Solution
Implemented a unified docker-compose infrastructure with:
- Centralized service definitions for all test databases and services
- Proper health checks that block until services are ready
- Dynamic port allocation (except Autobahn which requires fixed port 9002)
- Reusable containers across test runs
- Smart health check configuration (only runs during startup, not continuously)
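With dynamic port allocation, the host port has to be discovered after the container starts. One way to do that is to read the output of `docker compose port <service> <private-port>`, which prints an address such as `0.0.0.0:49153`. The sketch below only shows the parsing step; `parseComposePort` and `ContainerAddress` are hypothetical names used for illustration and are not taken from the PR's `index.ts`.

```typescript
// Hypothetical helper (not from the PR): parse the single-line output of
// `docker compose port <service> <private-port>` into host and port.
interface ContainerAddress {
  host: string;
  port: number;
}

function parseComposePort(output: string): ContainerAddress {
  const line = output.trim().split("\n")[0] ?? "";
  // Use the last colon so bracketed IPv6 hosts like "[::]:5432" also work.
  const sep = line.lastIndexOf(":");
  if (sep <= 0) {
    throw new Error(`unexpected port output: ${JSON.stringify(output)}`);
  }
  const host = line.slice(0, sep).replace(/^\[|\]$/g, "");
  const port = Number(line.slice(sep + 1));
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`invalid port in output: ${JSON.stringify(output)}`);
  }
  return { host, port };
}
```

The parsed value has the same `{ host, port }` shape that the updated test harness passes to tests.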
## Changes
### New Docker Infrastructure (`test/docker/`)
- `docker-compose.yml`: Defines all test services with health checks
- `index.ts`: TypeScript helper for programmatic service management
- `config/`: Service configuration files (PostgreSQL auth, Autobahn config)
- `init-scripts/`: Database initialization scripts
- CI script for pre-pulling images to warm Docker cache
### Service Configurations
- **PostgreSQL** (3 variants): plain, TLS, auth
  - Health check: `pg_isready -U postgres`
  - Dynamic port mapping
  - Initialization scripts for test databases and users
- **MySQL** (3 variants): plain, native_password, TLS
  - Health check: `mysqladmin ping`
  - MySQL 8.0 for native_password (8.4 removed `--default-authentication-plugin`)
  - Fixed user creation issue in auth tests
- **Redis/Valkey** (2 variants): plain, unified (TLS + Unix socket)
  - Dynamic port mapping
  - Unix socket support for local connections
- **MinIO** (S3-compatible storage)
  - Automatic bucket creation
  - Dynamic port mapping for both API and console
- **Autobahn** (WebSocket compliance test suite)
  - Fixed port 9002 (required due to Host header validation)
  - FIXME added for future WebSocket Host header customization
### Test Harness Updates
- `describeWithContainer()` signature changed from `(port: number)` to `(container: { port: number; host: string })`
- Uses `beforeAll()` to ensure container readiness
- Integrates with docker-compose helper module
- Default project name: `bun-test-services`
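For test files written against the old callback shape, the signature change can be bridged with a small adapter. This is a minimal sketch, not code from the PR: `ContainerInfo`, `LegacyBody`, `ContainerBody`, and `adaptLegacyBody` are hypothetical names, and the real `describeWithContainer()` takes more arguments than are shown here.

```typescript
// Hypothetical types mirroring the signature change described above.
interface ContainerInfo {
  port: number;
  host: string;
}
type LegacyBody = (port: number) => void; // old shape: (port: number)
type ContainerBody = (container: ContainerInfo) => void; // new shape

// Wrap an old-style body so it can be passed where the new shape is expected.
function adaptLegacyBody(body: LegacyBody): ContainerBody {
  return (container) => body(container.port);
}

// New-style bodies can use the host as well, e.g. to build a connection URL.
function tcpUrl(scheme: string, container: ContainerInfo): string {
  return `${scheme}://${container.host}:${container.port}`;
}
```

Passing `host` alongside `port` means tests no longer have to assume the service is reachable on `localhost`.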
### Health Check Strategy
- Health checks configured with `interval: 1h` to effectively disable after startup
- `docker compose up --wait` blocks until services are healthy
- `start_period` gives services time to initialize before health checks begin
- Eliminates unreliable `sleep()` calls
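The blocking behaviour comes from the `--wait` flag, which makes `docker compose up` return only once the requested services are running or healthy (it also implies detached mode). A minimal sketch of how a helper might assemble that command line follows; `composeUpArgs`, the compose file path, and the service name in the test are assumptions for illustration, not the PR's actual helper.

```typescript
// Hypothetical helper: build the argv for `docker` so that the call blocks
// until health checks pass instead of relying on sleep().
function composeUpArgs(
  project: string,
  composeFile: string,
  services: string[],
): string[] {
  return [
    "compose",
    "-p", project, // shared project name so containers are reused
    "-f", composeFile,
    "up",
    "--wait", // block until services are running/healthy; implies -d
    ...services,
  ];
}
```

The resulting array can be handed to any process-spawning API with `docker` as the executable.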
## Test Results
All tests passing with new infrastructure:
- PostgreSQL: 792 tests ✅
- MySQL: 184 tests ✅
- Redis/Valkey: 304 tests ✅
- MinIO/S3: 276 tests ✅
- **Total: 1,556 tests passing**
## Performance Improvements
- Container startup: ~5-7s once per test run, compared with ~3-5s for every test suite previously
- Eliminated redundant container creation
- Reduced Docker network allocation pressure
- Tests run faster due to container reuse
## Migration Details
### PostgreSQL Tests
- All variants working (plain, TLS, auth)
- Proper user creation with different auth methods
- TLS certificates mounted correctly
### MySQL Tests
- Fixed "Operation CREATE USER failed" by adding DROP USER IF EXISTS
- Fixed permission issues (GRANT on correct database)
- Downgraded from MySQL 8.4 to 8.0 for native_password plugin support
### S3/MinIO Tests
- Replaced direct Docker spawning with docker-compose
- Automatic bucket creation via `mc` command in container
- 276 S3 tests passing without modifications
### Known Issues
- Autobahn WebSocket tests crash Bun (pre-existing bug: `ASSERTION FAILED: m_pendingActivityCount > 0`)
  - This is a Bun runtime issue, not related to Docker infrastructure
## Benefits
✅ Eliminates CI flakiness from container startup races
✅ Reduces CI costs through container reuse
✅ Consistent test environment across all runs
✅ Proper service readiness detection
✅ Easier local development (services persist between runs)
✅ No more Docker subnet exhaustion
## Testing
```bash
# Run individual test suites
bun bd test test/js/sql/sql-mysql.test.ts
bun bd test test/js/sql/sql-postgres.test.ts
bun bd test test/js/bun/s3/s3.test.ts
# All services use the same project
docker compose -p bun-test-services ps
```
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
## Summary
- Implements automated Windows code signing for x64 and x64-baseline
builds
- Integrates DigiCert KeyLocker for secure certificate management
- Adds CI/CD pipeline support for signing during builds
## Changes
- Added `.buildkite/scripts/sign-windows.sh` script for automated
signing
- Updated CMake configurations to support signing workflow
- Modified build scripts to integrate signing step
## Testing
- Script tested locally with manual signing process
- Successfully signed test binaries at:
- `C:\Builds\bun-windows-x64\bun.exe`
- `C:\Builds\bun-windows-x64-baseline\bun.exe`
## References
Uses DigiCert KeyLocker tools for Windows signing
## Next Steps
- Validate Buildkite environment variables in CI
- Test full pipeline in CI environment
---------
Co-authored-by: Jarred Sumner <jarred@bun.sh>
Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Various types have a `deepClone` method, but there are two different
signatures in use. Some types, like those in the `css` directory, have
an infallible `deepClone` method that cannot return an error. Others,
like those in `ast`, are fallible and can return `error.OutOfMemory`.
Historically, `BabyList.deepClone` has only worked with the fallible
kind of `deepClone`, necessitating the addition of
`BabyList.deepClone2`, which only works with the *in*fallible kind.
This PR:
* Updates `BabyList.deepClone` so that it works with both kinds of
method
* Updates `BabyList.deepClone2` so that it works with both kinds of
method
* Renames `BabyList.deepClone2` to `BabyList.deepCloneInfallible`
* Adds `bun.handleOom(...)`, which is like `... catch bun.outOfMemory()`
but it can't accidentally catch non-OOM-related errors
* Replaces an occurrence of `anyerror` with a more specific error set
(For internal tracking: fixes STAB-969, STAB-970)
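The difference between a blanket `catch` and `bun.handleOom(...)` can be illustrated outside Zig. The following is a TypeScript analogy only, not the Zig implementation: error sets are modelled as string-literal unions, `Result` and this `handleOom` are hypothetical stand-ins, and `throw` stands in for whatever `bun.outOfMemory()` does on the Zig side. The point it shows is that only the out-of-memory case is consumed, while every other error stays in the return type for the caller to handle.

```typescript
// Analogy only: Zig error sets modelled as string-literal unions.
type Result<T, E extends string> =
  | { ok: true; value: T }
  | { ok: false; error: E };

// Consume only "OutOfMemory"; every other error is passed through, and the
// returned type no longer includes "OutOfMemory".
function handleOom<T, E extends string>(
  result: Result<T, E>,
): Result<T, Exclude<E, "OutOfMemory">> {
  if (result.ok) return { ok: true, value: result.value };
  if (result.error === "OutOfMemory") {
    throw new Error("out of memory"); // stand-in for the OOM handler
  }
  return { ok: false, error: result.error as Exclude<E, "OutOfMemory"> };
}
```

A blanket handler would instead treat, say, a `"SyntaxError"` result as if it were an allocation failure, which is the mistake the new helper is meant to rule out.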
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
## Summary
- Replace cmake-based clang-format with dedicated bash script that
directly processes source files
- Optimize CI to only install clang-format-19 instead of entire LLVM
toolchain
- Script enforces specific clang-format version with no fallbacks
## Changes
1. **New bash script** (`scripts/run-clang-format.sh`):
   - Directly reads C++ files from `CxxSources.txt`
   - Finds all header files in `src/` and `packages/` directories
   - Respects existing `.clang-format` configuration files
   - Requires specific clang-format version (no fallbacks)
   - Defaults to format mode (modifies files in place)
2. **Optimized GitHub Action**:
   - Only installs `clang-format-19` package with `--no-install-recommends`
   - Avoids installing unnecessary components like manpages
   - Uses new bash script instead of cmake targets
3. **Updated package.json scripts**:
   - `clang-format`, `clang-format:check`, and `clang-format:diff` now use
     the bash script
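The "specific version, no fallbacks" rule amounts to parsing the output of `clang-format --version` and refusing to continue on a mismatch. The real check lives in the bash script; the sketch below expresses the same idea in TypeScript, with `parseClangFormatMajor` and `requireClangFormat` as hypothetical names and the version strings as assumed examples of typical output.

```typescript
// Hypothetical version gate: extract the major version from the output of
// `clang-format --version`, e.g. "Ubuntu clang-format version 19.1.1 (...)".
function parseClangFormatMajor(versionOutput: string): number {
  const match = /clang-format version (\d+)\./.exec(versionOutput);
  if (!match) {
    throw new Error(`could not parse version from: ${versionOutput.trim()}`);
  }
  return Number(match[1]);
}

// Fail hard on any other major version instead of falling back to it.
function requireClangFormat(versionOutput: string, requiredMajor: number): void {
  const major = parseClangFormatMajor(versionOutput);
  if (major !== requiredMajor) {
    throw new Error(
      `clang-format ${requiredMajor} is required, found ${major}`,
    );
  }
}
```

Failing instead of falling back keeps formatting output identical between CI and local runs.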
## Test plan
- [x] Verified script finds and processes all C++ source and header
files
- [x] Tested formatting works correctly by adding formatting issues and
running the script
- [x] Confirmed script respects `.clang-format` configuration files
- [x] Script correctly requires specific clang-format version
🤖 Generated with [Claude Code](https://claude.ai/code)
---------
Co-authored-by: Claude <claude@anthropic.ai>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Splits up js_parser.zig into multiple files. Also changes visitExprInOut
to use function calls rather than a switch.
Not ready:
- [ ] P.zig is ~70,000 tokens, still needs to get smaller
- [x] ~~measure zig build time before & after (is it slower?)~~ no
significant impact
---------
Co-authored-by: pfgithub <6010774+pfgithub@users.noreply.github.com>
### What does this PR do?
Reduces the number of zombie build processes that can occur in CI or when
building locally.
### How did you verify your code works?
`add` no longer locks a mutex, and `finish` no longer locks a mutex
except for the last task. This could meaningfully improve performance in
cases where we spawn a large number of tasks on a thread pool. This
change doesn't alter the semantics of the type, unlike the standard
library's `WaitGroup`, which also uses atomics but has to be explicitly
reset.
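The counting scheme described above can be sketched with atomics in TypeScript. This is an illustration of the idea, not a port of the Zig type: `AtomicWaitGroup` is a hypothetical name, the waiting side is omitted, and `Atomics.notify` stands in for the mutex-protected wake-up that only the last `finish` performs.

```typescript
// Illustration only: lock-free add/finish on a shared counter, with the
// expensive wake-up step performed by the final finish() alone.
class AtomicWaitGroup {
  private readonly counter = new Int32Array(new SharedArrayBuffer(4));

  // Register n more tasks without taking any lock.
  add(n: number = 1): void {
    Atomics.add(this.counter, 0, n);
  }

  // Mark one task done. Returns true only for the task that brought the
  // count to zero, which is the one place a wake-up is needed.
  finish(): boolean {
    const previous = Atomics.sub(this.counter, 0, 1);
    if (previous <= 0) {
      throw new Error("finish() called more times than add()");
    }
    if (previous === 1) {
      Atomics.notify(this.counter, 0); // stand-in for lock + signal
      return true;
    }
    return false;
  }

  pending(): number {
    return Atomics.load(this.counter, 0);
  }
}
```

With many tasks in flight, every call except the last is a single atomic operation, which is where the expected performance gain comes from.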
(For internal tracking: fixes ENG-19722)
---------
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
### What does this PR do?
Lets you get useful information out of this script even if you set the
number of attempts too high and don't want to wait.
### How did you verify your code works?
local testing
(I cancelled CI because this script is not used anywhere in CI so it
wouldn't tell us anything useful)
### What does this PR do?
- for these kinds of aborts which we test in CI, introduce a feature
flag to suppress core dumps and crash reporting only from that abort,
and set the flag when running the test:
- libuv stub functions
- Node-API abort (used in particular when calling illegal functions
during finalizers)
- passing `process.kill` its own PID
- core dumps are suppressed with `setrlimit`, and crash reporting with
the new `suppress_reporting` field. these suppressions are only engaged
right before crashing, so we won't ignore new kinds of crashes that come
up in these tests.
- for the test bindings used to test the crash handler in
`run-crash-handler.test.ts`, disables core dumps but does not disable
crash reporting (because crashes get reported to a server that the test
is running to make sure they are reported)
- fixes a panic when printing source code around an error containing
`\n\r`
- updates the code where we clone vendor tests to checkout the right tag
- adds `vendor/elysia/test/path/plugin.test.ts` to
no-validate-exceptions
- this failure was exposed by starting to test the version of elysia we
have been intending to test. the crash trace suggests it may be fixed by
#21307.
- makes dumping core or uploading a crash report count as a failing test
  - this ensures we realize a crash has occurred even if it happened in a
    subprocess and the main test doesn't adequately check the exit code. To
    spawn a subprocess you expect to fail, prefer `expect(code).toBe(1)`
    over `expect(code).not.toBe(0)`. If you really expect multiple possible
    erroneous exit codes, you might try `expect(signal).toBeNull()` to still
    disallow crashes.
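The exit-code guidance can be made concrete with a small check. This sketch assumes the common Node-style convention in which a process killed by a signal reports `code: null` and a non-null `signal`; `ExitResult`, `looseCheck`, and `assertExpectedFailure` are hypothetical names for illustration.

```typescript
// Assumed shape of a finished subprocess: killed-by-signal processes have
// code === null and a signal name.
interface ExitResult {
  code: number | null;
  signal: string | null;
}

// Equivalent of expect(code).not.toBe(0): also passes when the child crashed.
function looseCheck(result: ExitResult): boolean {
  return result.code !== 0;
}

// Equivalent of expect(signal).toBeNull() plus expect(code).toBe(expected).
function assertExpectedFailure(result: ExitResult, expectedCode: number): void {
  if (result.signal !== null) {
    throw new Error(`subprocess was killed by ${result.signal}`);
  }
  if (result.code !== expectedCode) {
    throw new Error(`expected exit code ${expectedCode}, got ${result.code}`);
  }
}
```

For a child that aborted with `SIGABRT`, `looseCheck` returns `true` while `assertExpectedFailure` throws, which is the gap the stricter assertion closes.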
### How did you verify your code works?
Running affected tests on a Linux machine with core dumps set up and
checking no new ones appear.
https://buildkite.com/bun/bun/builds/21465 has no core dumps.
I haven't checked all uses of tryTakeException but this bug is probably
not the only one.
Caught by running fuzzy-wuzzy with debug logging enabled. It tried to
print the exception. Updates fuzzy-wuzzy to have improved logging that
can tell you what was last executed before a crash.
---------
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
### What does this PR do?
Closes #13012
On Linux, when any Bun process spawned by `runner.node.mjs` crashes, we
run GDB in batch mode to print a backtrace from the core file.
And on all platforms, we run a mini `bun.report` server which collects
crashes reported by any Bun process executed during the tests, and after
each test `runner.node.mjs` fetches and prints any new crashes from the
server.
<details>
<summary>example 1</summary>
```
#0 crash_handler.crash () at crash_handler.zig:1513
#1 0x0000000002cf4020 in crash_handler.crashHandler (reason=..., error_return_trace=0x0, begin_addr=...) at crash_handler.zig:479
#2 0x0000000002cefe25 in crash_handler.handleSegfaultPosix (sig=<optimized out>, info=<optimized out>) at crash_handler.zig:800
#3 0x00000000045a1124 in WTF::jscSignalHandler (sig=11, info=0x7ffe044e30b0, ucontext=0x0) at vendor/WebKit/Source/WTF/wtf/threads/Signals.cpp:548
#4 <signal handler called>
#5 JSC::JSCell::type (this=0x0) at vendor/WebKit/Source/JavaScriptCore/runtime/JSCellInlines.h:137
#6 JSC::JSObject::getOwnNonIndexPropertySlot (this=0x150bc914fe18, vm=..., structure=0x150a0102de50, propertyName=..., slot=...) at vendor/WebKit/Source/JavaScriptCore/runtime/JSObject.h:1348
#7 JSC::JSObject::getPropertySlot<false> (this=0x150bc914fe18, globalObject=0x150b864e0088, propertyName=..., slot=...) at vendor/WebKit/Source/JavaScriptCore/runtime/JSObject.h:1433
#8 JSC::JSValue::getPropertySlot (this=0x7ffe044e4880, globalObject=0x150b864e0088, propertyName=..., slot=...) at vendor/WebKit/Source/JavaScriptCore/runtime/JSCJSValueInlines.h:1108
#9 JSC::JSValue::get (this=0x7ffe044e4880, globalObject=0x150b864e0088, propertyName=..., slot=...) at vendor/WebKit/Source/JavaScriptCore/runtime/JSCJSValueInlines.h:1065
#10 JSC::LLInt::performLLIntGetByID (bytecodeIndex=..., codeBlock=0x150b861e7740, globalObject=0x150b864e0088, baseValue=..., ident=..., metadata=...) at vendor/WebKit/Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:878
#11 0x0000000004d7b055 in llint_slow_path_get_by_id (callFrame=0x7ffe044e4ab0, pc=0x150bc92ea0e7) at vendor/WebKit/Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:946
#12 0x0000000003dd6042 in llint_op_get_by_id ()
#13 0x0000000000000000 in ?? ()
```
</details>
<details>
<summary>example 2</summary>
```
#0 crash_handler.crash () at crash_handler.zig:1513
#1 0x0000000002c5db80 in crash_handler.crashHandler (reason=..., error_return_trace=0x0, begin_addr=...) at crash_handler.zig:479
#2 0x0000000002c59f60 in crash_handler.handleSegfaultPosix (sig=<optimized out>, info=<optimized out>) at crash_handler.zig:800
#3 0x00000000042ecc88 in WTF::jscSignalHandler (sig=11, info=0xfffff60141b0, ucontext=0xfffff6014230) at vendor/WebKit/Source/WTF/wtf/threads/Signals.cpp:548
#4 <signal handler called>
#5 bun.js.api.FFIObject.Reader.u8 (globalObject=0x4000554e0088) at /var/lib/buildkite-agent/builds/ip-172-31-75-92/bun/bun/src/bun.js/api/FFIObject.zig:65
#6 bun.js.jsc.host_fn.toJSHostCall__anon_1711576 (globalThis=0x4000554e0088, args=...) at /var/lib/buildkite-agent/builds/ip-172-31-75-92/bun/bun/src/bun.js/jsc/host_fn.zig:97
#7 bun.js.jsc.host_fn.DOMCall("Reader"[0..6],bun.js.api.FFIObject.Reader,"u8"[0..2],.{ .reads = .{ ... }, .writes = .{ ... } }).slowpath (globalObject=0x4000554e0088, thisValue=70370172175040, arguments_ptr=0xfffff6015460, arguments_len=1) at /var/lib/buildkite-agent/builds/ip-172-31-75-92/bun/bun/src/bun.js/jsc/host_fn.zig:490
#8 0x000040003419003c in ?? ()
#9 0x0000400055173440 in ?? ()
```
</details>
I used GDB instead of LLDB (as the branch name suggests) because it
seems to produce more useful stack traces with musl libc.
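For reference, a batch-mode backtrace like the examples above can be produced by invoking GDB with `-batch` and an `-ex bt` command, passing the executable and the core file as positional arguments. The sketch below only builds the argument list; `gdbBacktraceArgs` is a hypothetical name, the paths in the usage note are placeholders, and the exact commands `runner.node.mjs` passes may differ.

```typescript
// Hypothetical helper: argv for `gdb` that prints a backtrace from a core
// file and exits without entering the interactive prompt.
function gdbBacktraceArgs(executable: string, coreFile: string): string[] {
  return [
    "-batch", // run the given commands, then exit
    "-ex", "bt", // print a backtrace
    executable,
    coreFile,
  ];
}
```

For example, `gdbBacktraceArgs("./bun", "./core.1234")` yields the arguments for `gdb -batch -ex bt ./bun ./core.1234`.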
- [x] on linux, use gdb to print a backtrace from the core dump when the
main bun process crashes
- [x] on linux, use gdb to print from all new core dumps (so including
bun subprocesses spawned by the test that crashed)
- [x] on all platforms, use a mini bun.report server to print a
self-reported trace (depends on oven-sh/bun.report#15; for now our
package.json points to a commit on the branch of that repo)
- [x] fix trying to fetch stack traces too early on windows
- [x] use output groups so the traces show up alongside the log for the
specific test instead of having to find it in the logs from the entire
run
- [x] get oven-sh/bun.report#15 merged, and point to a bun.report commit
on the main branch instead of the PR branch in package.json
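The "print any new crashes after each test" behaviour comes down to remembering how many reports have already been shown. A minimal in-memory sketch of that bookkeeping is below; `CrashCollector` and its methods are hypothetical and do not reflect the actual mini `bun.report` server or its HTTP interface.

```typescript
// Hypothetical bookkeeping for crash reports received during a test run.
class CrashCollector {
  private readonly reports: string[] = [];
  private cursor = 0; // index of the first report not yet handed out

  // Called when a Bun process reports a crash.
  record(trace: string): void {
    this.reports.push(trace);
  }

  // Called after each test: return only reports that arrived since the
  // previous call, so each crash is printed next to the test that caused it.
  takeNew(): string[] {
    const fresh = this.reports.slice(this.cursor);
    this.cursor = this.reports.length;
    return fresh;
  }
}
```

Calling `takeNew()` once per test keeps each trace associated with the test that was running when it arrived.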
### How did you verify your code works?
Manually, and in CI with a crashing test.
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>