This PR significantly improves `Bun.stringWidth` to handle a wider
variety of Unicode characters and escape sequences correctly.
## Zero-width character handling
Added support for many previously unhandled zero-width characters:
- Soft hyphen (U+00AD)
- Word joiner and invisible operators (U+2060-U+2064)
- Lone surrogates (U+D800-U+DFFF)
- Arabic formatting characters (U+0600-U+0605, U+06DD, U+070F, U+08E2)
- Indic script combining marks (Devanagari through Malayalam)
- Thai and Lao combining marks
- Combining Diacritical Marks Extended and Supplement
- Tag characters (U+E0000-U+E007F)
## ANSI escape sequence handling
### CSI sequences
- Now properly handles ALL CSI final bytes (0x40-0x7E), not just `m`
- This means cursor movement (A/B/C/D), erase (J/K), scroll (S/T), and
other CSI commands are now correctly excluded from width calculation
### OSC sequences
- Added support for OSC sequences (ESC ] ... BEL/ST)
- OSC 8 hyperlinks are now properly handled
- Supports both BEL (0x07) and ST (ESC \) terminators
### ESC ESC fix
- Fixed state machine bug where `ESC ESC` would incorrectly reset state
- Now correctly handles consecutive ESC characters
## Emoji handling
Added proper grapheme-aware emoji width calculation:
- Flag emoji (regional indicator pairs) → width 2
- Skin tone modifiers → width 2
- ZWJ sequences (family, professions, etc.) → width 2
- Keycap sequences → width 2
- Variation selectors (VS15 for text, VS16 for emoji presentation)
- Uses ICU's `UCHAR_EMOJI` property for accurate emoji detection
## Test coverage
Added comprehensive test suite with **94 tests** covering:
- All zero-width character categories
- All CSI final bytes
- OSC sequences with various terminators
- Emoji edge cases (flags, skin tones, ZWJ, keycaps, variation
selectors)
- East Asian width (CJK, fullwidth, halfwidth katakana)
- Indic and Thai script combining marks
- Fuzzer-like stress tests for robustness
## Breaking changes
This is a behavior change - `stringWidth` will return different values
for some inputs. However, the new values are more accurate
representations of terminal display width:
| Input | Old | New | Why |
|-------|-----|-----|-----|
| Flag emoji 🇺🇸 | 1 | 2 | Flags display as 2 cells |
| Skin tone 👋🏽 | 4 | 2 | Emoji + modifier = 1 grapheme |
| ZWJ family 👨👩👧 | 8 | 2 | ZWJ sequence = 1 grapheme |
| Word joiner U+2060 | 1 | 0 | Invisible character |
| OSC 8 hyperlinks | counted URL | just visible text | URLs are
invisible |
| Cursor movement ESC[5A | counted | 0 | Control sequence |
🤖 Generated with [Claude Code](https://claude.ai/code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude Bot <claude-bot@bun.sh>
`bun.String.toOwnedSliceReturningAllASCII` is supposed to return a
boolean indicating whether or not the string is entirely composed of
ASCII characters. However, the current implementation frequently
produces incorrect results:
* If the string is a `ZigString`, it always returns true, even though
`ZigString`s can be UTF-16 or Latin-1.
* If the string is a `StaticZigString`, it always returns false, even
though `StaticZigStrings` can be all ASCII.
* If the string is a 16-bit `WTFStringImpl`, it always returns false,
even though 16-bit `WTFString`s can be all ASCII.
* If the string is empty, it always returns false, even though empty
strings are valid ASCII strings.
`toOwnedSliceReturningAllASCII` is currently used in two places, both of
which assume its answer is accurate:
* `bun.webcore.Blob.fromJSWithoutDeferGC`
* `bun.api.ServerConfig.fromJS`
(For internal tracking: fixes ENG-21249)
### What does this PR do?
Adds support for `publicHoistPattern` in `bunfig.toml` and
`public-hoist-pattern` from `.npmrc`. This setting allows you to select
transitive packages to hoist to the root node_modules making them
available for all workspace packages.
```toml
[install]
# can be a string
publicHoistPattern = "@types*"
# or an array
publicHoistPattern = [ "@types*", "*eslint*" ]
```
`publicHoistPattern` only affects the isolated linker.
---
Adds `hoistPattern`. `hoistPattern` is the same as `publicHoistPattern`,
but applies to the `node_modules/.bun/node_modules` directory instead of
the root node_modules. Also the default value of `hoistPattern` is `*`
(everything is hoisted to `node_modules/.bun/node_modules` by default).
---
Fixes a determinism issue constructing the
`node_modules/.bun/node_modules` directory.
---
closes#23481closes#6160closes#23548
### How did you verify your code works?
Added tests for
- [x] only include patterns
- [x] only exclude patterns
- [x] mix of include and exclude
- [x] errors for unexpected expression types
- [x] excluding direct dependency (should still include)
- [x] match all with `*`
- [x] string and array expression types
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
### What does this PR do?
Fixes `bun -p "process.stderr.write('Hello' +
String.fromCharCode(0xd800))"`.
Also fixes potential index out of bounds if there are many invalid
sequences.
This also affects `TextEncoder`.
### How did you verify your code works?
Added tests for edgecases
---------
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
Add a new generator for JS → Zig bindings. The bulk of the conversion is
done in C++, after which the data is transformed into an FFI-safe
representation, passed to Zig, and then finally transformed into
idiomatic Zig types.
In its current form, the new bindings generator supports:
* Signed and unsigned integers
* Floats (plus a “finite” variant that disallows NaN and infinities)
* Strings
* ArrayBuffer (accepts ArrayBuffer, TypedArray, or DataView)
* Blob
* Optional types
* Nullable types (allows null, whereas Optional only allows undefined)
* Arrays
* User-defined string enumerations
* User-defined unions (fields can optionally be named to provide a
better experience in Zig)
* Null and undefined, for use in unions (can more efficiently represent
optional/nullable unions than wrapping a union in an optional)
* User-defined dictionaries (arbitrary key-value pairs; expects a JS
object and parses it into a struct)
* Default values for dictionary members
* Alternative names for dictionary members (e.g., to support both
`serverName` and `servername` without taking up twice the space)
* Descriptive error messages
* Automatic `fromJS` functions in Zig for dictionaries
* Automatic `deinit` functions for the generated Zig types
Although this bindings generator has many features not present in
`bindgen.ts`, it does not yet implement all of `bindgen.ts`'s
functionality, so for the time being, it has been named `bindgenv2`, and
its configuration is specified in `.bindv2.ts` files. Once all
`bindgen.ts`'s functionality has been incorporated, it will be renamed.
This PR ports `SSLConfig` to use the new bindings generator; see
`SSLConfig.bindv2.ts`.
(For internal tracking: fixes STAB-1319, STAB-1322, STAB-1323,
STAB-1324)
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Alistair Smith <hi@alistair.sh>
### What does this PR do?
Fixes a crash related to the dev server overwriting the uws user context
pointer when setting abort callback.
Adds support for `return new Response(<jsx />, { ... })` and `return
Response.render(...)` and `return Response.redirect(...)`:
- Created a `SSRResponse` class to handle this (see
`JSBakeResponse.{h,cpp}`)
- `SSRResponse` is designed to "fake" being a React component
- This is done in JSBakeResponse::create inside of
src/bun.js/bindings/JSBakeResponse.cpp
- And `src/js/builtins/BakeSSRResponse.ts` defines a `wrapComponent`
function which wraps
the passed in component (when doing `new Response(<jsx />, ...)`). It
does
this to throw an error (in redirect()/render() case) or return the
component.
- Created a `BakeAdditionsToGlobal` struct which contains some
properties
needed for this
- Added some of the properties we need to fake to BunBuiltinNames.h
(e.g.
`$$typeof`), the rationale behind this is that we couldn't use
`structure->addPropertyTransition` because JSBakeResponse is not a final
JSObject.
- When bake and server-side, bundler rewrites `Response ->
Bun.SSRResponse` (see `src/ast/P.zig` and `src/ast/visitExpr.zig`)
- Created a new WebCore body variant (`Render: struct { path: []const u8
}`)
- Created when `return Response.render(...)`
- When handled, it re-invokes dev server to render the new path
Enables server-side sourcemaps for the dev server:
- New source providers for server-side:
(`DevServerSourceProvider.{h,cpp}`)
- IncrementalGraph and SourceMapStore are updated to support this
There are numerous other stuff:
- allow `app` configuration from Bun.serve(...)
- fix errors stopping dev server
- fix use after free related to in
RequestContext.finishRunningErrorHandler
- Request.cookies
- Make `"use client";` components work
- Fix some bugs using `require(...)` in dev server
- Fix catch-all routes not working in the dev server
- Updates `findSourceMappingURL(...)` to use `std.mem.lastIndexOf(...)`
because
the sourcemap that should be used is the last one anyway
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Alistair Smith <hi@alistair.sh>
Add `bun.ptr.ExternalShared`, a shared pointer whose reference count is
managed externally; e.g., by extern functions. This can be used to work
with `RefCounted` C++ objects in Zig. For example:
```cpp
// C++:
struct MyType : RefCounted<MyType> { ... };
extern "C" void MyType__ref(MyType* self) { self->ref(); }
extern "C" void MyType__ref(MyType* self) { self->deref(); }
```
```zig
// Zig:
const MyType = opaque {
extern fn MyType__ref(self: *MyType) void;
extern fn MyType__deref(self: *MyType) void;
pub const Ref = bun.ptr.ExternalShared(MyType);
// This enables `ExternalShared` to work.
pub const external_shared_descriptor = struct {
pub const ref = MyType__ref;
pub const deref = MyType__deref;
};
};
// Now `MyType.Ref` behaves just like `Ref<MyType>` in C++:
var some_ref: MyType.Ref = someFunctionReturningMyTypeRef();
const ptr: *MyType = some_ref.get(); // gets the inner pointer
var some_other_ref = some_ref.clone(); // increments the ref count
some_ref.deinit(); // decrements the ref count
// decrements the ref count again; if no other refs exist, the object
// is destroyed
some_other_ref.deinit();
```
This commit also adds `RawRefCount`, a simple wrapper around an integer
reference count that can be used to implement the interface required by
`ExternalShared`. Generally, for reference-counted Zig types,
`bun.ptr.Shared` is preferred, but occasionally it is useful to have an
“intrusive” reference-counted type where the ref count is stored in the
type itself. For this purpose, `ExternalShared` + `RawRefCount` is more
flexible and less error-prone than the deprecated `bun.ptr.RefCounted`
type.
(For internal tracking: fixes STAB-1287, STAB-1288)
### What does this PR do?
**This PR is created because [the previous PR I
opened](https://github.com/oven-sh/bun/pull/21728) had some concerning
issues.** Thanks @Jarred-Sumner for the help.
The goal of this PR is to introduce PUB/SUB functionality to the
built-in Redis client. Based on the fact that the current Redis API does
not appear to have compatibility with `io-redis` or `redis-node`, I've
decided to do away with existing APIs and API compatibility with these
existing libraries.
I have decided to base my implementation on the [`redis-node` pub/sub
API](https://github.com/redis/node-redis/blob/master/docs/pub-sub.md).
#### Random Things That Happened
- [x] Refactored the build scripts so that `valgrind` can be disabled.
- [x] Added a `numeric` namespace in `harness.ts` with useful
mathematical libraries.
- [x] Added a mechanism in `cppbind.ts` to disable static assertions
(specifically to allow `check_slow` even when returning a `JSValue`).
Implemented via `// NOLINT[NEXTLINE]?\(.*\)` macros.
- [x] Fixed inconsistencies in error handling of `JSMap`.
### How did you verify your code works?
I've written a set of unit tests to hopefully catch the major use-cases
of this feature. They all appear to pass.
#### Future Improvements
I would have a lot more confidence in our Redis implementation if we
tested it with a test suite running over a network which emulates a high
network failure rate. There are large amounts of edge cases that are
worthwhile to grab, but I think we can roll that out in a future PR.
### Future Tasks
- [ ] Tests over flaky network
- [ ] Use the custom private members over `_<member>`.
---------
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
### What does this PR do?
This assertion is occasionally incorrect, and was originally added as a
workaround for lack of proper error handling in zig's std library. We've
seen fixed that so this assertion is no longer needed.
### How did you verify your code works?
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
### What does this PR do?
The goal of this PR is to introduce PUB/SUB functionality to the
built-in Redis client. Based on the fact that the current Redis API does
not appear to have compatibility with `io-redis` or `redis-node`, I've
decided to do away with existing APIs and API compatibility with these
existing libraries.
I have decided to base my implementation on the [`redis-node` pub/sub
API](https://github.com/redis/node-redis/blob/master/docs/pub-sub.md).
### How did you verify your code works?
I've written a set of unit tests to hopefully catch the major use-cases
of this feature. They all appear to pass:
<img width="368" height="71" alt="image"
src="https://github.com/user-attachments/assets/36527386-c8fe-47f6-b69a-a11d4b614fa0"
/>
#### Future Improvements
I would have a lot more confidence in our Redis implementation if we
tested it with a test suite running over a network which emulates a high
network failure rate. There are large amounts of edge cases that are
worthwhile to grab, but I think we can roll that out in a future PR.
### Future Tasks
- [ ] Tests over flaky network
- [ ] Use the custom private members over `_<member>`.
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com>
Co-authored-by: Alistair Smith <hi@alistair.sh>
* Define a generic allocator interface to enable static polymorphism for
allocators (see `GenericAllocator` in `src/allocators.zig`). Note that
`std.mem.Allocator` itself is considered a generic allocator.
* Add utilities to `bun.allocators` for working with generic allocators.
* Add a new namespace, `bun.memory`, with basic utilities for working
with memory and objects (`create`, `destroy`, `initDefault`, `deinit`).
* Add `bun.DefaultAllocator`, a zero-sized generic allocator type whose
`allocator` method simply returns `bun.default_allocator`.
* Implement the generic allocator interface in `AllocationScope` and
`MimallocArena`.
* Improve `bun.threading.GuardedValue` (now `bun.threading.Guarded`).
* Improve `bun.safety.AllocPtr` (now `bun.safety.CheckedAllocator`).
(For internal tracking: fixes STAB-1085, STAB-1086, STAB-1087,
STAB-1088, STAB-1089, STAB-1090, STAB-1091)
### What does this PR do?
- Instead of storing `len` in `BoundedArray` as a `usize`, store it as
either a `u8` or ` u16` depending on the `buffer_capacity`
- Copy-paste `BoundedArray` from the standard library into Bun's
codebase as it was removed in
https://github.com/ziglang/zig/pull/24699/files#diff-cbd8cbbc17583cb9ea5cc0f711ce0ad447b446e62ea5ddbe29274696dce89e4f
and we will probably continue using it
### How did you verify your code works?
Ran `bun run zig:check`
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: taylor.fish <contact@taylor.fish>
in JS, `new TextDecoder("latin1").decode(...)` uses cp1252. In python,
latin1 is half-width utf-16. In our code, latin1 typically refers to
half-width utf-16 because JavaScriptCore uses that for most strings, but
sometimes it refers to cp1252. Rename the cp1252 functions to be called
cp1252
Also fixes an issue where Buffer.from with utf-16le would sometimes
output the wrong value:
```js
$> bun -p "Buffer.from('\x80', 'utf-16le')"
<Buffer ac 20>
$> node -p "Buffer.from('\x80', 'utf-16le')"
<Buffer 80 00>
$> bun-debug -p "Buffer.from('\x80', 'utf-16le')"
<Buffer 80 00>
```
Replace `catch bun.outOfMemory()`, which can accidentally catch
non-OOM-related errors, with either `bun.handleOom` or a manual `catch
|err| switch (err)`.
(For internal tracking: fixes STAB-1070)
---------
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com>
### What does this PR do?
This PR adds builtin YAML parsing with `Bun.YAML.parse`
```js
import { YAML } from "bun";
const items = YAML.parse("- item1");
console.log(items); // [ "item1" ]
```
Also YAML imports work just like JSON and TOML imports
```js
import pkg from "./package.yaml"
console.log({ pkg }); // { pkg: { name: "pkg", version: "1.1.1" } }
```
### How did you verify your code works?
Added some tests for YAML imports and parsed values.
---------
Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
* `IncrementalGraph(.client).File` packs its fields in a specific way to
save space, but it makes the struct hard to use and error-prone (e.g.,
untagged unions with tags stored in a separate `flags` struct). This PR
changes `File` to have a human-readable layout, but adds methods to
convert it to and from `File.Packed`, a packed version with the same
space efficiency as before.
* Reduce the need to pass the dev allocator to functions (e.g.,
`deinit`) by storing it as a struct field via the new `DevAllocator`
type. This type has no overhead in release builds, or when
`AllocationScope` is disabled.
* Use owned pointers in `PackedMap`.
* Use `bun.ptr.Shared` for `PackedMap` instead of the old
`bun.ptr.RefPtr`.
* Add `bun.ptr.ScopedOwned`, which is like `bun.ptr.Owned`, but can
store an `AllocationScope`. No overhead in release builds or when
`AllocationScope` is disabled.
* Reduce redundant allocators in `BundleV2`.
* Add owned pointer conversions to `MutableString`.
* Make `AllocationScope` behave like a pointer, so it can be moved
without invalidating allocations. This eliminates the need for
self-references.
* Change memory cost algorithm so it doesn't rely on “dedupe bits”.
These bits used to take advantage of padding but there is now no padding
in `PackedMap`.
* Replace `VoidFieldTypes` with `useAllFields`; this eliminates the need
for `voidFieldTypesDiscardHelper`.
(For internal tracking: fixes STAB-1035, STAB-1036, STAB-1037,
STAB-1038, STAB-1039, STAB-1040, STAB-1041, STAB-1042, STAB-1043,
STAB-1044, STAB-1045)
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
### What does this PR do?
- for these kinds of aborts which we test in CI, introduce a feature
flag to suppress core dumps and crash reporting only from that abort,
and set the flag when running the test:
- libuv stub functions
- Node-API abort (used in particular when calling illegal functions
during finalizers)
- passing `process.kill` its own PID
- core dumps are suppressed with `setrlimit`, and crash reporting with
the new `suppress_reporting` field. these suppressions are only engaged
right before crashing, so we won't ignore new kinds of crashes that come
up in these tests.
- for the test bindings used to test the crash handler in
`run-crash-handler.test.ts`, disables core dumps but does not disable
crash reporting (because crashes get reported to a server that the test
is running to make sure they are reported)
- fixes a panic when printing source code around an error containing
`\n\r`
- updates the code where we clone vendor tests to checkout the right tag
- adds `vendor/elysia/test/path/plugin.test.ts` to
no-validate-exceptions
- this failure was exposed by starting to test the version of elysia we
have been intending to test. the crash trace suggests it may be fixed by
#21307.
- makes dumping core or uploading a crash report count as a failing test
- this ensures we don't realize a crash has occurred if it happened in a
subprocess and the main test doesn't adequately check the exit code. to
spawn a subprocess you expect to fail, prefer `expect(code).toBe(1)`
over `expect(code).not.toBe(0)`. if you really expect multiple possible
erroneous exit codes, you might try `expect(signal).toBeNull()` to still
disallow crashes.
### How did you verify your code works?
Running affected tests on a Linux machine with core dumps set up and
checking no new ones appear.
https://buildkite.com/bun/bun/builds/21465 has no core dumps.
### What does this PR do?
Lets us write and run unit tests directly in Zig.
Running Zig unit tests in CI is blocked by https://github.com/ziglang/zig/issues/23281. We can un-comment relevant code once this is fixed.
#### Workflow
> I'll finish writing this up later, but some initial points are below.
> Tl;Dr: `bun build:test`
Test binaries can be made for any kind of build. They are called `<bun>-test` and live next to their corresponding `bun` bin. For example, debug tests compile to `build/debug/bun-debug-test`.
Test binaries re-use most cmake/zig build steps from normal bun binaries, so building one after a normal bun build is pretty fast.
### How did you verify your code works?
I tested that my tests run tests.