Commit Graph

4 Commits

Author SHA1 Message Date
SUZUKI Sosuke
fa78d2b408 perf(structuredClone): add fast path for arrays containing simple objects (#26818)
## Summary

Adds a **DenseArray fast path** for `structuredClone` / `postMessage`
that completely skips byte-buffer serialization when an
`ArrayWithContiguous` array contains **flat objects** (whose property
values are only primitives or strings).

This builds on #26814 which added fast paths for Int32/Double/Contiguous
arrays of primitives and strings. The main remaining slow case was
**arrays of objects** — the most common real-world pattern (e.g.
`[{name: "Alice", age: 30}, {name: "Bob", age: 25}]`). Previously, these
fell back to the full serialization path because the Contiguous fast
path rejected non-string cell elements.

## How it works

### Serialization
The existing Contiguous array handler is extended to recognize object
elements that pass `isObjectFastPathCandidate` (FinalObject, no
getters/setters, no indexed properties, all enumerable). For qualifying
objects, properties are collected into a `SimpleCloneableObject` struct
(reusing the existing `SimpleInMemoryPropertyTableEntry` type). The
result is stored as a `FixedVector<DenseArrayElement>` where
`DenseArrayElement = std::variant<JSValue, String,
SimpleCloneableObject>`.

If no object elements are found, the existing `SimpleArray` path is used
(no regression).

### Deserialization
A **Structure cache** avoids repeated Structure transitions when the
array contains many same-shape objects (the common case). The first
object is built via `constructEmptyObject` + `putDirect`, and its final
Structure + Identifiers are cached. Subsequent objects with matching
property names are created directly with `JSFinalObject::create(vm,
cachedStructure)`, skipping all transitions.

Safety guards:
- Cache is only used when property count AND all property names match
- Cache is disabled when `outOfLineCapacity() > 0` (properties exceed
`maxInlineCapacity`), since `JSFinalObject::create` cannot allocate a
butterfly

### Fallback conditions

| Condition | Behavior |
|-----------|----------|
| Elements are only primitives/strings | SimpleArray (existing) |
| Elements include `isObjectFastPathCandidate` objects | **DenseArray
(NEW)** |
| Object property value is an object/array | Fallback to normal path |
| Elements include Date, RegExp, Map, Set, ArrayBuffer, etc. | Fallback
to normal path |
| Array has holes | Fallback to normal path |

## Benchmarks

Apple M4 Max, release build vs system Bun v1.3.8 and Node.js v24.12:

| Benchmark | Node.js v24.12 | Bun v1.3.8 | **This PR** | vs Bun | vs
Node |

|-----------|---------------|------------|-------------|--------|---------|
| `[10 objects]` | 2.83 µs | 2.72 µs | **1.56 µs** | **1.7x** | **1.8x**
|
| `[100 objects]` | 24.51 µs | 25.98 µs | **14.11 µs** | **1.8x** |
**1.7x** |

## Test coverage

28 new edge-case tests covering:
- **Property value variants**: empty objects, special numbers (NaN,
Infinity, -0), null/undefined values, empty string keys, boolean-only
values, numeric string keys
- **Structure cache correctness**: alternating shapes, objects
interleaved with primitives, >maxInlineCapacity properties (100+), 1000
same-shape objects (stress test), repeated clone independence
- **Fallback correctness**: array property values, nested objects,
Date/RegExp/Map/Set/ArrayBuffer elements, getters, non-enumerable
properties, `Object.create(null)`, class instances
- **Frozen/sealed**: clones are mutable regardless of source
- **postMessage via MessageChannel**: mixed arrays with objects, empty
object arrays

## Changed files

- `src/bun.js/bindings/webcore/SerializedScriptValue.h` —
`SimpleCloneableObject`, `DenseArrayElement`, `FastPath::DenseArray`,
factory/constructor/member
- `src/bun.js/bindings/webcore/SerializedScriptValue.cpp` — serialize,
deserialize, `computeMemoryCost`
- `test/js/web/structured-clone-fastpath.test.ts` — 28 new tests
- `bench/snippets/structuredClone.mjs` — object array benchmarks

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-02-09 21:54:41 -08:00
SUZUKI Sosuke
0f43ea9bec perf(structuredClone): add fast path for root-level dense arrays (#26814)
## Summary

Add a fast path for `structuredClone` and `postMessage` when the root
value is a dense array of primitives or strings. This bypasses the full
`CloneSerializer`/`CloneDeserializer` machinery by keeping data in
native C++ structures instead of serializing to a byte stream.

**Important:** This optimization only applies when the root value passed
to `structuredClone()` / `postMessage()` is an array. Nested arrays
within objects still go through the normal serialization path.

## Implementation

Three tiers of array fast paths, checked in order:

| Tier | Indexing Type | Strategy | Applies When |
|------|--------------|----------|--------------|
| **Tier 1** | `ArrayWithInt32` | `memcpy` butterfly data | Dense int32
array, no holes, no named properties |
| **Tier 2** | `ArrayWithDouble` | `memcpy` butterfly data | Dense
double array, no holes, no named properties |
| **Tier 3** | `ArrayWithContiguous` | Copy elements into
`FixedVector<variant<JSValue, String>>` | Dense array of
primitives/strings, no holes, no named properties |

All tiers fall through to the normal serialization path when:
- The array has holes that must forward to the prototype
- The array has named properties (e.g., `arr.foo = "bar"`) — checked via
`structure->maxOffset() != invalidOffset`
- Elements contain non-primitive, non-string values (objects, arrays,
etc.)
- The context requires wire-format serialization (storage, cross-process
transfer)

### Deserialization

- **Tier 1/2:** Allocate a new `Butterfly` via `vm.auxiliarySpace()`,
`memcpy` data back, create array with `JSArray::createWithButterfly()`.
Falls back to normal deserialization if `isHavingABadTime` (forced
ArrayStorage mode).
- **Tier 3:** Pre-convert elements to `JSValue` (including `jsString()`
allocation), then use `JSArray::tryCreateUninitializedRestricted()` +
`initializeIndex()`.

## Benchmarks

Apple M4 Max, comparing system Bun 1.3.8 vs this branch (release build):

| Benchmark | Before | After | Speedup |
|-----------|--------|-------|---------|
| `structuredClone([10 numbers])` | 308.71 ns | 40.38 ns | **7.6x** |
| `structuredClone([100 numbers])` | 1.62 µs | 86.87 ns | **18.7x** |
| `structuredClone([1000 numbers])` | 13.79 µs | 544.56 ns | **25.3x** |
| `structuredClone([10 strings])` | 642.38 ns | 307.38 ns | **2.1x** |
| `structuredClone([100 strings])` | 5.67 µs | 2.57 µs | **2.2x** |
| `structuredClone([10 mixed])` | 446.32 ns | 198.35 ns | **2.3x** |
| `structuredClone(nested array)` | 1.84 µs | 1.79 µs | 1.0x (not
eligible) |
| `structuredClone({a: 123})` | 95.98 ns | 100.07 ns | 1.0x (no
regression) |

Int32 arrays see the largest gains (up to 25x) since they use a direct
`memcpy` of butterfly memory. String/mixed arrays see ~2x improvement.
No performance regression on non-eligible inputs.

## Bug Fix

Also fixes a correctness bug where arrays with named properties (e.g.,
`arr.foo = "bar"`) would lose those properties when going through the
array fast path. Added a `structure->maxOffset() != invalidOffset` guard
to fall back to normal serialization for such arrays.

Fixed a minor double-counting issue in `computeMemoryCost` where
`JSValue` elements in `SimpleArray` were counted both by `byteSize()`
and individually.

## Test Plan

38 tests in `test/js/web/structured-clone-fastpath.test.ts` covering:

- Basic array types: empty, numbers, strings, mixed primitives, special
numbers (`-0`, `NaN`, `Infinity`)
- Large arrays (10,000 elements)
- Tier 2: double arrays, Int32→Double transition
- Deep clone independence verification
- Named properties on Int32, Double, and Contiguous arrays
- `postMessage` via `MessageChannel` for Int32, Double, and mixed arrays
- Edge cases: frozen/sealed arrays, deleted elements (holes), `length`
extension, single-element arrays
- Prototype modification (custom prototype, indexed prototype properties
with holes)
- `Array` subclass identity loss (per spec)
- `undefined`-only and `null`-only arrays
- Multiple independent clones from the same source

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-02-08 21:36:59 -08:00
Jarred Sumner
ad1fa514ed Add fast path for simple objects in postMessage and structuredClone (#22279)
## Summary
- Extends the existing string fast path to support simple objects with
primitive values
- Achieves 2-241x performance improvements for postMessage with objects
- Maintains compatibility with existing code while significantly
reducing overhead

## Performance Results

### Bun (this PR)
```
postMessage({ prop: 11 chars string, ...9 more props }) - 648ns (was 1.36µs) 
postMessage({ prop: 14 KB string, ...9 more props })    - 719ns (was 2.09µs)
postMessage({ prop: 3 MB string, ...9 more props })      - 1.26µs (was 168µs)
```

### Node.js v24.6.0 (for comparison)
```
postMessage({ prop: 11 chars string, ...9 more props }) - 1.19µs
postMessage({ prop: 14 KB string, ...9 more props })    - 2.69µs  
postMessage({ prop: 3 MB string, ...9 more props })      - 304µs
```

## Implementation Details

The fast path activates when:
- Object is a plain object (ObjectType or FinalObjectType)
- Has no indexed properties
- All property values are primitives or strings
- No transfer list is involved

Properties are stored in a `SimpleInMemoryPropertyTableEntry` vector
that holds property names and values directly, avoiding the overhead of
full serialization.

## Test plan
- [x] Added tests for memory usage with simple objects
- [x] Added test for objects exceeding JSFinalObject::maxInlineCapacity
- [x] Created benchmark to verify performance improvements
- [x] Existing structured clone tests continue to pass

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-01 01:48:28 -07:00
robobun
e7672b2d04 Add string fast path for postMessage and structuredClone (#21926)
## Summary

Implements a string fast path optimization for `postMessage` and
`structuredClone` operations that provides significant performance
improvements for string-only data transfer, along with various bug fixes
and infrastructure improvements.

## Key Performance Improvements

**postMessage with Workers:**
- **Small strings (11 chars):** ~5% faster (572ns vs 599ns)
- **Medium strings (14KB):** **~2.7x faster** (528ns vs 1.40μs) 
- **Large strings (3MB):** **~660x faster** (540ns vs 356μs)

**Compared to Node.js postMessage:**
- Similar performance for small strings
- Competitive for medium strings  
- **~455x faster** for large strings (540ns vs 245μs)

## Implementation Details

The optimization adds a **string fast path** that bypasses full
structured cloning serialization when:
- Input is a pure string (`value.isString()`)
- No transfer list or message ports are involved
- Not being stored persistently

### Core Changes

**String Thread-Safety Utilities (`BunString.cpp/h`):**
- `isCrossThreadShareable()` - Checks if string can be safely shared
across threads
- `toCrossThreadShareable()` - Converts strings to thread-safe form via
`isolatedCopy()`
- Handles edge cases: atoms, symbols, substring slices, external buffers

**Serialization Fast Path (`SerializedScriptValue.cpp`):**
- New `m_fastPathString` field stores string data directly
- Bypasses full object serialization machinery for pure strings
- Creates isolated copies for cross-thread safety

**Deserialization Fast Path:**
- Directly returns JSString from stored string data
- Avoids parsing serialized byte streams

**Updated Flags System (`JSValue.zig`, `Serialization.cpp`):**
- Replaces boolean `forTransfer` with structured `SerializedFlags`
- Supports `forCrossProcessTransfer` and `forStorage` distinctions

**Structured Clone Infrastructure:**
- Moved `structuredClone` implementation to dedicated
`StructuredClone.cpp`
- Added `jsFunctionStructuredCloneAdvanced` for testing with custom
flags
- Improved class serialization compatibility checks (`isForTransfer`,
`isForStorage`)

**IPC Improvements (`ipc.zig`):**
- Fixed race conditions in `SendQueue` by deferring cleanup to next tick
- Proper fd ownership handling with `bun.take()`
- Cached IPC serialize/parse functions for better performance

**BlockList Thread Safety Fixes (`BlockList.zig`):**
- Fixed potential deadlocks by moving mutex locking inside methods
- Added atomic `estimated_size` counter to avoid lock during GC
- Corrected pointer handling in comparison functions
- Improved GC safety in `rules()` method

## Benchmark Results

```
❯ bun-21926 bench/string-postmessage.mjs  # This branch
postMessage(11 chars string)  572.24 ns/iter
postMessage(14 KB string)     527.55 ns/iter  ← ~2.7x faster
postMessage(3 MB string)      539.70 ns/iter  ← ~660x faster

❯ bun-1.2.20 bench/string-postmessage.mjs  # Previous
postMessage(11 chars string)  598.76 ns/iter
postMessage(14 KB string)       1.40 µs/iter
postMessage(3 MB string)      356.38 µs/iter

❯ node bench/string-postmessage.mjs       # Node.js comparison  
postMessage(11 chars string)  569.63 ns/iter
postMessage(14 KB string)       1.46 µs/iter
postMessage(3 MB string)      245.46 µs/iter
```

**Key insight:** The fast path achieves **constant time performance**
regardless of string size (~540ns), while traditional serialization
scales linearly with data size.

## Test Coverage

**New Tests:**
- `test/js/web/structured-clone-fastpath.test.ts` - Fast path memory
usage validation
- `test/js/web/workers/structuredClone-classes.test.ts` - Comprehensive
class serialization tests
  - Tests ArrayBuffer transferability 
  - Tests BunFile cloning with storage/transfer restrictions
  - Tests net.BlockList cloning behavior
  - Validates different serialization contexts (default, worker, window)

**Enhanced Tests:**
- `test/js/web/workers/structured-clone.test.ts` - Multi-function
testing
- Tests `structuredClone`, `jscSerializeRoundtrip`, and cross-process
serialization
  - Validates consistency across different serialization paths
- `test/js/node/cluster.test.ts` - Better error handling and debugging

**Benchmarks:**
- `bench/string-postmessage.mjs` - Worker postMessage performance
comparison
- `bench/string-fastpath.mjs` - Fast path vs traditional serialization
comparison

## Bug Fixes

**BlockList Threading Issues:**
- Fixed potential deadlocks when multiple threads access BlockList
simultaneously
- Moved mutex locks inside methods rather than holding across entire
function calls
- Added atomic size tracking for GC compatibility
- Fixed comparison function pointer handling

**IPC Race Conditions:**
- Fixed race condition where `SendQueue._onAfterIPCClosed()` could be
called on wrong thread
- Deferred cleanup operations to next tick using task queue
- Improved file descriptor ownership with proper `bun.take()` usage

**Structured Clone Compatibility:**
- Enhanced class serialization with proper transfer/storage mode
checking
- Fixed edge cases where non-transferable objects were incorrectly
handled
- Added better error reporting for unsupported clone operations

## Technical Notes

- Thread safety ensured via `String.isolatedCopy()` for cross-VM
transfers
- Memory cost calculation updated to account for string references
- Maintains full compatibility with existing structured clone semantics
- Does not affect object serialization or transfer lists
- Proper cleanup and error handling throughout IPC pipeline

---------

Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Meghan Denny <meghan@bun.sh>
2025-08-20 00:25:00 -07:00