bun.sh/test/js/web/workers/structured-clone.test.ts
robobun e7672b2d04 Add string fast path for postMessage and structuredClone (#21926)
## Summary

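Implements a string fast path for `postMessage` and `structuredClone`.
When the value being cloned is a plain string, the full structured clone
serializer is skipped, which makes large-string transfers up to ~660x
faster in the benchmarks below. Also includes thread-safety fixes for
`BlockList`, race-condition fixes for IPC, and structured clone
infrastructure changes.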

## Key Performance Improvements

**postMessage with Workers:**
- **Small strings (11 chars):** ~5% faster (572ns vs 599ns)
- **Medium strings (14KB):** **~2.7x faster** (528ns vs 1.40μs) 
- **Large strings (3MB):** **~660x faster** (540ns vs 356μs)

**Compared to Node.js postMessage:**
- Similar performance for small strings (572ns vs 570ns)
- ~2.8x faster for medium strings (528ns vs 1.46μs)
- **~455x faster** for large strings (540ns vs 245μs)

## Implementation Details

The optimization adds a **string fast path** that bypasses full
structured cloning serialization when:
- Input is a pure string (`value.isString()`)
- No transfer list or message ports are involved
- Not being stored persistently
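
The three conditions above can be sketched as a single predicate. This is a minimal TypeScript model, not Bun's C++ code: the `CloneRequest` shape and the `canUseStringFastPath` name are hypothetical and exist only for illustration.

```typescript
// Hypothetical model of the fast-path eligibility check.
// None of these names are Bun's internal API.
interface CloneRequest {
  value: unknown;
  transfer: readonly unknown[]; // transfer list (empty when nothing is transferred)
  ports: readonly unknown[]; // message ports attached to the message
  forStorage: boolean; // true when the result will be stored persistently
}

function canUseStringFastPath(req: CloneRequest): boolean {
  return (
    typeof req.value === "string" && // primitive strings only, not String objects
    req.transfer.length === 0 &&
    req.ports.length === 0 &&
    !req.forStorage
  );
}
```

Anything that fails the predicate would fall through to the regular serializer, so the fast path only has to handle the one case it was written for.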

### Core Changes

**String Thread-Safety Utilities (`BunString.cpp/h`):**
- `isCrossThreadShareable()` - Checks if string can be safely shared
across threads
- `toCrossThreadShareable()` - Converts strings to thread-safe form via
`isolatedCopy()`
- Handles edge cases: atoms, symbols, substring slices, external buffers

**Serialization Fast Path (`SerializedScriptValue.cpp`):**
- New `m_fastPathString` field stores string data directly
- Bypasses full object serialization machinery for pure strings
- Creates isolated copies for cross-thread safety

**Deserialization Fast Path:**
- Directly returns JSString from stored string data
- Avoids parsing serialized byte streams
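
As a rough illustration of the serialization and deserialization fast paths, the sketch below models a serialized value as a tagged union. The real change is in C++ (`SerializedScriptValue.cpp`); the `Serialized` type and both function names are assumptions, and JSON is used only as a stand-in for the general path (it is not structured clone).

```typescript
// Illustrative model only; not Bun's implementation.
type Serialized =
  | { kind: "fastPathString"; text: string } // mirrors the idea of m_fastPathString
  | { kind: "generic"; payload: string }; // stand-in for the serialized byte stream

function serializeModel(value: unknown): Serialized {
  if (typeof value === "string") {
    // Fast path: keep the string itself, skip the serializer entirely.
    return { kind: "fastPathString", text: value };
  }
  // Slow path stand-in: JSON here only marks where full serialization would run.
  return { kind: "generic", payload: JSON.stringify(value) };
}

function deserializeModel(serialized: Serialized): unknown {
  if (serialized.kind === "fastPathString") {
    return serialized.text; // no parsing step
  }
  return JSON.parse(serialized.payload);
}
```

In this model the string case never touches an encoder or a parser, which is the property the fast path relies on.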

**Updated Flags System (`JSValue.zig`, `Serialization.cpp`):**
- Replaces boolean `forTransfer` with structured `SerializedFlags`
- Supports `forCrossProcessTransfer` and `forStorage` distinctions
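
To show why a structured flags value is more expressive than one boolean, here is a hypothetical TypeScript mirror of the idea. Only the two field names come from this commit message; the interface, the `describeContext` helper, and the precedence between the two flags are assumptions made for the example.

```typescript
// Hypothetical mirror of the flags; field names follow the commit message,
// everything else is illustrative.
interface SerializedFlags {
  forCrossProcessTransfer: boolean;
  forStorage: boolean;
}

// A single `forTransfer` boolean can express only two contexts.
// Two flags can tell "sent to another process" apart from "stored".
function describeContext(flags: SerializedFlags): string {
  // Assumption: storage takes precedence when both flags are set.
  if (flags.forStorage) return "storage";
  if (flags.forCrossProcessTransfer) return "cross-process transfer";
  return "same-process clone";
}
```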

**Structured Clone Infrastructure:**
- Moved `structuredClone` implementation to dedicated
`StructuredClone.cpp`
- Added `jsFunctionStructuredCloneAdvanced` for testing with custom
flags
- Improved class serialization compatibility checks (`isForTransfer`,
`isForStorage`)

**IPC Improvements (`ipc.zig`):**
- Fixed race conditions in `SendQueue` by deferring cleanup to next tick
- Proper fd ownership handling with `bun.take()`
- Cached IPC serialize/parse functions for better performance

**BlockList Thread Safety Fixes (`BlockList.zig`):**
- Fixed potential deadlocks by moving mutex locking inside methods
- Added atomic `estimated_size` counter to avoid lock during GC
- Corrected pointer handling in comparison functions
- Improved GC safety in `rules()` method

## Benchmark Results

```
❯ bun-21926 bench/string-postmessage.mjs  # This branch
postMessage(11 chars string)  572.24 ns/iter
postMessage(14 KB string)     527.55 ns/iter  ← ~2.7x faster
postMessage(3 MB string)      539.70 ns/iter  ← ~660x faster

❯ bun-1.2.20 bench/string-postmessage.mjs  # Previous
postMessage(11 chars string)  598.76 ns/iter
postMessage(14 KB string)       1.40 µs/iter
postMessage(3 MB string)      356.38 µs/iter

❯ node bench/string-postmessage.mjs       # Node.js comparison  
postMessage(11 chars string)  569.63 ns/iter
postMessage(14 KB string)       1.46 µs/iter
postMessage(3 MB string)      245.46 µs/iter
```

**Key insight:** In these benchmarks the fast path takes **roughly
constant time** regardless of string size (528ns to 572ns from 11 chars
to 3 MB), while the previous serialization path gets slower as the
string grows.
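
The size-independence claim can be checked with a rough timing loop. The sketch below is not the `bench/string-postmessage.mjs` script from this commit: it times `structuredClone` in a single thread instead of `postMessage` to a worker, the `nsPerIter` helper is hypothetical, and the absolute numbers will vary by machine and runtime.

```typescript
// Rough timing sketch. It only shows how per-call cost can be compared
// across string sizes; it does not reproduce the numbers above.
function nsPerIter(fn: () => void, iterations = 500): number {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  // performance.now() is in milliseconds; convert to nanoseconds per call.
  return ((performance.now() - start) * 1e6) / iterations;
}

// Same three sizes as the benchmark: 11 chars, 14 KB, 3 MB.
const sizes = [11, 14 * 1024, 3 * 1024 * 1024];

const results = sizes.map(size => {
  const input = "a".repeat(size);
  return { size, ns: nsPerIter(() => structuredClone(input)) };
});

for (const { size, ns } of results) {
  console.log(`structuredClone(${size} chars): ${ns.toFixed(0)} ns/iter`);
}
```

On a runtime with the fast path, the three lines should print similar values; without it, the last line should be much larger than the first.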

## Test Coverage

**New Tests:**
- `test/js/web/structured-clone-fastpath.test.ts` - Fast path memory
usage validation
- `test/js/web/workers/structuredClone-classes.test.ts` - Comprehensive
class serialization tests
  - Tests ArrayBuffer transferability 
  - Tests BunFile cloning with storage/transfer restrictions
  - Tests net.BlockList cloning behavior
  - Validates different serialization contexts (default, worker, window)

**Enhanced Tests:**
- `test/js/web/workers/structured-clone.test.ts` - Multi-function
testing
  - Tests `structuredClone`, `jscSerializeRoundtrip`, and cross-process
serialization
  - Validates consistency across different serialization paths
- `test/js/node/cluster.test.ts` - Better error handling and debugging

**Benchmarks:**
- `bench/string-postmessage.mjs` - Worker postMessage performance
comparison
- `bench/string-fastpath.mjs` - Fast path vs traditional serialization
comparison

## Bug Fixes

**BlockList Threading Issues:**
- Fixed potential deadlocks when multiple threads access BlockList
simultaneously
- Moved mutex locks inside methods rather than holding across entire
function calls
- Added atomic size tracking for GC compatibility
- Fixed comparison function pointer handling

**IPC Race Conditions:**
- Fixed race condition where `SendQueue._onAfterIPCClosed()` could be
called on wrong thread
- Deferred cleanup operations to next tick using task queue
- Improved file descriptor ownership with proper `bun.take()` usage

**Structured Clone Compatibility:**
- Enhanced class serialization with proper transfer/storage mode
checking
- Fixed edge cases where non-transferable objects were incorrectly
handled
- Added better error reporting for unsupported clone operations

## Technical Notes

- Thread safety ensured via `String.isolatedCopy()` for cross-VM
transfers
- Memory cost calculation updated to account for string references
- Maintains full compatibility with existing structured clone semantics
- Does not affect object serialization or transfer lists
- Proper cleanup and error handling throughout IPC pipeline

---------

Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Meghan Denny <meghan@bun.sh>
2025-08-20 00:25:00 -07:00

import { deserialize, serialize } from "bun:jsc";
import { openSync } from "fs";
import { bunEnv } from "harness";
import { bunExe } from "js/bun/shell/test_builder";
import { join } from "path";

// Round-trips a value through bun:jsc serialize/deserialize in the current process.
function jscSerializeRoundtrip(value: any) {
  const serialized = serialize(value);
  const cloned = deserialize(serialized);
  return cloned;
}

// Same round trip, but a child Bun process deserializes and re-serializes the bytes,
// so the serialized form also has to survive crossing a process boundary.
function jscSerializeRoundtripCrossProcess(original: any) {
  const serialized = serialize(original);
  const result = Bun.spawnSync({
    cmd: [
      bunExe(),
      "-e",
      `
      import {deserialize, serialize} from "bun:jsc";
      const value = deserialize(await Bun.stdin.bytes());
      const reserialized = serialize(value);
      process.stdout.write(reserialized);
      `,
    ],
    env: bunEnv,
    stdin: serialized,
    stdout: "pipe",
    stderr: "inherit",
  });
  return deserialize(result.stdout);
}

for (const structuredCloneFn of [structuredClone, jscSerializeRoundtrip, jscSerializeRoundtripCrossProcess]) {
  describe(structuredCloneFn.name, () => {
    const primitives_tests = [
      { description: "primitive undefined", value: undefined },
      { description: "primitive null", value: null },
      { description: "primitive true", value: true },
      { description: "primitive false", value: false },
      { description: "primitive string, empty string", value: "" },
      { description: "primitive string, lone high surrogate", value: "\uD800" },
      { description: "primitive string, lone low surrogate", value: "\uDC00" },
      { description: "primitive string, NUL", value: "\u0000" },
      { description: "primitive string, astral character", value: "\uDBFF\uDFFD" },
      { description: "primitive number, 0.2", value: 0.2 },
      { description: "primitive number, 0", value: 0 },
      { description: "primitive number, -0", value: -0 },
      { description: "primitive number, NaN", value: NaN },
      { description: "primitive number, Infinity", value: Infinity },
      { description: "primitive number, -Infinity", value: -Infinity },
      { description: "primitive number, 9007199254740992", value: 9007199254740992 },
      { description: "primitive number, -9007199254740992", value: -9007199254740992 },
      { description: "primitive number, 9007199254740994", value: 9007199254740994 },
      { description: "primitive number, -9007199254740994", value: -9007199254740994 },
      { description: "primitive BigInt, 0n", value: 0n },
      { description: "primitive BigInt, -0n", value: -0n },
      { description: "primitive BigInt, -9007199254740994000n", value: -9007199254740994000n },
      {
        description: "primitive BigInt, -9007199254740994000900719925474099400090071992547409940009007199254740994000n",
        value: -9007199254740994000900719925474099400090071992547409940009007199254740994000n,
      },
    ];
    for (const { description, value } of primitives_tests) {
      test(description, () => {
        const cloned = structuredCloneFn(value);
        expect(cloned).toBe(value);
      });
    }
    test("Array with primitives", () => {
      const input = [
        undefined,
        null,
        true,
        false,
        "",
        "\uD800",
        "\uDC00",
        "\u0000",
        "\uDBFF\uDFFD",
        0.2,
        0,
        -0,
        NaN,
        Infinity,
        -Infinity,
        9007199254740992,
        -9007199254740992,
        9007199254740994,
        -9007199254740994,
        -12n,
        -0n,
        0n,
      ];
      const cloned = structuredCloneFn(input);
      expect(cloned).toBeInstanceOf(Array);
      expect(cloned).not.toBe(input);
      expect(cloned.length).toEqual(input.length);
      for (const x in input) {
        expect(cloned[x]).toBe(input[x]);
      }
    });
    test("Object with primitives", () => {
      const input: any = {
        undefined: undefined,
        null: null,
        true: true,
        false: false,
        empty: "",
        "high surrogate": "\uD800",
        "low surrogate": "\uDC00",
        nul: "\u0000",
        astral: "\uDBFF\uDFFD",
        "0.2": 0.2,
        "0": 0,
        "-0": -0,
        NaN: NaN,
        Infinity: Infinity,
        "-Infinity": -Infinity,
        "9007199254740992": 9007199254740992,
        "-9007199254740992": -9007199254740992,
        "9007199254740994": 9007199254740994,
        "-9007199254740994": -9007199254740994,
        "-12n": -12n,
        "-0n": -0n,
        "0n": 0n,
      };
      const cloned = structuredCloneFn(input);
      expect(cloned).toBeInstanceOf(Object);
      expect(cloned).not.toBeInstanceOf(Array);
      expect(cloned).not.toBe(input);
      for (const x in input) {
        expect(cloned[x]).toBe(input[x]);
      }
    });
    test("map", () => {
      const input = new Map();
      input.set("a", 1);
      input.set("b", 2);
      input.set("c", 3);
      const cloned = structuredCloneFn(input);
      expect(cloned).toBeInstanceOf(Map);
      expect(cloned).not.toBe(input);
      expect(cloned.size).toEqual(input.size);
      for (const [key, value] of input) {
        expect(cloned.get(key)).toBe(value);
      }
    });
    test("set", () => {
      const input = new Set();
      input.add("a");
      input.add("b");
      input.add("c");
      const cloned = structuredCloneFn(input);
      expect(cloned).toBeInstanceOf(Set);
      expect(cloned).not.toBe(input);
      expect(cloned.size).toEqual(input.size);
      for (const value of input) {
        expect(cloned.has(value)).toBe(true);
      }
    });
    describe("bun blobs work", () => {
      test("simple", async () => {
        const blob = new Blob(["hello"], { type: "application/octet-stream" });
        const cloned = structuredCloneFn(blob);
        await compareBlobs(blob, cloned);
      });
      test("empty", async () => {
        const emptyBlob = new Blob([], { type: "" });
        const clonedEmpty = structuredCloneFn(emptyBlob);
        await compareBlobs(emptyBlob, clonedEmpty);
      });
      test("empty with type", async () => {
        const emptyBlob = new Blob([], { type: "application/octet-stream" });
        const clonedEmpty = structuredCloneFn(emptyBlob);
        await compareBlobs(emptyBlob, clonedEmpty);
      });
      test("unknown type", async () => {
        const blob = new Blob(["hello type"], { type: "this is type" });
        const cloned = structuredCloneFn(blob);
        await compareBlobs(blob, cloned);
      });
      test("file from path", async () => {
        const blob = Bun.file(join(import.meta.dir, "example.txt"));
        const cloned = structuredCloneFn(blob);
        expect(cloned.lastModified).toBe(blob.lastModified);
        expect(cloned.name).toBe(blob.name);
        expect(cloned.size).toBe(blob.size);
      });
      test("file from fd", async () => {
        const fd = openSync(join(import.meta.dir, "example.txt"), "r");
        const blob = Bun.file(fd);
        const cloned = structuredCloneFn(blob);
        expect(cloned.lastModified).toBe(blob.lastModified);
        expect(cloned.name).toBe(blob.name);
        expect(cloned.size).toBe(blob.size);
      });
      // describe callbacks only register tests, so this one does not need to be async.
      describe("dom file", () => {
        test("without lastModified", async () => {
          const file = new File(["hi"], "example.txt", { type: "text/plain" });
          expect(file.lastModified).toBeGreaterThan(0);
          expect(file.name).toBe("example.txt");
          expect(file.size).toBe(2);
          const cloned = structuredCloneFn(file);
          expect(cloned.lastModified).toBe(file.lastModified);
          expect(cloned.name).toBe(file.name);
          expect(cloned.size).toBe(file.size);
        });
        test("with lastModified", async () => {
          const file = new File(["hi"], "example.txt", { type: "text/plain", lastModified: 123 });
          expect(file.lastModified).toBe(123);
          expect(file.name).toBe("example.txt");
          expect(file.size).toBe(2);
          const cloned = structuredCloneFn(file);
          expect(cloned.lastModified).toBe(123);
          expect(cloned.name).toBe(file.name);
          expect(cloned.size).toBe(file.size);
        });
      });
      test("unpaired high surrogate (invalid utf-8)", async () => {
        const blob = createBlob(encode_cesu8([0xd800]));
        const cloned = structuredCloneFn(blob);
        await compareBlobs(blob, cloned);
      });
      test("unpaired low surrogate (invalid utf-8)", async () => {
        const blob = createBlob(encode_cesu8([0xdc00]));
        const cloned = structuredCloneFn(blob);
        await compareBlobs(blob, cloned);
      });
      test("paired surrogates (invalid utf-8)", async () => {
        const blob = createBlob(encode_cesu8([0xd800, 0xdc00]));
        const cloned = structuredCloneFn(blob);
        await compareBlobs(blob, cloned);
      });
    });
    // These cases only apply to structuredClone itself, not the bun:jsc round trips.
    if (structuredCloneFn === structuredClone) {
      describe("net.BlockList works", () => {
        test("simple", () => {
          const net = require("node:net");
          const blocklist = new net.BlockList();
          blocklist.addAddress("123.123.123.123");
          const newlist = structuredCloneFn(blocklist);
          expect(newlist.check("123.123.123.123")).toBeTrue();
          expect(newlist.check("123.123.123.124")).toBeFalse();
          // This test expects an address added through the clone to be visible through the original too.
          newlist.addAddress("123.123.123.124");
          expect(blocklist.check("123.123.123.124")).toBeTrue();
          expect(newlist.check("123.123.123.124")).toBeTrue();
        });
      });
      describe("transferables", () => {
        test("ArrayBuffer", () => {
          const buffer = Uint8Array.from([1]).buffer;
          const cloned = structuredCloneFn(buffer, { transfer: [buffer] });
          expect(buffer.byteLength).toBe(0);
          expect(cloned.byteLength).toBe(1);
        });
        test("A detached ArrayBuffer cannot be transferred", () => {
          const buffer = new ArrayBuffer(2);
          structuredCloneFn(buffer, { transfer: [buffer] });
          expect(() => {
            structuredCloneFn(buffer, { transfer: [buffer] });
          }).toThrow(DOMException);
        });
        test("Transferring a non-transferable platform object fails", () => {
          const blob = new Blob();
          expect(() => {
            structuredCloneFn(blob, { transfer: [blob] });
          }).toThrow(DOMException);
        });
      });
    }
  });
}

// Asserts that two blobs are distinct objects with the same size, type, and bytes.
async function compareBlobs(original: Blob, cloned: Blob) {
  expect(cloned).toBeInstanceOf(Blob);
  expect(cloned).not.toBe(original);
  expect(cloned.size).toBe(original.size);
  expect(cloned.type).toBe(original.type);
  const ab1 = await new Response(cloned).arrayBuffer();
  const ab2 = await new Response(original).arrayBuffer();
  expect(ab1.byteLength).toBe(ab2.byteLength);
  const ta1 = new Uint8Array(ab1);
  const ta2 = new Uint8Array(ab2);
  for (let i = 0; i < ta1.length; i++) {
    expect(ta1[i]).toBe(ta2[i]);
  }
}

function encode_cesu8(codeunits: number[]): number[] {
  // http://www.unicode.org/reports/tr26/ section 2.2
  // only the 3-byte form is supported
  const rv: number[] = [];
  codeunits.forEach(function (codeunit) {
    rv.push(b("11100000") + ((codeunit & b("1111000000000000")) >> 12));
    rv.push(b("10000000") + ((codeunit & b("0000111111000000")) >> 6));
    rv.push(b("10000000") + (codeunit & b("0000000000111111")));
  });
  return rv;
}

// Parses a binary literal written as a string, e.g. b("101") === 5.
function b(s: string): number {
  return parseInt(s, 2);
}

// Builds a Blob whose bytes are exactly the given values.
function createBlob(arr: number[]): Blob {
  const buffer = new ArrayBuffer(arr.length);
  const view = new DataView(buffer);
  for (let i = 0; i < arr.length; i++) {
    view.setUint8(i, arr[i]);
  }
  return new Blob([view]);
}