Add a fast path for Blob constructor when processing contiguous arrays.
Instead of calling JSObject::getIndex() per element (which involves C++
function call overhead and property lookup), directly access the
butterfly memory when the array is a plain contiguous/int32 array with
a sane prototype chain.
Results on Apple M4 Max (release build vs system bun 1.3.6):
- 100 strings: 3.53µs → 2.24µs (1.58x faster)
- 1000 strings: 35.8µs → 21.7µs (1.65x faster)
- 10000 strings: 377µs → 220µs (1.72x faster)
- 100 mixed: 4.53µs → 3.35µs (1.35x faster)
The optimization falls back to the existing JSArrayIterator path for:
- Proxy/exotic objects
- Arrays with modified prototype chain
- Sparse arrays (ArrayStorage indexing)
- Arrays where mayInterceptIndexedAccesses() is true
Claude-Generated-By: Claude Code (cli/claude=100%)
Claude-Steers: 10
Claude-Permission-Prompts: 0
Claude-Escapes: 0
Claude-Plan:
<claude-plan>
# 配列イテレーション最適化: ContiguousArrayView
## 概要
JSC の contiguous array の butterfly メモリに直接アクセスすることで、配列の要素アクセスを最適化する。対象APIは `new Blob([...parts])` の multi-element パス。ベンチマークで効果を計測し、エッジケースのストレステストで安全性を検証する。
## 変更ファイル
| ファイル | 変更内容 |
|----------|----------|
| `src/bun.js/bindings/bindings.cpp` | C++ ヘルパー関数追加 |
| `src/bun.js/bindings/ContiguousArrayView.zig` | **新規** Zig側ビュー型 |
| `src/bun.js/jsc.zig` | 新モジュールのexport追加 |
| `src/bun.js/webcore/Blob.zig` | Blob constructor に fast path 導入 |
| `bench/snippets/blob-array-iteration.mjs` | **新規** ベンチマーク |
| `test/js/web/fetch/blob-array-fast-path.test.ts` | **新規** ストレステスト |
---
## Step 1: C++ ヘルパー関数
**ファイル**: `src/bun.js/bindings/bindings.cpp`
```cpp
extern "C" const JSC::EncodedJSValue* Bun__JSArray__getContiguousVector(
JSC::EncodedJSValue encodedValue,
JSC::JSGlobalObject* globalObject,
uint32_t* outLength)
{
JSC::JSValue value = JSC::JSValue::decode(encodedValue);
if (!value.isCell())
return nullptr;
JSC::JSCell* cell = value.asCell();
// Proxy, exotic object を排除
if (!isJSArray(cell))
return nullptr;
JSC::JSArray* array = jsCast<JSC::JSArray*>(cell);
JSC::IndexingType indexing = array->indexingType();
// hasInt32 / hasContiguous ヘルパーで判定
if (!hasInt32(indexing) && !hasContiguous(indexing))
return nullptr;
// prototype chain が健全で indexed access がインターセプトされていないか確認
if (!array->canDoFastIndexedAccess())
return nullptr;
// デバッグアサート
ASSERT(!globalObject->isHavingABadTime());
ASSERT(!array->structure()->mayInterceptIndexedAccesses());
JSC::Butterfly* butterfly = array->butterfly();
uint32_t length = butterfly->publicLength();
ASSERT(length <= butterfly->vectorLength());
if (length == 0)
return nullptr;
*outLength = length;
return reinterpret_cast<const JSC::EncodedJSValue*>(butterfly->contiguous().data());
}
```
**選択根拠**:
- `isJSArray()` — `inherits<JSArray>()` より高速で、JSCの標準パターン
- `canDoFastIndexedAccess()` — `arrayPrototypeChainIsSane()` + `mayInterceptIndexedAccesses()` + prototype検証を一括実施
- `isIteratorProtocolFastAndNonObservable()` は使わない(Symbol.iterator は関係ないため過度に制約的)
---
## Step 2: Zig ContiguousArrayView 型
**ファイル**: `src/bun.js/bindings/ContiguousArrayView.zig` (新規)
```zig
pub const ContiguousArrayView = struct {
elements: [*]const JSValue,
len: u32,
i: u32 = 0,
pub fn init(value: JSValue, global: *JSGlobalObject) ?ContiguousArrayView {
var length: u32 = 0;
const ptr = Bun__JSArray__getContiguousVector(value, global, &length);
if (ptr == null) return null;
return .{ .elements = @ptrCast(ptr.?), .len = length };
}
pub inline fn next(this: *ContiguousArrayView) ?JSValue {
if (this.i >= this.len) return null;
const val = this.elements[this.i];
this.i += 1;
if (val == .zero) return .js_undefined; // hole
return val;
}
extern fn Bun__JSArray__getContiguousVector(JSValue, *JSGlobalObject, *u32) ?[*]const JSValue;
};
```
---
## Step 3: jsc.zig に export 追加
**ファイル**: `src/bun.js/jsc.zig` (line 58 付近に追加)
```zig
pub const ContiguousArrayView = @import("./bindings/ContiguousArrayView.zig").ContiguousArrayView;
```
---
## Step 4: Blob constructor に fast path 導入
**ファイル**: `src/bun.js/webcore/Blob.zig` (line 3969 の `.Array, .DerivedArray` ブランチ)
変更前:
```zig
.Array, .DerivedArray => {
var iter = try jsc.JSArrayIterator.init(current, global);
try stack.ensureUnusedCapacity(iter.len);
var any_arrays = false;
while (try iter.next()) |item| {
...
}
},
```
変更後:
```zig
.Array, .DerivedArray => {
if (jsc.ContiguousArrayView.init(current, global)) |*fast_view| {
// Fast path: butterfly 直接アクセス
try stack.ensureUnusedCapacity(fast_view.len);
var any_arrays = false;
while (fast_view.next()) |item| {
if (item.isUndefinedOrNull()) continue;
if (!any_arrays) {
switch (item.jsTypeLoose()) {
// ... 既存のswitch分岐をそのまま保持 ...
}
}
stack.appendAssumeCapacity(item);
}
} else {
// Slow path fallback: 既存ロジック維持
var iter = try jsc.JSArrayIterator.init(current, global);
try stack.ensureUnusedCapacity(iter.len);
var any_arrays = false;
while (try iter.next()) |item| {
// ... 既存コード ...
}
}
},
```
**安全性**: fast path の内部で呼ばれる `toSlice()`, `asArrayBuffer()`, `item.as(Blob)` はいずれも入力配列を変更しない。butterfly ポインタは安定。
---
## Step 5: ベンチマーク
**ファイル**: `bench/snippets/blob-array-iteration.mjs` (新規)
```javascript
import { bench, run } from "../runner.mjs";
const N100 = Array.from({ length: 100 }, (_, i) => `chunk-${i}`);
const N1000 = Array.from({ length: 1000 }, (_, i) => `data-${i}`);
const N10000 = Array.from({ length: 10000 }, (_, i) => `x${i}`);
bench("new Blob([100 strings])", () => new Blob(N100));
bench("new Blob([1000 strings])", () => new Blob(N1000));
bench("new Blob([10000 strings])", () => new Blob(N10000));
// Mixed: strings + buffers
const mixed = [];
for (let i = 0; i < 100; i++) {
mixed.push(`text-${i}`);
mixed.push(new Uint8Array([i, i+1, i+2]));
}
bench("new Blob([100 strings + 100 buffers])", () => new Blob(mixed));
await run();
```
**実行方法**:
```bash
# リリースビルドのBun (最適化後)
bun run build && ./build/release/bun bench/snippets/blob-array-iteration.mjs
# システムの Bun (最適化前のベースライン)
bun bench/snippets/blob-array-iteration.mjs
```
---
## Step 6: ストレステスト
**ファイル**: `test/js/web/fetch/blob-array-fast-path.test.ts` (新規)
テストケース:
1. **基本ケース** — 文字列配列 → 結合結果検証
2. **大きな配列** — 10000要素 → 結合結果検証
3. **穴あり配列** — `[a, , b, , c]` → 穴はスキップ
4. **undefined/null要素** — スキップされること
5. **Proxy配列** — slow path フォールバック → 正しい結果
6. **prototype にゲッター** — `Object.defineProperty(Array.prototype, idx, {get})` → slow path フォールバック
7. **ネスト配列** — `[["a", "b"], "c"]` → 正しく展開
8. **Mixed types** — string + TypedArray + Blob → 正しく結合
9. **toString 副作用あり** — カスタムオブジェクトの toString → 正しい呼び出し順序
10. **空配列** — size === 0
11. **DerivedArray** — `class MyArray extends Array` → 動作保証
12. **COW (Copy-on-Write) 配列** — リテラル配列 `[1,2,3]` をそのまま渡す
13. **配列を凍結** — `Object.freeze([...])` → 正常動作
14. **sparse (ArrayStorage)** — `arr = []; arr[1000] = "x"` → slow path フォールバック正常動作
---
## Step 7: ビルドと検証
```bash
# 1. デバッグビルド + テスト実行
bun bd test test/js/web/fetch/blob-array-fast-path.test.ts
# 2. リリースビルド
bun run build
# 3. ベンチマーク (リリースビルド vs システムBun)
./build/release/bun bench/snippets/blob-array-iteration.mjs
bun bench/snippets/blob-array-iteration.mjs
```
---
## 期待効果
| 配列サイズ | イテレーション部分の改善 | 全体改善(推定) |
|-----------|------------------------|-----------------|
| 100要素 | ~10x (1.5μs → 0.15μs) | ~15-30% |
| 1000要素 | ~10x (15μs → 1.5μs) | ~20-40% |
| 10000要素 | ~10x (150μs → 15μs) | ~30-50% |
全体改善が10xにならない理由: per-element の `jsTypeLoose()` 呼び出し(C++) と `toSlice()` 処理が残るため。
</claude-plan>
Tests
Finding tests
Tests are located in the test/ directory and are organized using the following structure:
test/js/- tests for JavaScript APIs.cli/- tests for commands, configs, and stdout.bundler/- tests for the transpiler/bundler.regression/- tests that reproduce a specific issue.harness.ts- utility functions that can be imported from any test.
The tests in test/js/ directory are further categorized by the type of API.
test/js/bun/- tests forBun-specific APIs.node/- tests for Node.js APIs.web/- tests for Web APIs, likefetch().first_party/- tests for npm packages that are built-in, likeundici.third_party/- tests for npm packages that are not built-in, but are popular, likeesbuild.
Running tests
To run a test, use Bun's built-in test command: bun test.
bun test # Run all tests
bun test js/bun # Only run tests in a directory
bun test sqlite.test.ts # Only run a specific test
If you encounter lots of errors, try running bun install, then trying again.
Writing tests
Tests are written in TypeScript (preferred) or JavaScript using Jest's describe(), test(), and expect() APIs.
import { describe, test, expect } from "bun:test";
import { gcTick } from "harness";
describe("TextEncoder", () => {
test("can encode a string", async () => {
const encoder = new TextEncoder();
const actual = encoder.encode("bun");
await gcTick();
expect(actual).toBe(new Uint8Array([0x62, 0x75, 0x6E]));
});
});
If you are fixing a bug that was reported from a GitHub issue, remember to add a test in the test/regression/ directory.
// test/regression/issue/02005.test.ts
import { it, expect } from "bun:test";
it("regex literal should work with non-latin1", () => {
const text = "这是一段要替换的文字";
expect(text.replace(new RegExp("要替换"), "")).toBe("这是一段的文字");
expect(text.replace(/要替换/, "")).toBe("这是一段的文字");
});
In the future, a bot will automatically close or re-open issues when a regression is detected or resolved.
Zig tests
These tests live in various .zig files throughout Bun's codebase, leveraging Zig's builtin test keyword.
Currently, they're not run automatically nor is there a simple way to run all of them. We will make this better soon.
TypeScript
Test files should be written in TypeScript. The types in packages/bun-types should be updated to support all new APIs. Changes to the .d.ts files in packages/bun-types will be immediately reflected in test files; no build step is necessary.
Writing a test will often require using invalid syntax, e.g. when checking for errors when an invalid input is passed to a function. TypeScript provides a number of escape hatches here.
// @ts-expect-error- This should be your first choice. It tells TypeScript that the next line should fail typechecking.// @ts-ignore- Ignore the next line entirely.// @ts-nocheck- Put this at the top of the file to disable typechecking on the entire file. Useful for autogenerated test files, or when ignoring/disabling type checks an a per-line basis is too onerous.