Enable ReadableStream as stdin for Bun.spawn (#20582)

Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: jarred <jarred@bun.sh>
Co-authored-by: pfg <pfg@pfg.pw>
This commit is contained in:
Jarred Sumner
2025-06-27 19:42:03 -07:00
committed by GitHub
parent 7839844abb
commit 1d48f91b5e
16 changed files with 2039 additions and 59 deletions


@@ -113,6 +113,40 @@ proc.stdin.flush();
proc.stdin.end();
```
The `ReadableStream` option lets you pipe data from a JavaScript `ReadableStream` directly to the subprocess's input:
```ts
const stream = new ReadableStream({
start(controller) {
controller.enqueue("Hello from ");
controller.enqueue("ReadableStream!");
controller.close();
},
});
const proc = Bun.spawn(["cat"], {
stdin: stream,
stdout: "pipe",
});
const output = await new Response(proc.stdout).text();
console.log(output); // "Hello from ReadableStream!"
```
This is particularly useful for streaming data from HTTP responses, file streams, or any other source that provides a `ReadableStream`:
```ts
// Stream an HTTP response through a subprocess
const response = await fetch("https://example.com/large-file.txt");
const proc = Bun.spawn(["gzip"], {
stdin: response.body,
stdout: "pipe",
});
// Save the compressed output
await Bun.write("compressed.gz", proc.stdout);
```
## Output streams
You can read results from the subprocess via the `stdout` and `stderr` properties. By default these are instances of `ReadableStream`.

src/bun.js/STREAMS.md (new file, 396 lines)

@@ -0,0 +1,396 @@
# **Bun Streams Architecture: High-Performance I/O in JavaScript**
### **Table of Contents**
1. [**Overview & Core Philosophy**](#1-overview--core-philosophy)
2. [**Foundational Concepts**](#2-foundational-concepts)
- 2.1. The Stream Tagging System: Enabling Optimization
- 2.2. The `Body` Mixin: An Intelligent Gateway
3. [**Deep Dive: The Major Performance Optimizations**](#3-deep-dive-the-major-performance-optimizations)
- 3.1. Optimization 1: Synchronous Coercion - Eliminating Streams Entirely
- 3.2. Optimization 2: The Direct Path - Zero-Copy Native Piping
- 3.3. Optimization 3: `readMany()` - Efficient Async Iteration
4. [**Low-Level Implementation Details**](#4-low-level-implementation-details)
- 4.1. The Native Language: `streams.zig` Primitives
- 4.2. Native Sink In-Depth: `HTTPSResponseSink` and Buffering
- 4.3. The Native Collector: `Body.ValueBufferer`
- 4.4. Memory and String Optimizations
5. [**The Unified System: A Complete Data Flow Example**](#5-the-unified-system-a-complete-data-flow-example)
6. [**Conclusion**](#6-conclusion)
---
## 1. Overview & Core Philosophy
Streams in Bun make I/O performance in JavaScript competitive with lower-level languages like Go, Rust, and C, while presenting a fully WHATWG-compliant API.
The core philosophy is **"native-first, JS-fallback"**. Bun assumes that for many high-performance use cases, the JavaScript layer should act as a high-level controller for system-level I/O operations. We try to execute I/O operations with minimal abstraction cost, bypassing the JavaScript virtual machine entirely for performance-critical paths.
This document details the specific architectural patterns, from the JS/native boundary down to the I/O layer, that enable this level of performance.
## 2. Foundational Concepts
Bun's stream optimizations rest on two foundational concepts: the tagging system and the `Body` mixin's role as a state machine.
### 2.1. The Stream Tagging System: Enabling Optimization
Identifying the _source_ of a `ReadableStream` at the native level unlocks many optimization opportunities. This is achieved by "tagging" the stream object internally.
- **Mechanism:** Every `ReadableStream` in Bun holds a private field, `bunNativePtr`, which can point to a native Zig struct representing the stream's underlying source.
- **Identification:** A C++ binding, `ReadableStreamTag__tagged` (from `ReadableStream.zig`), is the primary entry point for this identification. When native code needs to consume a stream (e.g., when sending a `Response` body), it calls this function on the JS `ReadableStream` object to determine its origin.
```zig
// src/bun.js/webcore/ReadableStream.zig
pub const Tag = enum(i32) {
JavaScript = 0, // A generic, user-defined stream. This is the "slow path".
Blob = 1, // An in-memory blob. Fast path available.
File = 2, // Backed by a native file reader. Fast path available.
Bytes = 4, // Backed by a native network byte stream. Fast path available.
Direct = 3, // Internal native-to-native stream.
Invalid = -1,
};
```
This tag is the key that unlocks all subsequent optimizations. It allows the runtime to dispatch to the correct, most efficient implementation path.
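The dispatch this enables can be sketched in TypeScript. The tag values mirror `ReadableStream.zig`, but `consumeStream` and the path labels are illustrative names, not Bun APIs:

```ts
// Illustrative model of tag-based dispatch; values mirror ReadableStream.zig.
const StreamTag = {
  Invalid: -1,
  JavaScript: 0, // generic, user-defined stream: the slow path
  Blob: 1,       // in-memory blob: fast path available
  File: 2,       // native file reader: fast path available
  Direct: 3,     // internal native-to-native stream
  Bytes: 4,      // native network byte stream: fast path available
} as const;
type StreamTag = (typeof StreamTag)[keyof typeof StreamTag];

// Hypothetical consumer: pick the most efficient implementation per tag.
function consumeStream(tag: StreamTag): string {
  switch (tag) {
    case StreamTag.Blob:
    case StreamTag.File:
    case StreamTag.Bytes:
      return "native fast path"; // hand source/sink pointers to the Zig engine
    case StreamTag.Direct:
      return "native-to-native channel";
    case StreamTag.JavaScript:
      return "generic JS read() loop"; // WHATWG reader, chunk by chunk
    default:
      return "error: invalid stream";
  }
}

console.log(consumeStream(StreamTag.File)); // "native fast path"
```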
### 2.2. The `Body` Mixin: An Intelligent Gateway
The `Body` mixin (used by `Request` and `Response`) is not merely a stream container; it's a sophisticated state machine and the primary API gateway to Bun's optimization paths. A `Body`'s content is represented by the `Body.Value` union in Zig, which can be a static buffer (`.InternalBlob`, `.WTFStringImpl`) or a live stream (`.Locked`).
Methods like `.text()`, `.json()`, and `.arrayBuffer()` are not simple stream consumers. They are entry points to a decision tree that aggressively seeks the fastest possible way to fulfill the request.
```mermaid
stateDiagram-v2
direction TB
[*] --> StaticBuffer : new Response("hello")
state StaticBuffer {
[*] --> Ready
Ready : Data in memory
Ready : .WTFStringImpl | .InternalBlob
}
StaticBuffer --> Locked : Access .body
StaticBuffer --> Used : .text() ⚡
state Locked {
[*] --> Streaming
Streaming : ReadableStream created
Streaming : Tagged (File/Bytes/etc)
}
Locked --> Used : consume stream
Used --> [*] : Complete
note right of StaticBuffer
Fast Path
Skip streams entirely!
end note
note right of Locked
Slow Path
Full streaming
end note
classDef buffer fill:#fbbf24,stroke:#92400e,stroke-width:3px,color:#451a03
classDef stream fill:#60a5fa,stroke:#1e40af,stroke-width:3px,color:#172554
classDef final fill:#34d399,stroke:#14532d,stroke-width:3px,color:#052e16
class StaticBuffer buffer
class Locked stream
class Used final
```
**Diagram 1: `Body.Value` State Transitions**
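These transitions are observable from JavaScript with standard web APIs; this sketch runs on any WHATWG-compliant runtime, not just Bun:

```ts
// A freshly constructed Response holds a static buffer.
const fast = new Response("hello");

// Consuming it can go straight from buffer to string; afterwards the
// body state is Used.
const text = await fast.text();
console.log(text, fast.bodyUsed); // "hello" true

// Touching .body first forces the Locked (streaming) representation instead.
const slow = new Response("hello");
const stream = slow.body!; // a ReadableStream (tagged natively in Bun)
console.log(stream instanceof ReadableStream); // true

// The stream still drains to the same bytes, just via the streaming path.
const roundTripped = await new Response(stream).text();
console.log(roundTripped); // "hello"
```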
## 3. Deep Dive: The Major Performance Optimizations
### 3.1. Optimization 1: Synchronous Coercion - Eliminating Streams Entirely
This is the most impactful optimization for a vast number of common API and data processing tasks.
**The Conventional Problem:** In other JavaScript runtimes, consuming a response body with `.text()` is an inherently asynchronous, multi-step process involving the creation of multiple streams, readers, and promises, which incurs significant overhead.
**Bun's fast path:** Bun recognizes that in many real-world scenarios (e.g., small JSON API responses), the entire response body is already available in a single, contiguous memory buffer by the time the consuming method is called. It therefore **bypasses the entire stream processing model** and returns the buffer directly.
**Implementation Architecture & Data Flow:**
```mermaid
flowchart TB
A["response.text()"] --> B{Check Body Type}
B -->|"✅ Already Buffered<br/>(InternalBlob, etc.)"|C[⚡ FAST PATH]
B -->|"❌ Is Stream<br/>(.Locked)"|D[🐌 SLOW PATH]
subgraph fast[" "]
C --> C1[Get buffer pointer]
C1 --> C2[Decode to string]
C2 --> C3[Return resolved Promise]
end
subgraph slow[" "]
D --> D1[Create pending Promise]
D1 --> D2[Setup native buffering]
D2 --> D3[Collect all chunks]
D3 --> D4[Decode & resolve Promise]
end
C3 --> E["✨ Result available immediately<br/>(0 async operations)"]
D4 --> F["⏳ Result after I/O completes<br/>(multiple async operations)"]
style fast fill:#dcfce7,stroke:#166534,stroke-width:3px
style slow fill:#fee2e2,stroke:#991b1b,stroke-width:3px
style C fill:#22c55e,stroke:#166534,stroke-width:3px,color:#14532d
style C1 fill:#86efac,stroke:#166534,stroke-width:2px,color:#14532d
style C2 fill:#86efac,stroke:#166534,stroke-width:2px,color:#14532d
style C3 fill:#86efac,stroke:#166534,stroke-width:2px,color:#14532d
style D fill:#ef4444,stroke:#991b1b,stroke-width:3px,color:#ffffff
style D1 fill:#fca5a5,stroke:#991b1b,stroke-width:2px,color:#450a0a
style D2 fill:#fca5a5,stroke:#991b1b,stroke-width:2px,color:#450a0a
style D3 fill:#fca5a5,stroke:#991b1b,stroke-width:2px,color:#450a0a
style D4 fill:#fca5a5,stroke:#991b1b,stroke-width:2px,color:#450a0a
style E fill:#166534,stroke:#14532d,stroke-width:4px,color:#ffffff
style F fill:#dc2626,stroke:#991b1b,stroke-width:3px,color:#ffffff
```
**Diagram 2: Synchronous Coercion Logic Flow**
1. **Entry Point:** A JS call to `response.text()` triggers `readableStreamToText` (`ReadableStream.ts`), which immediately calls `tryUseReadableStreamBufferedFastPath`.
2. **Native Check:** `tryUseReadableStreamBufferedFastPath` calls the native binding `jsFunctionGetCompleteRequestOrResponseBodyValueAsArrayBuffer` (`Response.zig`).
3. **State Inspection:** This native function inspects the `Body.Value` tag. If the tag is `.InternalBlob`, `.Blob` (and not a disk-backed file), or `.WTFStringImpl`, the complete data is already in memory.
4. **Synchronous Data Transfer:** The function **synchronously** returns the underlying buffer as a native `ArrayBuffer` handle to JavaScript. The `Body` state is immediately transitioned to `.Used`. The buffer's ownership is often transferred (`.transfer` lifetime), avoiding a data copy.
5. **JS Resolution:** The JS layer receives a promise that is **already fulfilled** with the complete `ArrayBuffer`. It then performs the final conversion (e.g., `TextDecoder.decode()`) in a single step.
**Architectural Impact:** This optimization transforms a complex, multi-tick asynchronous operation into a single, synchronous native call followed by a single conversion step. The performance gain is an order of magnitude or more, as it eliminates the allocation and processing overhead of the entire stream and promise chain.
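Both paths can be exercised from user code with standard APIs: wrapping the body in a hand-written `ReadableStream` forces the generic `JavaScript` tag, so the same bytes must take the slow path:

```ts
// Fast-path candidate: the string body is already one contiguous buffer.
const buffered = new Response("abc");

// Slow path: a user-defined stream carries the generic JavaScript tag,
// so consumption must fall back to the buffering read loop.
const streamed = new Response(
  new ReadableStream({
    start(controller) {
      controller.enqueue(new TextEncoder().encode("abc"));
      controller.close();
    },
  }),
);

const fromBuffer = await buffered.text();
const fromStream = await streamed.text();
console.log(fromBuffer, fromStream); // "abc" "abc" - same result, more machinery
```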
### 3.2. Optimization 2: The Direct Path - Zero-Copy Native Piping
This optimization targets high-throughput scenarios like serving files or proxying requests, where both the data source and destination are native.
**The Conventional Problem:** Piping a file to an HTTP response in other runtimes involves a costly per-chunk round trip through the JavaScript layer: `Native (read) -> JS (chunk as Uint8Array) -> JS (response.write) -> Native (socket)`.
**Bun's direct path:** Bun's runtime inspects the source and sink of a pipe. If it identifies a compatible native pair, it establishes a direct data channel between them entirely within the native layer.
**Implementation Architecture & Data Flow:**
```mermaid
%%{init: {'theme':'base', 'themeVariables': {'primaryColor':'#2563eb','primaryTextColor':'#fff','primaryBorderColor':'#3b82f6','lineColor':'#94a3b8','secondaryColor':'#fbbf24','background':'#f8fafc','mainBkg':'#ffffff','secondBkg':'#f1f5f9'}}}%%
graph TD
subgraph " "
subgraph js["🟨 JavaScript Layer"]
C["📄 new Response(file.stream())"]
end
subgraph native["⚡ Native Layer (Zig)"]
A["💾 Disk I/O<br><b>FileReader Source</b>"]
B["🔌 Socket Buffer<br><b>HTTPSResponseSink</b>"]
A -."🚀 Zero-Copy View<br>streams.Result.temporary".-> B
B -."🔙 Backpressure Signal".-> A
end
end
B ==>|"📡 Send"|D["🌐 Network"]
C ==>|"Direct Native<br>Connection"|A
style js fill:#fef3c7,stroke:#92400e,stroke-width:3px,color:#451a03
style native fill:#dbeafe,stroke:#1e40af,stroke-width:3px,color:#172554
style A fill:#60a5fa,stroke:#1e40af,stroke-width:2px,color:#172554
style B fill:#60a5fa,stroke:#1e40af,stroke-width:2px,color:#172554
style C fill:#fbbf24,stroke:#92400e,stroke-width:2px,color:#451a03
style D fill:#22c55e,stroke:#166534,stroke-width:2px,color:#ffffff
classDef jsClass fill:#fef3c7,stroke:#f59e0b,stroke-width:2px
classDef nativeClass fill:#dbeafe,stroke:#3b82f6,stroke-width:2px
classDef networkClass fill:#d1fae5,stroke:#10b981,stroke-width:2px
```
**Diagram 3: Direct Path for File Serving**
1. **Scenario:** A server handler returns `new Response(Bun.file("video.mp4").stream())`.
2. **Tagging:** The stream is created with a `File` tag, and its `bunNativePtr` points to a native `webcore.FileReader` struct. The HTTP server's response sink is a native `HTTPSResponseSink`.
3. **Connection via `assignToStream`:** The server's internal logic triggers `assignToStream` (`ReadableStreamInternals.ts`). This function detects the native source via its tag and dispatches to `readDirectStream`.
4. **Native Handoff:** `readDirectStream` calls the C++ binding `$startDirectStream`, which passes pointers to the native `FileReader` (source) and `HTTPSResponseSink` (sink) to the Zig engine.
5. **Zero-Copy Native Data Flow:** The Zig layer takes over. The `FileReader` reads a chunk from the disk. It yields a `streams.Result.temporary` variant, which is a **zero-copy view** into a shared read buffer. This view is passed directly to the `HTTPSResponseSink.write()` method, which appends it to its internal socket write buffer. When possible, Bun skips the `FileReader` entirely and uses the `sendfile` system call, reducing the number of syscalls even further.
**Architectural Impact:**
- **No Per-Chunk JS Execution:** The JavaScript event loop is not involved in the chunk-by-chunk transfer.
- **Zero Intermediate Copies:** Data moves from the kernel's page cache directly to the network socket's send buffer.
- **Hardware-Limited Throughput:** This architecture removes the runtime as a bottleneck, allowing I/O performance to be limited primarily by hardware speed.
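From JavaScript, the closest visible analogue is `pipeTo`, which hands the transfer loop to the runtime; in Bun's direct path, the equivalent loop runs entirely in Zig. This sketch uses only standard APIs:

```ts
const source = new ReadableStream<string>({
  start(controller) {
    controller.enqueue("chunk1");
    controller.enqueue("chunk2");
    controller.close();
  },
});

const received: string[] = [];
const sink = new WritableStream<string>({
  write(chunk) {
    // In Bun's direct path this JS callback would not exist: the chunk view
    // moves from source to sink inside the native layer.
    received.push(chunk);
  },
});

// No per-chunk user code in the loop itself: the runtime drives the
// transfer and the backpressure handshake.
await source.pipeTo(sink);
console.log(received); // ["chunk1", "chunk2"]
```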
### 3.3. Optimization 3: `readMany()` - Efficient Async Iteration
Bun optimizes the standard `for-await-of` loop syntax for streams.
**The Conventional Problem:** A naive `[Symbol.asyncIterator]` implementation calls `await reader.read()` for every chunk, which is inefficient if many small chunks arrive in quick succession.
**Bun's Solution:** Bun provides a custom, non-standard `reader.readMany()` method that synchronously drains the stream's entire internal buffer into a JavaScript array.
**Implementation Architecture & Data Flow:**
```mermaid
flowchart TB
subgraph trad["Traditional for-await-of"]
direction TB
T1["🔄 for await (chunk of stream)"]
T2["await read() → chunk1"]
T3["Process chunk1"]
T4["await read() → chunk2"]
T5["Process chunk2"]
T6["await read() → chunk3"]
T7["..."]
T1 --> T2 --> T3 --> T4 --> T5 --> T6 --> T7
end
subgraph bun["Bun's readMany() Optimization"]
direction TB
B1["🚀 for await (chunks of stream)"]
B2["readMany()"]
B3{"Buffer<br/>Status?"}
B4["⚡ Return [c1, c2, c3]<br/>SYNCHRONOUS"]
B5["Process ALL chunks<br/>in one go"]
B6["await (only if empty)"]
B1 --> B2
B2 --> B3
B3 -->|"Has Data"|B4
B3 -->|"Empty"|B6
B4 --> B5
B5 --> B2
B6 --> B2
end
trad --> P1["❌ Performance Impact<br/>• Promise per chunk<br/>• await per chunk<br/>• High overhead"]
bun --> P2["✅ Performance Win<br/>• Batch processing<br/>• Minimal promises<br/>• Low overhead"]
style trad fill:#fee2e2,stroke:#7f1d1d,stroke-width:3px
style bun fill:#dcfce7,stroke:#14532d,stroke-width:3px
style T2 fill:#ef4444,stroke:#7f1d1d,color:#ffffff
style T4 fill:#ef4444,stroke:#7f1d1d,color:#ffffff
style T6 fill:#ef4444,stroke:#7f1d1d,color:#ffffff
style B4 fill:#22c55e,stroke:#14532d,stroke-width:3px,color:#ffffff
style B5 fill:#22c55e,stroke:#14532d,stroke-width:3px,color:#ffffff
style P1 fill:#dc2626,stroke:#7f1d1d,stroke-width:3px,color:#ffffff
style P2 fill:#16a34a,stroke:#14532d,stroke-width:3px,color:#ffffff
```
**Diagram 4: `readMany()` Async Iterator Flow**
**Architectural Impact:** This pattern coalesces multiple chunks into a single macro-task. It drastically reduces the number of promise allocations and `await` suspensions required to process a stream, leading to significantly lower CPU usage and higher throughput for chunked data processing.
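Because `readMany()` is a non-standard Bun extension (absent from other runtimes and from the DOM type definitions), a portable consumer can feature-detect it; the fallback below simply yields one-chunk batches:

```ts
// Drain a batch: use Bun's non-standard readMany() when available,
// otherwise fall back to a single-chunk "batch" via standard read().
async function readBatch<T>(
  reader: ReadableStreamDefaultReader<T>,
): Promise<{ value: T[]; done: boolean }> {
  const anyReader = reader as any;
  if (typeof anyReader.readMany === "function") {
    const result = await anyReader.readMany(); // may resolve synchronously
    return { value: result.value ?? [], done: result.done };
  }
  const { value, done } = await reader.read();
  return { value: done ? [] : [value as T], done };
}

const stream = new ReadableStream<string>({
  start(controller) {
    controller.enqueue("a");
    controller.enqueue("b");
    controller.close();
  },
});

const reader = stream.getReader();
let out = "";
for (;;) {
  const { value, done } = await readBatch(reader);
  out += value.join(""); // process the whole batch in one go
  if (done) break;
}
console.log(out); // "ab"
```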
## 4. Low-Level Implementation Details
The high-level optimizations are made possible by a robust and carefully designed native foundation in Zig.
### 4.1. The Native Language: `streams.zig` Primitives
The entire native architecture is built upon a set of generic, powerful Zig primitives that define the contracts for data flow.
- **`streams.Result` Union:** This is the universal data-carrying type for all native stream reads. Its variants are not just data containers; they are crucial signals from the source to the sink.
- `owned: bun.ByteList`: Represents a heap-allocated buffer. The receiver is now responsible for freeing this memory. This is used when data must outlive the current scope.
- `temporary: bun.ByteList`: A borrowed, read-only view into a source's internal buffer. This is the key to **zero-copy reads**, as the sink can process the data without taking ownership or performing a copy. It is only valid for the duration of the function call.
- `owned_and_done` / `temporary_and_done`: These variants bundle the final data chunk with the end-of-stream signal. This is a critical latency optimization, as it collapses two distinct events (data and close) into one, saving an I/O round trip.
- `into_array`: Used for BYOB (Bring-Your-Own-Buffer) readers. It contains a handle to the JS-provided `ArrayBufferView` (`value: JSValue`) and the number of bytes written (`len`). This enables a zero-copy write directly into JS-managed memory.
- `pending: *Pending`: A handle to a future/promise, used to signal that the result is not yet available and the operation should be suspended.
- **`streams.Signal` V-Table:** This struct provides a generic, type-erased interface (`start`, `ready`, `close`) for a sink to communicate backpressure and state changes to a source.
- **`start()`**: Tells the source to begin producing data.
- **`ready()`**: The sink calls this to signal it has processed data and is ready for more, effectively managing backpressure.
- **`close()`**: The sink calls this to tell the source to stop, either due to completion or an error.
This v-table decouples native components, allowing any native source to be connected to any native sink without direct knowledge of each other's concrete types, which is essential for the Direct Path optimization.
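The contract can be modeled in TypeScript. The method names follow `streams.Signal`; the source and sink here are illustrative toys, not Bun internals:

```ts
// Type-erased signalling contract, mirroring the streams.Signal v-table.
interface Signal {
  start(): void; // begin producing data
  ready(): void; // sink drained; source may produce more
  close(): void; // stop (completion or error)
}

// A toy source that pushes numbers into a write callback.
// write() returning false signals backpressure, like a full socket buffer.
class CounterSource implements Signal {
  private next = 0;
  private closed = false;
  private limit: number;
  private write: (chunk: number) => boolean;
  constructor(limit: number, write: (chunk: number) => boolean) {
    this.limit = limit;
    this.write = write;
  }
  start(): void { this.pump(); }
  ready(): void { this.pump(); }
  close(): void { this.closed = true; }
  private pump(): void {
    while (!this.closed && this.next < this.limit) {
      if (!this.write(this.next++)) return; // paused until ready()
    }
    this.closed = true;
  }
}

// A toy sink holding one chunk at a time, always reporting backpressure
// so every chunk forces a ready() round trip - the loop described above.
const drained: number[] = [];
const slot: { pending: number | null } = { pending: null };
const source = new CounterSource(3, (chunk) => {
  slot.pending = chunk;
  return false;
});

source.start();
while (slot.pending !== null) {
  drained.push(slot.pending);
  slot.pending = null;
  source.ready(); // sink tells the source it has drained
}
console.log(drained); // [0, 1, 2]
```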
### 4.2. Native Sink In-Depth: `HTTPSResponseSink` and Buffering
The `HTTPServerWritable` struct (instantiated as `HTTPSResponseSink` in `streams.zig`) is part of what makes Bun's HTTP server fast.
- **Intelligent Write Buffering:** The `write` method (`writeBytes`, `writeLatin1`, etc.) does not immediately issue a `write` syscall. It appends the incoming `streams.Result` slice to its internal `buffer: bun.ByteList`. This coalesces multiple small, high-frequency writes (common in streaming LLM responses or SSE) into a single, larger, more efficient syscall.
- **Backpressure Logic (`send` method):** The `send` method attempts to write the buffer to the underlying `uWebSockets` socket.
- It uses the optimized `res.tryEnd()` for the final chunk.
- If `res.write()` or `res.tryEnd()` returns a "backpressure" signal, the sink immediately sets `this.has_backpressure = true` and registers an `onWritable` callback.
- The `onWritable` callback is triggered by the OS/`uWebSockets` when the socket can accept more data. It clears the backpressure flag, attempts to send the rest of the buffered data, and then signals `ready()` back to the source stream via its `streams.Signal`. This creates a tight, efficient, native backpressure loop.
- **The Auto-Flusher (`onAutoFlush`):** This mechanism balances throughput against latency.
- **Mechanism:** When `write` is called but the `highWaterMark` is not reached, `registerAutoFlusher` queues a task to run after all JavaScript microtasks have completed.
- **Execution:** The event loop runs `onAutoFlush` at the very end of the current tick. It checks `!this.hasBackpressure()` and, if the buffer is non-empty, calls `sendWithoutAutoFlusher` to flush the buffered data.
- **Architectural Impact:** This allows multiple `writer.write()` calls within a single synchronous block of JS code to be batched into one syscall, but guarantees that the data is sent immediately after the current JS task completes, ensuring low, predictable latency for real-time applications.
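A JS-level analogue of this batching can be built with `queueMicrotask`; the real auto-flusher is scheduled natively on the event loop, so this class is purely illustrative:

```ts
// Coalesces synchronous writes, flushing once after the current microtask
// checkpoint - a sketch of the onAutoFlush behavior described above.
class AutoFlushWriter {
  private buffer: string[] = [];
  private scheduled = false;
  readonly flushed: string[] = []; // each entry stands in for one "syscall"

  write(chunk: string): void {
    this.buffer.push(chunk);
    if (!this.scheduled) {
      this.scheduled = true;
      queueMicrotask(() => this.flush()); // registerAutoFlusher analogue
    }
  }

  flush(): void {
    this.scheduled = false;
    if (this.buffer.length === 0) return;
    this.flushed.push(this.buffer.join("")); // one coalesced write
    this.buffer.length = 0;
  }
}

const writer = new AutoFlushWriter();
writer.write("event: ");
writer.write("message\n");
writer.write("data: hi\n\n");
// All three writes happen in one synchronous block...
await Promise.resolve(); // ...and are coalesced at the microtask checkpoint
console.log(writer.flushed); // ["event: message\ndata: hi\n\n"]
```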
### 4.3. The Native Collector: `Body.ValueBufferer`
When a consuming method like `.text()` is called on a body that cannot be resolved synchronously, the `Body.ValueBufferer` (`Body.zig`) is used to efficiently collect all chunks into a single native buffer.
- **Instantiation:** A `Body.ValueBufferer` is created with a callback, `onFinishedBuffering`, which will be invoked upon completion to resolve the original JS promise.
- **Native Piping (`onStreamPipe`):** For a `ByteStream` source, the bufferer sets itself as the `pipe` destination. The `ByteStream.onData` method, instead of interacting with JavaScript, now directly calls the bufferer's `onStreamPipe` function. This function appends the received `streams.Result` slice to its internal `stream_buffer`. The entire collection loop happens natively.
- **Completion:** When a chunk with the `_and_done` flag is received, `onStreamPipe` calls the `onFinishedBuffering` callback, passing the final, fully concatenated buffer. This callback then resolves the original JavaScript promise.
**Architectural Impact:** This pattern ensures that even when a body must be fully buffered, the collection process is highly efficient. Data chunks are concatenated in native memory without repeatedly crossing the JS boundary, minimizing overhead.
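The same collect-then-decode shape is visible from standard JavaScript: wrapping a stream in a `Response` and buffering it concatenates every chunk into one contiguous allocation before a single decode (in Bun, this collection is what `Body.ValueBufferer` performs natively):

```ts
const stream = new ReadableStream<Uint8Array>({
  start(controller) {
    const enc = new TextEncoder();
    controller.enqueue(enc.encode("hello "));
    controller.enqueue(enc.encode("world"));
    controller.close();
  },
});

// Chunks are concatenated into one buffer, then decoded exactly once.
const buffer = await new Response(stream).arrayBuffer();
const decoded = new TextDecoder().decode(buffer);
console.log(decoded); // "hello world"
```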
### 4.4. Memory and String Optimizations
- **`Blob` and `Blob.Store` (`Blob.zig`):** A `Blob` is a lightweight handle to a `Blob.Store`. The store can be backed by memory (`.bytes`), a file (`.file`), or an S3 object (`.s3`). This allows Bun to implement optimized operations based on the blob's backing store (e.g., `Bun.write(file1, file2)` becomes a native file copy via `copy_file.zig`).
- **`Blob.slice()` as a Zero-Copy View:** `blob.slice()` is a constant-time operation that creates a new `Blob` handle pointing to the same store but with a different `offset` and `size`, avoiding any data duplication.
- **`is_all_ascii` Flag:** `Blob`s and `ByteStream`s track whether their content is known to be pure ASCII. This allows `.text()` to skip expensive UTF-8 validation and decoding for a large class of text-based data, treating the Latin-1 bytes directly as a string.
- **`WTFStringImpl` Integration:** Bun avoids copying JS strings by default, instead storing a pointer to WebKit's internal `WTF::StringImpl` (`Body.Value.WTFStringImpl`). The conversion to a UTF-8 byte buffer is deferred until it's absolutely necessary (e.g., writing to a socket), avoiding copies for string-based operations that might never touch the network.
## 5. The Unified System: A Complete Data Flow Example
This diagram illustrates how the components work together when a `fetch` response is consumed.
```mermaid
%%{init: {'theme':'base', 'themeVariables': {'primaryColor':'#2563eb','primaryTextColor':'#fff','primaryBorderColor':'#3b82f6','lineColor':'#94a3b8','secondaryColor':'#fbbf24','tertiaryColor':'#a78bfa','background':'#f8fafc','mainBkg':'#ffffff','secondBkg':'#f1f5f9'}}}%%
graph TD
subgraph flow["🚀 Response Consumption Flow"]
A["📱 JS Code"] --> B{"🎯 response.text()"}
B --> C{"❓ Is Body<br>Buffered?"}
C -->|"✅ Yes"|D["⚡ Optimization 1<br>Sync Coercion"]
C -->|"❌ No"|E{"❓ Is Stream<br>Native?"}
D --> F(("📄 Final String"))
E -->|"✅ Yes"|G["🚀 Optimization 2<br>Direct Pipe to<br>Native ValueBufferer"]
E -->|"❌ No"|H["🐌 JS Fallback<br>read() loop"]
G --> I{"💾 Native<br>Buffering"}
H --> I
I --> J["🔤 Decode<br>Buffer"]
J --> F
end
subgraph Legend
direction LR
L1("🟨 JS Layer")
L2("🟦 Native Layer")
style L1 fill:#fef3c7,stroke:#f59e0b,stroke-width:2px,color:#92400e
style L2 fill:#dbeafe,stroke:#3b82f6,stroke-width:2px,color:#1e40af
end
style flow fill:#f8fafc,stroke:#64748b,stroke-width:2px
style A fill:#fef3c7,stroke:#f59e0b,stroke-width:2px,color:#92400e
style B fill:#fef3c7,stroke:#f59e0b,stroke-width:2px,color:#92400e
style H fill:#fee2e2,stroke:#ef4444,stroke-width:2px,color:#991b1b
style C fill:#e0e7ff,stroke:#6366f1,stroke-width:2px,color:#4338ca
style E fill:#e0e7ff,stroke:#6366f1,stroke-width:2px,color:#4338ca
style D fill:#dbeafe,stroke:#3b82f6,stroke-width:3px,color:#1e40af
style G fill:#dbeafe,stroke:#3b82f6,stroke-width:3px,color:#1e40af
style I fill:#e0e7ff,stroke:#6366f1,stroke-width:2px,color:#4338ca
style J fill:#e0e7ff,stroke:#6366f1,stroke-width:2px,color:#4338ca
style F fill:#d1fae5,stroke:#10b981,stroke-width:4px,color:#065f46
```
**Diagram 5: Unified Consumption Flow**
1. User calls `response.text()`.
2. Bun checks if the body is already fully buffered in memory.
3. **Path 1 (Fastest):** If yes, it performs the **Synchronous Coercion** optimization and returns a resolved promise.
4. **Path 2 (Fast):** If no, it checks the stream's tag. If it's a native source (`File`, `Bytes`), it uses the **Direct Path** to pipe the stream to a native `Body.ValueBufferer`.
5. **Path 3 (Slowest):** If it's a generic `JavaScript` stream, it falls back to a JS-based `read()` loop that pushes chunks to the `Body.ValueBufferer`.
6. Once the bufferer is full, the final buffer is decoded and the original promise is resolved.
## 6. Conclusion
Streams in Bun aggressively optimize common paths, while providing a fully WHATWG-compliant API.
- **Key Architectural Principle:** Dispatching between generic and optimized paths based on runtime type information (tagging) is the central strategy.
- **Primary Optimizations:** The **Synchronous Coercion Fast Path** and the **Direct Native Piping Path** are the two most significant innovations, eliminating entire layers of abstraction for common use cases.
- **Supporting Optimizations:** Efficient async iteration (`readMany`), intelligent sink-side buffering (`AutoFlusher`), and careful memory management (`owned` vs. `temporary` buffers, object pooling) contribute to a system that is fast at every level.
This deep integration between the native and JavaScript layers allows Bun to deliver performance that rivals, and in many cases exceeds, that of systems written in lower-level languages, without sacrificing the productivity and ecosystem of JavaScript.


@@ -23,6 +23,7 @@ pub const Stdio = union(enum) {
memfd: bun.FileDescriptor,
pipe,
ipc,
readable_stream: JSC.WebCore.ReadableStream,
const log = bun.sys.syslog;
@@ -78,6 +79,9 @@ pub const Stdio = union(enum) {
.memfd => |fd| {
fd.close();
},
.readable_stream => {
// ReadableStream cleanup is handled by the subprocess
},
else => {},
}
}
@@ -191,7 +195,7 @@ pub const Stdio = union(enum) {
break :brk .{ .buffer = {} };
},
.dup2 => .{ .dup2 = .{ .out = stdio.dup2.out, .to = stdio.dup2.to } },
.capture, .pipe, .array_buffer => .{ .buffer = {} },
.capture, .pipe, .array_buffer, .readable_stream => .{ .buffer = {} },
.ipc => .{ .ipc = {} },
.fd => |fd| .{ .pipe = fd },
.memfd => |fd| .{ .pipe = fd },
@@ -244,7 +248,7 @@ pub const Stdio = union(enum) {
break :brk .{ .buffer = bun.default_allocator.create(uv.Pipe) catch bun.outOfMemory() };
},
.ipc => .{ .ipc = bun.default_allocator.create(uv.Pipe) catch bun.outOfMemory() },
.capture, .pipe, .array_buffer => .{ .buffer = bun.default_allocator.create(uv.Pipe) catch bun.outOfMemory() },
.capture, .pipe, .array_buffer, .readable_stream => .{ .buffer = bun.default_allocator.create(uv.Pipe) catch bun.outOfMemory() },
.fd => |fd| .{ .pipe = fd },
.dup2 => .{ .dup2 = .{ .out = stdio.dup2.out, .to = stdio.dup2.to } },
.path => |pathlike| .{ .path = pathlike.slice() },
@@ -277,13 +281,67 @@ pub const Stdio = union(enum) {
pub fn isPiped(self: Stdio) bool {
return switch (self) {
.capture, .array_buffer, .blob, .pipe => true,
.capture, .array_buffer, .blob, .pipe, .readable_stream => true,
.ipc => Environment.isWindows,
else => false,
};
}
pub fn extract(out_stdio: *Stdio, globalThis: *JSC.JSGlobalObject, i: i32, value: JSValue) bun.JSError!void {
fn extractBodyValue(out_stdio: *Stdio, globalThis: *JSC.JSGlobalObject, i: i32, body: *JSC.WebCore.Body.Value, is_sync: bool) bun.JSError!void {
body.toBlobIfPossible();
if (body.tryUseAsAnyBlob()) |blob| {
return out_stdio.extractBlob(globalThis, blob, i);
}
switch (body.*) {
.Null, .Empty => {
out_stdio.* = .{ .ignore = {} };
return;
},
.Used => {
return globalThis.ERR(.BODY_ALREADY_USED, "Body already used", .{}).throw();
},
.Error => {
return globalThis.throwValue(body.Error.toJS(globalThis));
},
.Blob, .WTFStringImpl, .InternalBlob => unreachable, // handled above.
.Locked => {
if (is_sync) {
return globalThis.throwInvalidArguments("ReadableStream cannot be used in sync mode", .{});
}
switch (i) {
0 => {},
1 => {
return globalThis.throwInvalidArguments("ReadableStream cannot be used for stdout yet. For now, do .stdout", .{});
},
2 => {
return globalThis.throwInvalidArguments("ReadableStream cannot be used for stderr yet. For now, do .stderr", .{});
},
else => unreachable,
}
const stream_value = body.toReadableStream(globalThis);
if (globalThis.hasException()) {
return error.JSError;
}
const stream = JSC.WebCore.ReadableStream.fromJS(stream_value, globalThis) orelse return globalThis.throwInvalidArguments("Failed to create ReadableStream", .{});
if (stream.isDisturbed(globalThis)) {
return globalThis.ERR(.BODY_ALREADY_USED, "ReadableStream has already been used", .{}).throw();
}
out_stdio.* = .{ .readable_stream = stream };
},
}
return;
}
pub fn extract(out_stdio: *Stdio, globalThis: *JSC.JSGlobalObject, i: i32, value: JSValue, is_sync: bool) bun.JSError!void {
if (value == .zero) return;
if (value.isUndefined()) return;
if (value.isNull()) {
@@ -346,34 +404,36 @@ pub const Stdio = union(enum) {
} else if (value.as(JSC.WebCore.Blob)) |blob| {
return out_stdio.extractBlob(globalThis, .{ .Blob = blob.dupe() }, i);
} else if (value.as(JSC.WebCore.Request)) |req| {
req.getBodyValue().toBlobIfPossible();
return out_stdio.extractBlob(globalThis, req.getBodyValue().useAsAnyBlob(), i);
} else if (value.as(JSC.WebCore.Response)) |req| {
req.getBodyValue().toBlobIfPossible();
return out_stdio.extractBlob(globalThis, req.getBodyValue().useAsAnyBlob(), i);
} else if (JSC.WebCore.ReadableStream.fromJS(value, globalThis)) |req_const| {
var req = req_const;
if (i == 0) {
if (req.toAnyBlob(globalThis)) |blob| {
return out_stdio.extractBlob(globalThis, blob, i);
}
return extractBodyValue(out_stdio, globalThis, i, req.getBodyValue(), is_sync);
} else if (value.as(JSC.WebCore.Response)) |res| {
return extractBodyValue(out_stdio, globalThis, i, res.getBodyValue(), is_sync);
}
switch (req.ptr) {
.File, .Blob => {
globalThis.throwTODO("Support fd/blob backed ReadableStream in spawn stdin. See https://github.com/oven-sh/bun/issues/8049") catch {};
return error.JSError;
},
.Direct, .JavaScript, .Bytes => {
// out_stdio.* = .{ .connect = req };
globalThis.throwTODO("Re-enable ReadableStream support in spawn stdin. ") catch {};
return error.JSError;
},
.Invalid => {
return globalThis.throwInvalidArguments("ReadableStream is in invalid state.", .{});
},
}
if (JSC.WebCore.ReadableStream.fromJS(value, globalThis)) |stream_| {
var stream = stream_;
if (stream.toAnyBlob(globalThis)) |blob| {
return out_stdio.extractBlob(globalThis, blob, i);
}
} else if (value.asArrayBuffer(globalThis)) |array_buffer| {
const name: []const u8 = switch (i) {
0 => "stdin",
1 => "stdout",
2 => "stderr",
else => unreachable,
};
if (is_sync) {
return globalThis.throwInvalidArguments("'{s}' ReadableStream cannot be used in sync mode", .{name});
}
if (stream.isDisturbed(globalThis)) {
return globalThis.ERR(.INVALID_STATE, "'{s}' ReadableStream has already been used", .{name}).throw();
}
out_stdio.* = .{ .readable_stream = stream };
return;
}
if (value.asArrayBuffer(globalThis)) |array_buffer| {
// Change in Bun v1.0.34: don't throw for empty ArrayBuffer
if (array_buffer.byteSlice().len == 0) {
out_stdio.* = .{ .ignore = {} };


@@ -56,7 +56,8 @@ pub const Flags = packed struct(u8) {
has_stdin_destructor_called: bool = false,
finalized: bool = false,
deref_on_stdin_destroyed: bool = false,
_: u3 = 0,
is_stdin_a_readable_stream: bool = false,
_: u2 = 0,
};
pub const SignalCode = bun.SignalCode;
@@ -446,6 +447,7 @@ const Readable = union(enum) {
.pipe => Readable{ .pipe = PipeReader.create(event_loop, process, result, max_size) },
.array_buffer, .blob => Output.panic("TODO: implement ArrayBuffer & Blob support in Stdio readable", .{}),
.capture => Output.panic("TODO: implement capture support in Stdio readable", .{}),
.readable_stream => Readable{ .ignore = {} }, // ReadableStream is handled separately
};
}
@@ -1269,16 +1271,17 @@ const Writable = union(enum) {
pub fn onStart(_: *Writable) void {}
pub fn init(
stdio: Stdio,
stdio: *Stdio,
event_loop: *JSC.EventLoop,
subprocess: *Subprocess,
result: StdioResult,
promise_for_stream: *JSC.JSValue,
) !Writable {
assertStdioResult(result);
if (Environment.isWindows) {
switch (stdio) {
.pipe => {
switch (stdio.*) {
.pipe, .readable_stream => {
if (result == .buffer) {
const pipe = JSC.WebCore.FileSink.createWithPipe(event_loop, result.buffer);
@@ -1287,6 +1290,9 @@ const Writable = union(enum) {
.err => |err| {
_ = err; // autofix
pipe.deref();
if (stdio.* == .readable_stream) {
stdio.readable_stream.cancel(event_loop.global);
}
return error.UnexpectedCreatingStdin;
},
}
@@ -1296,6 +1302,16 @@ const Writable = union(enum) {
subprocess.flags.deref_on_stdin_destroyed = true;
subprocess.flags.has_stdin_destructor_called = false;
if (stdio.* == .readable_stream) {
const assign_result = pipe.assignToStream(&stdio.readable_stream, event_loop.global);
if (assign_result.toError()) |err| {
pipe.deref();
subprocess.deref();
return event_loop.global.throwValue(err);
}
promise_for_stream.* = assign_result;
}
return Writable{
.pipe = pipe,
};
@@ -1332,14 +1348,14 @@ const Writable = union(enum) {
}
if (comptime Environment.isPosix) {
if (stdio == .pipe) {
if (stdio.* == .pipe) {
_ = bun.sys.setNonblocking(result.?);
}
}
switch (stdio) {
switch (stdio.*) {
.dup2 => @panic("TODO dup2 stdio"),
.pipe => {
.pipe, .readable_stream => {
const pipe = JSC.WebCore.FileSink.create(event_loop, result.?);
switch (pipe.writer.start(pipe.fd, true)) {
@@ -1347,16 +1363,30 @@ const Writable = union(enum) {
.err => |err| {
_ = err; // autofix
pipe.deref();
if (stdio.* == .readable_stream) {
stdio.readable_stream.cancel(event_loop.global);
}
return error.UnexpectedCreatingStdin;
},
}
pipe.writer.handle.poll.flags.insert(.socket);
subprocess.weak_file_sink_stdin_ptr = pipe;
subprocess.ref();
subprocess.flags.has_stdin_destructor_called = false;
subprocess.flags.deref_on_stdin_destroyed = true;
pipe.writer.handle.poll.flags.insert(.socket);
if (stdio.* == .readable_stream) {
const assign_result = pipe.assignToStream(&stdio.readable_stream, event_loop.global);
if (assign_result.toError()) |err| {
pipe.deref();
subprocess.deref();
return event_loop.global.throwValue(err);
}
promise_for_stream.* = assign_result;
}
return Writable{
.pipe = pipe,
@@ -1407,7 +1437,7 @@ const Writable = union(enum) {
// https://github.com/oven-sh/bun/pull/14092
bun.debugAssert(!subprocess.flags.deref_on_stdin_destroyed);
const debug_ref_count = if (Environment.isDebug) subprocess.ref_count else 0;
pipe.onAttachedProcessExit();
pipe.onAttachedProcessExit(&subprocess.process.status);
if (Environment.isDebug) {
bun.debugAssert(subprocess.ref_count.active_counts == debug_ref_count.active_counts);
}
@@ -1522,6 +1552,7 @@ pub fn onProcessExit(this: *Subprocess, process: *Process, status: bun.spawn.Sta
this.pid_rusage = rusage.*;
const is_sync = this.flags.is_sync;
this.clearAbortSignal();
defer this.deref();
defer this.disconnectIPC(true);
@@ -1532,7 +1563,7 @@ pub fn onProcessExit(this: *Subprocess, process: *Process, status: bun.spawn.Sta
jsc_vm.onSubprocessExit(process);
var stdin: ?*JSC.WebCore.FileSink = this.weak_file_sink_stdin_ptr;
var stdin: ?*JSC.WebCore.FileSink = if (this.stdin == .pipe and this.flags.is_stdin_a_readable_stream) this.stdin.pipe else this.weak_file_sink_stdin_ptr;
var existing_stdin_value = JSC.JSValue.zero;
if (this_jsvalue != .zero) {
if (JSC.Codegen.JSSubprocess.stdinGetCached(this_jsvalue)) |existing_value| {
@@ -1542,7 +1573,9 @@ pub fn onProcessExit(this: *Subprocess, process: *Process, status: bun.spawn.Sta
stdin = @alignCast(@ptrCast(JSC.WebCore.FileSink.JSSink.fromJS(existing_value)));
}
existing_stdin_value = existing_value;
if (!this.flags.is_stdin_a_readable_stream) {
existing_stdin_value = existing_value;
}
}
}
}
@@ -1580,8 +1613,9 @@ pub fn onProcessExit(this: *Subprocess, process: *Process, status: bun.spawn.Sta
if (stdin) |pipe| {
this.weak_file_sink_stdin_ptr = null;
this.flags.has_stdin_destructor_called = true;
// It is okay if it does call deref() here, as in that case it was truly ref'd.
pipe.onAttachedProcessExit();
pipe.onAttachedProcessExit(&status);
}
var did_update_has_pending_activity = false;
@@ -2054,7 +2088,7 @@ pub fn spawnMaybeSync(
var stdio_iter = try stdio_val.arrayIterator(globalThis);
var i: u31 = 0;
while (try stdio_iter.next()) |value| : (i += 1) {
try stdio[i].extract(globalThis, i, value);
try stdio[i].extract(globalThis, i, value, is_sync);
if (i == 2)
break;
}
@@ -2062,7 +2096,7 @@ pub fn spawnMaybeSync(
while (try stdio_iter.next()) |value| : (i += 1) {
var new_item: Stdio = undefined;
try new_item.extract(globalThis, i, value);
try new_item.extract(globalThis, i, value, is_sync);
const opt = switch (new_item.asSpawnOption(i)) {
.result => |opt| opt,
@@ -2081,15 +2115,15 @@ pub fn spawnMaybeSync(
}
} else {
if (try args.get(globalThis, "stdin")) |value| {
try stdio[0].extract(globalThis, 0, value);
try stdio[0].extract(globalThis, 0, value, is_sync);
}
if (try args.get(globalThis, "stderr")) |value| {
try stdio[2].extract(globalThis, 2, value);
try stdio[2].extract(globalThis, 2, value, is_sync);
}
if (try args.get(globalThis, "stdout")) |value| {
try stdio[1].extract(globalThis, 1, value);
try stdio[1].extract(globalThis, 1, value, is_sync);
}
}
@@ -2355,16 +2389,19 @@ pub fn spawnMaybeSync(
MaxBuf.createForSubprocess(subprocess, &subprocess.stderr_maxbuf, maxBuffer);
MaxBuf.createForSubprocess(subprocess, &subprocess.stdout_maxbuf, maxBuffer);
var promise_for_stream: JSC.JSValue = .zero;
// When run synchronously, subprocess isn't garbage collected
subprocess.* = Subprocess{
.globalThis = globalThis,
.process = process,
.pid_rusage = null,
.stdin = Writable.init(
stdio[0],
&stdio[0],
loop,
subprocess,
spawned.stdin,
&promise_for_stream,
) catch {
subprocess.deref();
return globalThis.throwOutOfMemory();
@@ -2408,6 +2445,27 @@ pub fn spawnMaybeSync(
subprocess.process.setExitHandler(subprocess);
promise_for_stream.ensureStillAlive();
subprocess.flags.is_stdin_a_readable_stream = promise_for_stream != .zero;
if (promise_for_stream != .zero and !globalThis.hasException()) {
if (promise_for_stream.toError()) |err| {
_ = globalThis.throwValue(err) catch {};
}
}
if (globalThis.hasException()) {
const err = globalThis.takeException(error.JSError);
// Ensure we kill the process so we don't leave things in an unexpected state.
_ = subprocess.tryKill(subprocess.killSignal);
if (globalThis.hasException()) {
return error.JSError;
}
return globalThis.throwValue(err);
}
var posix_ipc_info: if (Environment.isPosix) IPC.Socket else void = undefined;
if (Environment.isPosix and !is_sync) {
if (maybe_ipc_mode) |mode| {
@@ -2441,7 +2499,7 @@ pub fn spawnMaybeSync(
ipc_data.writeVersionPacket(globalThis);
}
if (subprocess.stdin == .pipe) {
if (subprocess.stdin == .pipe and promise_for_stream == .zero) {
subprocess.stdin.pipe.signal = JSC.WebCore.streams.Signal.init(&subprocess.stdin);
}
@@ -2473,6 +2531,10 @@ pub fn spawnMaybeSync(
JSC.Codegen.JSSubprocess.ipcCallbackSetCached(out, globalThis, ipc_callback);
}
if (stdio[0] == .readable_stream) {
JSC.Codegen.JSSubprocess.stdinSetCached(out, globalThis, stdio[0].readable_stream.value);
}
switch (subprocess.process.watch()) {
.result => {},
.err => {

View File

@@ -4423,6 +4423,10 @@ GlobalObject::PromiseFunctions GlobalObject::promiseHandlerID(Zig::FFIFunction h
return GlobalObject::PromiseFunctions::Bun__FileStreamWrapper__onResolveRequestStream;
} else if (handler == Bun__FileStreamWrapper__onRejectRequestStream) {
return GlobalObject::PromiseFunctions::Bun__FileStreamWrapper__onRejectRequestStream;
} else if (handler == Bun__FileSink__onResolveStream) {
return GlobalObject::PromiseFunctions::Bun__FileSink__onResolveStream;
} else if (handler == Bun__FileSink__onRejectStream) {
return GlobalObject::PromiseFunctions::Bun__FileSink__onRejectStream;
} else {
RELEASE_ASSERT_NOT_REACHED();
}

View File

@@ -379,8 +379,10 @@ public:
Bun__S3UploadStream__onResolveRequestStream,
Bun__FileStreamWrapper__onRejectRequestStream,
Bun__FileStreamWrapper__onResolveRequestStream,
Bun__FileSink__onResolveStream,
Bun__FileSink__onRejectStream,
};
static constexpr size_t promiseFunctionsSize = 34;
static constexpr size_t promiseFunctionsSize = 36;
static PromiseFunctions promiseHandlerID(SYSV_ABI EncodedJSValue (*handler)(JSC::JSGlobalObject* arg0, JSC::CallFrame* arg1));

View File

@@ -688,6 +688,9 @@ BUN_DECLARE_HOST_FUNCTION(Bun__HTTPRequestContext__onResolveStream);
BUN_DECLARE_HOST_FUNCTION(Bun__NodeHTTPRequest__onResolve);
BUN_DECLARE_HOST_FUNCTION(Bun__NodeHTTPRequest__onReject);
BUN_DECLARE_HOST_FUNCTION(Bun__FileSink__onResolveStream);
BUN_DECLARE_HOST_FUNCTION(Bun__FileSink__onRejectStream);
#endif
#ifdef __cplusplus

View File

@@ -24,6 +24,9 @@ fd: bun.FileDescriptor = bun.invalid_fd,
auto_flusher: webcore.AutoFlusher = .{},
run_pending_later: FlushPendingTask = .{},
/// Currently, only used when `stdin` in `Bun.spawn` is a ReadableStream.
readable_stream: JSC.WebCore.ReadableStream.Strong = .{},
const log = Output.scoped(.FileSink, false);
pub const RefCount = bun.ptr.RefCount(FileSink, "ref_count", deinit, .{});
@@ -72,13 +75,31 @@ comptime {
@export(&Bun__ForceFileSinkToBeSynchronousForProcessObjectStdio, .{ .name = "Bun__ForceFileSinkToBeSynchronousForProcessObjectStdio" });
}
pub fn onAttachedProcessExit(this: *FileSink) void {
pub fn onAttachedProcessExit(this: *FileSink, status: *const bun.spawn.Status) void {
log("onAttachedProcessExit()", .{});
this.done = true;
var readable_stream = this.readable_stream;
this.readable_stream = .{};
if (readable_stream.has()) {
if (this.event_loop_handle.globalObject()) |global| {
if (readable_stream.get(global)) |*stream| {
if (!status.isOK()) {
const event_loop = global.bunVM().eventLoop();
event_loop.enter();
defer event_loop.exit();
stream.cancel(global);
} else {
stream.done(global);
}
}
}
// Clean up the readable stream reference
readable_stream.deinit();
}
this.writer.close();
this.pending.result = .{ .err = .fromCode(.PIPE, .write) };
this.runPending();
if (this.must_be_kept_alive_until_eof) {
@@ -181,6 +202,14 @@ pub fn onReady(this: *FileSink) void {
pub fn onClose(this: *FileSink) void {
log("onClose()", .{});
if (this.readable_stream.has()) {
if (this.event_loop_handle.globalObject()) |global| {
if (this.readable_stream.get(global)) |stream| {
stream.done(global);
}
}
}
this.signal.close(null);
}
@@ -225,6 +254,11 @@ pub fn create(
}
pub fn setup(this: *FileSink, options: *const FileSink.Options) JSC.Maybe(void) {
if (this.readable_stream.has()) {
// Already started.
return .{ .result = {} };
}
const result = bun.io.openForWriting(
bun.FileDescriptor.cwd(),
options.input_path,
@@ -414,6 +448,7 @@ pub fn flushFromJS(this: *FileSink, globalThis: *JSGlobalObject, wait: bool) JSC
}
pub fn finalize(this: *FileSink) void {
this.readable_stream.deinit();
this.pending.deinit();
this.deref();
}
@@ -495,6 +530,7 @@ pub fn end(this: *FileSink, _: ?bun.sys.Error) JSC.Maybe(void) {
fn deinit(this: *FileSink) void {
this.pending.deinit();
this.writer.deinit();
this.readable_stream.deinit();
if (this.event_loop_handle.globalObject()) |global| {
webcore.AutoFlusher.unregisterDeferredMicrotaskWithType(@This(), this, global.bunVM());
}
@@ -611,6 +647,98 @@ pub const FlushPendingTask = struct {
}
};
/// Does not ref or unref.
fn handleResolveStream(this: *FileSink, globalThis: *JSC.JSGlobalObject) void {
if (this.readable_stream.get(globalThis)) |*stream| {
stream.done(globalThis);
}
if (!this.done) {
this.writer.close();
}
}
/// Does not ref or unref.
fn handleRejectStream(this: *FileSink, globalThis: *JSC.JSGlobalObject, _: JSC.JSValue) void {
if (this.readable_stream.get(globalThis)) |*stream| {
stream.abort(globalThis);
this.readable_stream.deinit();
}
if (!this.done) {
this.writer.close();
}
}
fn onResolveStream(globalThis: *JSC.JSGlobalObject, callframe: *JSC.CallFrame) bun.JSError!JSC.JSValue {
log("onResolveStream", .{});
var args = callframe.arguments();
var this: *@This() = args[args.len - 1].asPromisePtr(@This());
defer this.deref();
this.handleResolveStream(globalThis);
return .js_undefined;
}
fn onRejectStream(globalThis: *JSC.JSGlobalObject, callframe: *JSC.CallFrame) bun.JSError!JSC.JSValue {
log("onRejectStream", .{});
const args = callframe.arguments();
var this = args[args.len - 1].asPromisePtr(@This());
const err = args[0];
defer this.deref();
this.handleRejectStream(globalThis, err);
return .js_undefined;
}
pub fn assignToStream(this: *FileSink, stream: *JSC.WebCore.ReadableStream, globalThis: *JSGlobalObject) JSC.JSValue {
var signal = &this.signal;
signal.* = JSC.WebCore.FileSink.JSSink.SinkSignal.init(JSValue.zero);
this.ref();
defer this.deref();
// explicitly set it to a dead pointer
// we use this memory address to disable signals being sent
signal.clear();
this.readable_stream = .init(stream.*, globalThis);
const promise_result = JSC.WebCore.FileSink.JSSink.assignToStream(globalThis, stream.value, this, @as(**anyopaque, @ptrCast(&signal.ptr)));
if (promise_result.toError()) |err| {
this.readable_stream.deinit();
this.readable_stream = .{};
return err;
}
if (!promise_result.isEmptyOrUndefinedOrNull()) {
if (promise_result.asAnyPromise()) |promise| {
switch (promise.status(globalThis.vm())) {
.pending => {
this.writer.enableKeepingProcessAlive(this.event_loop_handle);
this.ref();
promise_result.then(globalThis, this, onResolveStream, onRejectStream);
},
.fulfilled => {
// These don't ref().
this.handleResolveStream(globalThis);
},
.rejected => {
// These don't ref().
this.handleRejectStream(globalThis, promise.result(globalThis.vm()));
},
}
}
}
return promise_result;
}
comptime {
const export_prefix = "Bun__FileSink";
if (bun.Environment.export_cpp_apis) {
@export(&JSC.toJSHostFn(onResolveStream), .{ .name = export_prefix ++ "__onResolveStream" });
@export(&JSC.toJSHostFn(onRejectStream), .{ .name = export_prefix ++ "__onRejectStream" });
}
}
const std = @import("std");
const bun = @import("bun");
const uv = bun.windows.libuv;

View File

@@ -162,8 +162,8 @@ function header() {
static size_t memoryCost(void* sinkPtr);
void* m_sinkPtr;
mutable WriteBarrier<JSC::Unknown> m_onPull;
mutable WriteBarrier<JSC::Unknown> m_onClose;
mutable WriteBarrier<JSC::JSObject> m_onPull;
mutable WriteBarrier<JSC::JSObject> m_onClose;
mutable JSC::Weak<JSObject> m_weakReadableStream;
uintptr_t m_onDestroy { 0 };
@@ -825,8 +825,16 @@ DEFINE_VISIT_CHILDREN(${className});
void ${controller}::start(JSC::JSGlobalObject *globalObject, JSC::JSValue readableStream, JSC::JSValue onPull, JSC::JSValue onClose) {
this->m_weakReadableStream = JSC::Weak<JSC::JSObject>(readableStream.getObject());
this->m_onPull.set(globalObject->vm(), this, onPull);
this->m_onClose.set(globalObject->vm(), this, onClose);
if (onPull) {
if (auto* object = onPull.getObject()) {
this->m_onPull.set(globalObject->vm(), this, object);
}
}
if (onClose) {
if (auto* object = onClose.getObject()) {
this->m_onClose.set(globalObject->vm(), this, object);
}
}
}
void ${className}::destroy(JSCell* cell)

View File

@@ -149,7 +149,7 @@ pub const ShellSubprocess = struct {
if (Environment.isWindows) {
switch (stdio) {
.pipe => {
.pipe, .readable_stream => {
if (result == .buffer) {
const pipe = JSC.WebCore.FileSink.createWithPipe(event_loop, result.buffer);
@@ -236,6 +236,10 @@ pub const ShellSubprocess = struct {
.ipc, .capture => {
return Writable{ .ignore = {} };
},
.readable_stream => {
// The shell never uses this
@panic("Unimplemented stdin readable_stream");
},
}
}
@@ -247,7 +251,7 @@ pub const ShellSubprocess = struct {
.pipe => |pipe| {
this.* = .{ .ignore = {} };
if (subprocess.process.hasExited() and !subprocess.flags.has_stdin_destructor_called) {
pipe.onAttachedProcessExit();
pipe.onAttachedProcessExit(&subprocess.process.status);
return pipe.toJS(globalThis);
} else {
subprocess.flags.has_stdin_destructor_called = false;
@@ -384,6 +388,7 @@ pub const ShellSubprocess = struct {
return readable;
},
.capture => Readable{ .pipe = PipeReader.create(event_loop, process, result, shellio, out_type) },
.readable_stream => Readable{ .ignore = {} }, // Shell doesn't use readable_stream
};
}
@@ -405,6 +410,7 @@ pub const ShellSubprocess = struct {
return readable;
},
.capture => Readable{ .pipe = PipeReader.create(event_loop, process, result, shellio, out_type) },
.readable_stream => Readable{ .ignore = {} }, // Shell doesn't use readable_stream
};
}

View File

@@ -46,7 +46,7 @@ const words: Record<string, { reason: string; limit?: number; regex?: boolean }>
"global.hasException": { reason: "Incompatible with strict exception checks. Use a CatchScope instead.", limit: 28 },
"globalObject.hasException": { reason: "Incompatible with strict exception checks. Use a CatchScope instead.", limit: 47 },
"globalThis.hasException": { reason: "Incompatible with strict exception checks. Use a CatchScope instead.", limit: 136 },
"globalThis.hasException": { reason: "Incompatible with strict exception checks. Use a CatchScope instead.", limit: 140 },
};
const words_keys = [...Object.keys(words)];

View File

@@ -0,0 +1,479 @@
/**
* Edge case tests for spawn with ReadableStream stdin.
*
* **IMPORTANT**: Many of these tests use `await` in ReadableStream constructors
* (e.g., `await Bun.sleep(0)`, `await 42`) to prevent Bun from optimizing
* the ReadableStream into a Blob. When a ReadableStream is synchronous and
* contains only string/buffer data, Bun may normalize it to a Blob for
* performance reasons. The `await` ensures the stream remains truly streaming
* and tests the actual ReadableStream code paths in spawn.
*/
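To illustrate the caveat above outside the test harness, here is a minimal sketch (the `readAll` helper is defined here for illustration and is not part of this suite): a stream whose data is all enqueued synchronously in `start` may be normalized eagerly, while an `await` in `pull` guarantees the chunks flow through the real streaming path.

```typescript
// Illustrative helper: drain a ReadableStream of strings into one string.
async function readAll(stream: ReadableStream<string>): Promise<string> {
  const reader = stream.getReader();
  let out = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    out += value;
  }
  return out;
}

// All data is available synchronously; Bun may treat this like a Blob.
const syncStream = new ReadableStream<string>({
  start(controller) {
    controller.enqueue("data");
    controller.close();
  },
});

// The `await` yields to the event loop, so this stays a "real" stream.
const asyncStream = new ReadableStream<string>({
  async pull(controller) {
    await 0; // any await works, e.g. Bun.sleep(0)
    controller.enqueue("data");
    controller.close();
  },
});
```

Both streams produce the same bytes; only the code path inside spawn differs.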
import { spawn } from "bun";
import { describe, expect, test } from "bun:test";
import { bunEnv, bunExe } from "harness";
describe("spawn stdin ReadableStream edge cases", () => {
test("ReadableStream with exception in pull", async () => {
let pullCount = 0;
const stream = new ReadableStream({
pull(controller) {
pullCount++;
if (pullCount === 1) {
controller.enqueue("chunk 1\n");
} else if (pullCount === 2) {
controller.enqueue("chunk 2\n");
throw new Error("Pull error");
}
},
});
const proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
// Should receive data before the exception
expect(text).toContain("chunk 1\n");
expect(text).toContain("chunk 2\n");
});
test("ReadableStream writing after process closed", async () => {
let writeAttempts = 0;
let errorOccurred = false;
const stream = new ReadableStream({
async pull(controller) {
writeAttempts++;
if (writeAttempts <= 10) {
await Bun.sleep(100);
try {
controller.enqueue(`attempt ${writeAttempts}\n`);
} catch (e) {
errorOccurred = true;
throw e;
}
} else {
controller.close();
}
},
});
// Use a command that exits quickly after reading one line
const proc = spawn({
cmd: [
bunExe(),
"-e",
`const readline = require('readline');
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
terminal: false
});
rl.on('line', (line) => {
console.log(line);
process.exit(0);
});`,
],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
await proc.exited;
// Give time for more pull attempts
await Bun.sleep(500);
// The stream should have attempted multiple writes but only the first succeeded
expect(writeAttempts).toBeGreaterThanOrEqual(1);
expect(text).toBe("attempt 1\n");
});
test("ReadableStream with mixed types", async () => {
const stream = new ReadableStream({
start(controller) {
// String
controller.enqueue("text ");
// Uint8Array
controller.enqueue(new TextEncoder().encode("binary "));
// ArrayBuffer
const buffer = new ArrayBuffer(5);
const view = new Uint8Array(buffer);
view.set([100, 97, 116, 97, 32]); // "data "
controller.enqueue(buffer);
// Another string
controller.enqueue("end");
controller.close();
},
});
const proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe("text binary data end");
expect(await proc.exited).toBe(0);
});
test("ReadableStream with process consuming data slowly", async () => {
const chunks: string[] = [];
for (let i = 0; i < 10; i++) {
chunks.push(`chunk ${i}\n`);
}
let currentChunk = 0;
const stream = new ReadableStream({
pull(controller) {
if (currentChunk < chunks.length) {
controller.enqueue(chunks[currentChunk]);
currentChunk++;
} else {
controller.close();
}
},
});
// Use a script that reads slowly
const proc = spawn({
cmd: [
bunExe(),
"-e",
`
const readline = require('readline');
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
terminal: false
});
rl.on('line', async (line) => {
await Bun.sleep(10);
console.log(line);
});
`,
],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
const lines = text.trim().split("\n");
expect(lines.length).toBe(10);
for (let i = 0; i < 10; i++) {
expect(lines[i]).toBe(`chunk ${i}`);
}
expect(await proc.exited).toBe(0);
});
test.todo("ReadableStream with cancel callback verification", async () => {
let cancelReason: any = null;
let cancelCalled = false;
const stream = new ReadableStream({
start(controller) {
// Start sending data
let count = 0;
const interval = setInterval(() => {
count++;
try {
controller.enqueue(`data ${count}\n`);
} catch (e) {
clearInterval(interval);
}
}, 50);
// Store interval for cleanup
(controller as any).interval = interval;
},
cancel(reason) {
cancelCalled = true;
cancelReason = reason;
// Clean up interval if exists
if ((this as any).interval) {
clearInterval((this as any).interval);
}
},
});
// Kill the process after some data
const proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
// Wait a bit then kill
await Bun.sleep(150);
proc.kill();
try {
await proc.exited;
} catch (e) {
// Expected - process was killed
}
// Give time for cancel to be called
await Bun.sleep(50);
expect(cancelCalled).toBe(true);
});
test("ReadableStream with high frequency small chunks", async () => {
const totalChunks = 1000;
let sentChunks = 0;
const stream = new ReadableStream({
pull(controller) {
// Send multiple small chunks per pull
for (let i = 0; i < 10 && sentChunks < totalChunks; i++) {
controller.enqueue(`${sentChunks}\n`);
sentChunks++;
}
if (sentChunks >= totalChunks) {
controller.close();
}
},
});
const proc = spawn({
cmd: [
bunExe(),
"-e",
`let count = 0;
const readline = require('readline');
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
terminal: false
});
rl.on('line', () => count++);
rl.on('close', () => console.log(count));`,
],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(parseInt(text.trim())).toBe(totalChunks);
expect(await proc.exited).toBe(0);
});
test("ReadableStream with several pulls", async () => {
let pullCount = 0;
const stream = new ReadableStream({
pull(controller) {
pullCount++;
if (pullCount <= 5) {
// Enqueue data larger than high water mark
controller.enqueue(Buffer.alloc(1024, "x"));
} else {
controller.close();
}
},
});
const proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe("x".repeat(1024 * 5));
expect(await proc.exited).toBe(0);
// TODO: this is not quite right. But it's still good to have
expect(pullCount).toBe(6);
});
test("ReadableStream reuse prevention", async () => {
const stream = new ReadableStream({
start(controller) {
controller.enqueue("test data");
controller.close();
},
});
// First use
const proc1 = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text1 = await new Response(proc1.stdout).text();
expect(text1).toBe("test data");
expect(await proc1.exited).toBe(0);
// Second use should fail
expect(() => {
spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
env: bunEnv,
});
}).toThrow();
});
test("ReadableStream with byte stream", async () => {
const data = new Uint8Array(256);
for (let i = 0; i < 256; i++) {
data[i] = i;
}
const stream = new ReadableStream({
type: "bytes",
start(controller) {
// Enqueue as byte chunks
controller.enqueue(data.slice(0, 128));
controller.enqueue(data.slice(128, 256));
controller.close();
},
});
const proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const buffer = await new Response(proc.stdout).arrayBuffer();
const result = new Uint8Array(buffer);
expect(result).toEqual(data);
expect(await proc.exited).toBe(0);
});
test("ReadableStream with stdin and other pipes", async () => {
const stream = new ReadableStream({
start(controller) {
controller.enqueue("stdin data");
controller.close();
},
});
// Create a script that also writes to stdout and stderr
const script = `
process.stdin.on('data', (data) => {
process.stdout.write('stdout: ' + data);
process.stderr.write('stderr: ' + data);
});
`;
const proc = spawn({
cmd: [bunExe(), "-e", script],
stdin: stream,
stdout: "pipe",
stderr: "pipe",
env: bunEnv,
});
const [stdout, stderr] = await Promise.all([new Response(proc.stdout).text(), new Response(proc.stderr).text()]);
expect(stdout).toBe("stdout: stdin data");
expect(stderr).toBe("stderr: stdin data");
expect(await proc.exited).toBe(0);
});
test("ReadableStream with very long single chunk", async () => {
// Create a chunk larger than typical pipe buffer (64KB on most systems)
const size = 256 * 1024; // 256KB
const chunk = "a".repeat(size);
const stream = new ReadableStream({
start(controller) {
controller.enqueue(chunk);
controller.close();
},
});
const proc = spawn({
cmd: [
bunExe(),
"-e",
`let count = 0;
process.stdin.on('data', (chunk) => count += chunk.length);
process.stdin.on('end', () => console.log(count));`,
],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(parseInt(text.trim())).toBe(size);
expect(await proc.exited).toBe(0);
});
test("ReadableStream with alternating data types", async () => {
const stream = new ReadableStream({
async pull(controller) {
await Bun.sleep(0);
// Alternate between strings and Uint8Arrays
controller.enqueue("string1 ");
controller.enqueue(new TextEncoder().encode("binary1 "));
controller.enqueue("string2 ");
controller.enqueue(new TextEncoder().encode("binary2"));
controller.close();
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe("string1 binary1 string2 binary2");
expect(await proc.exited).toBe(0);
});
test("ReadableStream with spawn options variations", async () => {
// Test with different spawn configurations
const configs = [
{ stdout: "pipe", stderr: "ignore" },
{ stdout: "pipe", stderr: "pipe" },
{ stdout: "pipe", stderr: "inherit" },
];
for (const config of configs) {
const stream = new ReadableStream({
async pull(controller) {
await Bun.sleep(0);
controller.enqueue("test input");
controller.close();
},
});
const proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
...config,
env: bunEnv,
});
const stdout = await new Response(proc.stdout).text();
expect(stdout).toBe("test input");
expect(await proc.exited).toBe(0);
}
});
});

View File

@@ -0,0 +1,181 @@
import { spawn } from "bun";
import { describe, expect, test } from "bun:test";
import { bunEnv, bunExe } from "harness";
describe("spawn stdin ReadableStream integration", () => {
test("example from documentation", async () => {
const stream = new ReadableStream({
async pull(controller) {
await Bun.sleep(1);
controller.enqueue("some data from a stream");
controller.close();
},
});
const proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
console.log(text); // "some data from a stream"
expect(text).toBe("some data from a stream");
});
test("piping HTTP response to process", async () => {
using server = Bun.serve({
port: 0,
async fetch(req) {
return new Response(async function* () {
yield "Line 1\n";
yield "Line 2\n";
yield "Line 3\n";
});
},
});
// Count lines using Bun subprocess
const proc = spawn({
cmd: [
bunExe(),
"-e",
/*js*/ `
let count = 0;
const readline = require('readline');
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
terminal: false
});
rl.on('line', () => count++);
rl.on('close', () => console.log(count));`,
],
stdin: await fetch(server.url),
stdout: "pipe",
env: bunEnv,
});
const output = await new Response(proc.stdout).text();
expect(parseInt(output.trim())).toBe(3);
});
test("transforming data before passing to process", async () => {
// Original data stream
const dataStream = new ReadableStream({
async pull(controller) {
await Bun.sleep(1);
controller.enqueue("hello world");
controller.enqueue("\n");
controller.enqueue("foo bar");
controller.close();
},
});
// Transform to uppercase
const upperCaseTransform = new TransformStream({
transform(chunk, controller) {
controller.enqueue(chunk.toUpperCase());
},
});
// Pipe through transform then to process
const transformedStream = dataStream.pipeThrough(upperCaseTransform);
const proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: transformedStream,
stdout: "pipe",
env: bunEnv,
});
const result = await new Response(proc.stdout).text();
expect(result).toBe("HELLO WORLD\nFOO BAR");
});
test("streaming large file through process", async () => {
// Simulate streaming a large file in chunks
const chunkSize = 1024;
const numChunks = 100;
let currentChunk = 0;
const fileStream = new ReadableStream({
pull(controller) {
if (currentChunk < numChunks) {
// Simulate file chunk
controller.enqueue(`Chunk ${currentChunk}: ${"x".repeat(chunkSize - 20)}\n`);
currentChunk++;
} else {
controller.close();
}
},
});
// Process the stream (just echo it for cross-platform compatibility)
const proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: fileStream,
stdout: "pipe",
env: bunEnv,
});
const result = await new Response(proc.stdout).text();
const lines = result.trim().split("\n");
expect(lines.length).toBe(numChunks);
expect(lines[0]).toStartWith("Chunk 0:");
expect(lines[99]).toStartWith("Chunk 99:");
});
test("real-time data processing", async () => {
let dataPoints = 0;
const maxDataPoints = 5;
// Simulate real-time data stream
const dataStream = new ReadableStream({
async pull(controller) {
if (dataPoints < maxDataPoints) {
const timestamp = Date.now();
const value = Math.random() * 100;
controller.enqueue(`${timestamp},${value.toFixed(2)}\n`);
dataPoints++;
// Simulate real-time delay
await Bun.sleep(10);
} else {
controller.close();
}
},
});
// Process the CSV data using Bun
const proc = spawn({
cmd: [
bunExe(),
"-e",
`let sum = 0, count = 0;
const readline = require('readline');
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
terminal: false
});
rl.on('line', (line) => {
const [_, value] = line.split(',');
sum += parseFloat(value);
count++;
});
rl.on('close', () => console.log(sum / count));`,
],
stdin: dataStream,
stdout: "pipe",
env: bunEnv,
});
const avgStr = await new Response(proc.stdout).text();
const avg = parseFloat(avgStr.trim());
// Average should be between 0 and 100
expect(avg).toBeGreaterThanOrEqual(0);
expect(avg).toBeLessThanOrEqual(100);
});
});


@@ -0,0 +1,23 @@
import { spawnSync } from "bun";
import { describe, expect, test } from "bun:test";
import { bunExe } from "harness";
describe("spawnSync with ReadableStream stdin", () => {
test("spawnSync should throw", () => {
const stream = new ReadableStream({
      async start(controller) {
        await 42; // force an async start so the stream is not optimized into a blob
controller.enqueue("test data");
controller.close();
},
});
expect(() =>
spawnSync({
cmd: [bunExe()],
stdin: stream,
stdout: "pipe",
}),
).toThrowErrorMatchingInlineSnapshot(`"'stdin' ReadableStream cannot be used in sync mode"`);
});
});


@@ -0,0 +1,590 @@
import { spawn } from "bun";
import { describe, expect, mock, test } from "bun:test";
import { bunEnv, bunExe, expectMaxObjectTypeCount, isASAN, isCI } from "harness";
describe("spawn stdin ReadableStream", () => {
test("basic ReadableStream as stdin", async () => {
const stream = new ReadableStream({
start(controller) {
controller.enqueue("hello from stream");
controller.close();
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe("hello from stream");
expect(await proc.exited).toBe(0);
});
test("ReadableStream with multiple chunks", async () => {
const chunks = ["chunk1\n", "chunk2\n", "chunk3\n"];
const stream = new ReadableStream({
start(controller) {
for (const chunk of chunks) {
controller.enqueue(chunk);
}
controller.close();
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe(chunks.join(""));
expect(await proc.exited).toBe(0);
});
test("ReadableStream with Uint8Array chunks", async () => {
const encoder = new TextEncoder();
const stream = new ReadableStream({
start(controller) {
controller.enqueue(encoder.encode("binary "));
controller.enqueue(encoder.encode("data "));
controller.enqueue(encoder.encode("stream"));
controller.close();
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe("binary data stream");
expect(await proc.exited).toBe(0);
});
test("ReadableStream with delays between chunks", async () => {
const stream = new ReadableStream({
async start(controller) {
controller.enqueue("first\n");
await Bun.sleep(50);
controller.enqueue("second\n");
await Bun.sleep(50);
controller.enqueue("third\n");
controller.close();
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe("first\nsecond\nthird\n");
expect(await proc.exited).toBe(0);
});
test("ReadableStream with pull method", async () => {
let pullCount = 0;
const stream = new ReadableStream({
pull(controller) {
pullCount++;
if (pullCount <= 3) {
controller.enqueue(`pull ${pullCount}\n`);
} else {
controller.close();
}
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe("pull 1\npull 2\npull 3\n");
expect(await proc.exited).toBe(0);
});
test("ReadableStream with async pull and delays", async () => {
let pullCount = 0;
const stream = new ReadableStream({
async pull(controller) {
pullCount++;
if (pullCount <= 3) {
await Bun.sleep(30);
controller.enqueue(`async pull ${pullCount}\n`);
} else {
controller.close();
}
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe("async pull 1\nasync pull 2\nasync pull 3\n");
expect(await proc.exited).toBe(0);
});
test("ReadableStream with large data", async () => {
const largeData = "x".repeat(1024 * 1024); // 1MB
const stream = new ReadableStream({
start(controller) {
controller.enqueue(largeData);
controller.close();
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe(largeData);
expect(await proc.exited).toBe(0);
});
test("ReadableStream with very large chunked data", async () => {
const chunkSize = 64 * 1024; // 64KB chunks
const numChunks = 16; // 1MB total
let pushedChunks = 0;
const stream = new ReadableStream({
pull(controller) {
if (pushedChunks < numChunks) {
controller.enqueue("x".repeat(chunkSize));
pushedChunks++;
} else {
controller.close();
}
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text.length).toBe(chunkSize * numChunks);
expect(text).toBe("x".repeat(chunkSize * numChunks));
expect(await proc.exited).toBe(0);
});
test.todo("ReadableStream cancellation when process exits early", async () => {
let cancelled = false;
let chunksEnqueued = 0;
const stream = new ReadableStream({
async pull(controller) {
// Keep enqueueing data slowly
await Bun.sleep(50);
chunksEnqueued++;
controller.enqueue(`chunk ${chunksEnqueued}\n`);
},
cancel(_reason) {
cancelled = true;
},
});
await using proc = spawn({
cmd: [
bunExe(),
"-e",
`const readline = require('readline');
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
terminal: false
});
let lines = 0;
rl.on('line', (line) => {
console.log(line);
lines++;
if (lines >= 2) process.exit(0);
});`,
],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
await proc.exited;
// Give some time for cancellation to happen
await Bun.sleep(100);
expect(cancelled).toBe(true);
expect(chunksEnqueued).toBeGreaterThanOrEqual(2);
    // the reader script exits after 2 lines, so only 2 lines are output
expect(text.trim().split("\n").length).toBe(2);
});
test("ReadableStream error handling", async () => {
const stream = new ReadableStream({
async start(controller) {
controller.enqueue("before error\n");
// Give time for the data to be consumed
await Bun.sleep(10);
controller.error(new Error("Stream error"));
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
// Process should receive data before the error
expect(text).toBe("before error\n");
// Process should exit normally (the stream error happens after data is sent)
expect(await proc.exited).toBe(0);
});
test("ReadableStream with process that exits immediately", async () => {
const stream = new ReadableStream({
start(controller) {
// Enqueue a lot of data
for (let i = 0; i < 1000; i++) {
controller.enqueue(`line ${i}\n`);
}
controller.close();
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.exit(0)"], // exits immediately
stdin: stream,
env: bunEnv,
});
expect(await proc.exited).toBe(0);
// Give time for any pending operations
await Bun.sleep(50);
// The stream might be cancelled since the process exits before reading
// This is implementation-dependent behavior
});
test("ReadableStream with process that fails", async () => {
const stream = new ReadableStream({
async pull(controller) {
await Bun.sleep(0);
controller.enqueue("data for failing process\n");
controller.close();
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.exit(1)"],
stdin: stream,
env: bunEnv,
});
expect(await proc.exited).toBe(1);
});
test("already disturbed ReadableStream throws error", async () => {
const stream = new ReadableStream({
async pull(controller) {
await Bun.sleep(0);
controller.enqueue("data");
controller.close();
},
});
// Disturb the stream by reading from it
const reader = stream.getReader();
await reader.read();
reader.releaseLock();
    expect(() => {
      spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
env: bunEnv,
});
}).toThrow("'stdin' ReadableStream has already been used");
});
test("ReadableStream with abort signal calls cancel", async () => {
const controller = new AbortController();
const cancel = mock();
const stream = new ReadableStream({
start(controller) {
controller.enqueue("data before abort\n");
},
async pull(controller) {
// Keep the stream open
// but don't block the event loop.
await Bun.sleep(1);
controller.enqueue("more data\n");
},
cancel,
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
signal: controller.signal,
env: bunEnv,
});
// Give it some time to start
await Bun.sleep(10);
// Abort the process
controller.abort();
try {
await proc.exited;
} catch (e) {
// Process was aborted
}
// The process should have been killed
expect(proc.killed).toBe(true);
expect(cancel).toHaveBeenCalledTimes(1);
});
test("ReadableStream with backpressure", async () => {
let pullCalls = 0;
const maxChunks = 5;
const stream = new ReadableStream({
async pull(controller) {
pullCalls++;
if (pullCalls <= maxChunks) {
// Add async to prevent optimization to blob
await Bun.sleep(0);
controller.enqueue(`chunk ${pullCalls}\n`);
} else {
controller.close();
}
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
await proc.exited;
// The pull method should have been called multiple times
expect(pullCalls).toBeGreaterThan(1);
expect(pullCalls).toBeLessThanOrEqual(maxChunks + 1); // +1 for the close pull
expect(text).toContain("chunk 1\n");
expect(text).toContain(`chunk ${maxChunks}\n`);
});
test("ReadableStream with multiple processes", async () => {
const stream1 = new ReadableStream({
start(controller) {
controller.enqueue("stream1 data");
controller.close();
},
});
const stream2 = new ReadableStream({
start(controller) {
controller.enqueue("stream2 data");
controller.close();
},
});
await using proc1 = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream1,
stdout: "pipe",
env: bunEnv,
});
await using proc2 = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream2,
stdout: "pipe",
env: bunEnv,
});
const [text1, text2] = await Promise.all([new Response(proc1.stdout).text(), new Response(proc2.stdout).text()]);
expect(text1).toBe("stream1 data");
expect(text2).toBe("stream2 data");
expect(await proc1.exited).toBe(0);
expect(await proc2.exited).toBe(0);
});
test("ReadableStream with empty stream", async () => {
const stream = new ReadableStream({
start(controller) {
// Close immediately without enqueueing anything
controller.close();
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe("");
expect(await proc.exited).toBe(0);
});
test("ReadableStream with null bytes", async () => {
const stream = new ReadableStream({
start(controller) {
controller.enqueue(new Uint8Array([72, 101, 108, 108, 111, 0, 87, 111, 114, 108, 100])); // "Hello\0World"
controller.close();
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
env: bunEnv,
});
const buffer = await new Response(proc.stdout).arrayBuffer();
const bytes = new Uint8Array(buffer);
expect(bytes).toEqual(new Uint8Array([72, 101, 108, 108, 111, 0, 87, 111, 114, 108, 100]));
expect(await proc.exited).toBe(0);
});
test("ReadableStream with transform stream", async () => {
// Create a transform stream that uppercases text
const upperCaseTransform = new TransformStream({
transform(chunk, controller) {
controller.enqueue(chunk.toUpperCase());
},
});
const originalStream = new ReadableStream({
start(controller) {
controller.enqueue("hello ");
controller.enqueue("world");
controller.close();
},
});
const transformedStream = originalStream.pipeThrough(upperCaseTransform);
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: transformedStream,
stdout: "pipe",
env: bunEnv,
});
const text = await new Response(proc.stdout).text();
expect(text).toBe("HELLO WORLD");
expect(await proc.exited).toBe(0);
});
test("ReadableStream with tee", async () => {
const originalStream = new ReadableStream({
start(controller) {
controller.enqueue("shared data");
controller.close();
},
});
const [stream1, stream2] = originalStream.tee();
// Use the first branch for the process
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream1,
stdout: "pipe",
env: bunEnv,
});
// Read from the second branch independently
const text2 = await new Response(stream2).text();
const text1 = await new Response(proc.stdout).text();
expect(text1).toBe("shared data");
expect(text2).toBe("shared data");
expect(await proc.exited).toBe(0);
});
test("ReadableStream object type count", async () => {
const iterations =
isASAN && isCI
? // With ASAN, entire process gets killed, including the test runner in CI. Likely an OOM or out of file descriptors.
10
: 50;
async function main() {
async function iterate(i: number) {
const stream = new ReadableStream({
async pull(controller) {
await Bun.sleep(0);
controller.enqueue(`iteration ${i}`);
controller.close();
},
});
await using proc = spawn({
cmd: [bunExe(), "-e", "process.stdin.pipe(process.stdout)"],
stdin: stream,
stdout: "pipe",
stderr: "inherit",
env: bunEnv,
});
await Promise.all([new Response(proc.stdout).text(), proc.exited]);
}
const promises = Array.from({ length: iterations }, (_, i) => iterate(i));
await Promise.all(promises);
}
await main();
await Bun.sleep(1);
Bun.gc(true);
await Bun.sleep(1);
// Check that we're not leaking objects
await expectMaxObjectTypeCount(expect, "ReadableStream", 10);
await expectMaxObjectTypeCount(expect, "Subprocess", 5);
});
});


@@ -185,6 +185,10 @@ test/js/bun/spawn/job-object-bug.test.ts
test/js/bun/spawn/spawn-empty-arrayBufferOrBlob.test.ts
test/js/bun/spawn/spawn-path.test.ts
test/js/bun/spawn/spawn-stdin-destroy.test.ts
test/js/bun/spawn/spawn-stdin-readable-stream-edge-cases.test.ts
test/js/bun/spawn/spawn-stdin-readable-stream-integration.test.ts
test/js/bun/spawn/spawn-stdin-readable-stream-sync.test.ts
test/js/bun/spawn/spawn-stdin-readable-stream.test.ts
test/js/bun/spawn/spawn-stream-serve.test.ts
test/js/bun/spawn/spawn-streaming-stdout.test.ts
test/js/bun/spawn/spawn-stress.test.ts