12 Commits

Author SHA1 Message Date
Jarred Sumner
1bfe5c6b37 feat(md): Zig markdown parser with Bun.markdown API (#26440)
## Summary

- Port md4c (CommonMark-compliant markdown parser) from C to Zig under
`src/md/`
- Three output modes:
  - `Bun.markdown.html(input, options?)` — render to HTML string
- `Bun.markdown.render(input, callbacks?)` — render with custom
callbacks for each element
- `Bun.markdown.react(input, options?)` — render to a React Fragment
element, directly usable as a component return value
- React element creation uses a cached JSC Structure with
`putDirectOffset` for fast allocation
- Component overrides in `react()`: pass tag names as options keys to
replace default HTML elements with custom components
- GFM extensions: tables, strikethrough, task lists, permissive
autolinks, disallowed raw HTML tag filter
- Wire up `.md` as a bundler loader (via explicit `{ type: "md" }`)

## JavaScript API

### `Bun.markdown.html(input, options?)`

Renders markdown to an HTML string:

```js
const html = Bun.markdown.html("# Hello **world**");
// "<h1>Hello <strong>world</strong></h1>\n"

Bun.markdown.html("## Hello", { headingIds: true });
// '<h2 id="hello">Hello</h2>\n'
```

### `Bun.markdown.render(input, callbacks?)`

Renders markdown with custom JavaScript callbacks for each element. Each
callback receives children as a string and optional metadata, and
returns a string:

```js
// Custom HTML with classes
const html = Bun.markdown.render("# Title\n\nHello **world**", {
  heading: (children, { level }) => `<h${level} class="title">${children}</h${level}>`,
  paragraph: (children) => `<p>${children}</p>`,
  strong: (children) => `<b>${children}</b>`,
});

// ANSI terminal output
const ansi = Bun.markdown.render("# Hello\n\n**bold**", {
  heading: (children) => `\x1b[1;4m${children}\x1b[0m\n`,
  paragraph: (children) => children + "\n",
  strong: (children) => `\x1b[1m${children}\x1b[22m`,
});

// Strip all formatting
const text = Bun.markdown.render("# Hello **world**", {
  heading: (children) => children,
  paragraph: (children) => children,
  strong: (children) => children,
});
// "Hello world"

// Return null to omit elements
const result = Bun.markdown.render("# Title\n\n![logo](img.png)\n\nHello", {
  image: () => null,
  heading: (children) => children,
  paragraph: (children) => children + "\n",
});
// "Title\nHello\n"
```

Parser options can be included alongside callbacks:

```js
Bun.markdown.render("Visit www.example.com", {
  link: (children, { href }) => `[${children}](${href})`,
  paragraph: (children) => children,
  permissiveAutolinks: true,
});
```

### `Bun.markdown.react(input, options?)`

Returns a React Fragment element — use it directly as a component return
value:

```tsx
// Use as a component
function Markdown({ text }: { text: string }) {
  return Bun.markdown.react(text);
}

// With custom components
function Heading({ children }: { children: React.ReactNode }) {
  return <h1 className="title">{children}</h1>;
}
const element = Bun.markdown.react("# Hello", { h1: Heading });

// Server-side rendering
import { renderToString } from "react-dom/server";
const html = renderToString(Bun.markdown.react("# Hello **world**"));
// "<h1>Hello <strong>world</strong></h1>"
```

#### React 18 and older

By default, `react()` uses `Symbol.for('react.transitional.element')` as
the `$$typeof` symbol, which is what React 19 expects. For React 18 and
older, pass `reactVersion: 18`:

```tsx
const el = Bun.markdown.react("# Hello", { reactVersion: 18 });
```

### Component Overrides

Tag names can be overridden in `react()`:

```tsx
Bun.markdown.react(input, {
  h1: MyHeading,      // block elements
  p: CustomParagraph,
  a: CustomLink,      // inline elements
  img: CustomImage,
  pre: CodeBlock,
  // ... h1-h6, p, blockquote, ul, ol, li, pre, hr, html,
  //     table, thead, tbody, tr, th, td,
  //     em, strong, a, img, code, del, math, u, br
});
```

Boolean values are ignored (not treated as overrides), so parser options
like `{ strikethrough: true }` don't conflict with component overrides.

### Options

```js
Bun.markdown.html(input, {
  tables: true,              // GFM tables (default: true)
  strikethrough: true,       // ~~deleted~~ (default: true)
  tasklists: true,           // - [x] items (default: true)
  headingIds: true,          // Generate id attributes on headings
  autolinkHeadings: true,    // Wrap heading content in <a> tags
  tagFilter: false,          // GFM disallowed HTML tags
  wikiLinks: false,          // [[wiki]] links
  latexMath: false,          // $inline$ and $$display$$
  underline: false,          // __underline__ (instead of <strong>)
  // ... and more
});
```

## Architecture

### Parser (`src/md/`)

The parser is split into focused modules using Zig's delegation pattern:

| Module | Purpose |
|--------|---------|
| `parser.zig` | Core `Parser` struct, state, and re-exported method
delegation |
| `blocks.zig` | Block-level parsing: document processing, line
analysis, block start/end |
| `containers.zig` | Container management: blockquotes, lists, list
items |
| `inlines.zig` | Inline parsing: emphasis, code spans, HTML tags,
entities |
| `links.zig` | Link/image resolution, reference links, autolink
rendering |
| `autolinks.zig` | Permissive autolink detection (www, url, email) |
| `line_analysis.zig` | Line classification: headings, fences, HTML
blocks, tables |
| `ref_defs.zig` | Reference definition parsing and lookup |
| `render_blocks.zig` | Block rendering dispatch (code, HTML, table
blocks) |
| `html_renderer.zig` | HTML renderer implementing `Renderer` VTable |
| `types.zig` | Shared types: `Renderer` VTable, `BlockType`,
`SpanType`, `TextType`, etc. |

### Renderer Abstraction

Parsing is decoupled from output via a `Renderer` VTable interface:

```zig
pub const Renderer = struct {
    ptr: *anyopaque,
    vtable: *const VTable,

    pub const VTable = struct {
        enterBlock: *const fn (...) void,
        leaveBlock: *const fn (...) void,
        enterSpan:  *const fn (...) void,
        leaveSpan:  *const fn (...) void,
        text:       *const fn (...) void,
    };
};
```

Four renderers are implemented:
- **`HtmlRenderer`** (`src/md/html_renderer.zig`) — produces HTML string
output
- **`JsCallbackRenderer`** (`src/bun.js/api/MarkdownObject.zig`) — calls
JS callbacks for each element, accumulates string output
- **`ParseRenderer`** (`src/bun.js/api/MarkdownObject.zig`) — builds
React element AST with `MarkedArgumentBuffer` for GC safety
- **`JSReactElement`** (`src/bun.js/bindings/JSReactElement.cpp`) — C++
fast path for React element creation using cached JSC Structure +
`putDirectOffset`

## Test plan

- [x] 792 spec tests pass (CommonMark, GFM tables, strikethrough,
tasklists, permissive autolinks, GFM tag filter, wiki links, coverage,
regressions)
- [x] 114 API tests pass (`html()`, `render()`, `react()`,
`renderToString` integration, component overrides)
- [x] 58 GFM compatibility tests pass

```
bun bd test test/js/bun/md/md-spec.test.ts       # 792 pass
bun bd test test/js/bun/md/md-render-api.test.ts  # 114 pass
bun bd test test/js/bun/md/gfm-compat.test.ts     # 58 pass
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com>
Co-authored-by: SUZUKI Sosuke <sosuke@bun.com>
Co-authored-by: robobun <robobun@oven.sh>
Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Kirill Markelov <kerusha.chubko@gmail.com>
Co-authored-by: Ciro Spaciari <ciro.spaciari@gmail.com>
Co-authored-by: Alistair Smith <hi@alistair.sh>
2026-01-28 20:24:02 -08:00
Dylan Conway
b59c77a6e7 feat: add native JSON5 parser (Bun.JSON5) (#26439)
## Summary

- Adds `Bun.JSON5.parse()` and `Bun.JSON5.stringify()` as built-in APIs
- Adds `.json5` file support in the module resolver and bundler
- Parser uses a scanner/parser split architecture with a labeled switch
pattern (like the YAML parser) — the scanner produces typed tokens, the
parser never touches source bytes directly
- 430+ tests covering the official JSON5 test suite, escape sequences,
numbers, comments, whitespace (including all Unicode whitespace types),
unquoted/reserved-word keys, unicode identifiers, deeply nested
structures, garbage input, error messages, and stringify behavior

<img width="659" height="610" alt="Screenshot 2026-01-25 at 12 19 57 AM"
src="https://github.com/user-attachments/assets/e300125a-f197-4cad-90ed-e867b6232a01"
/>

## Test plan

- [x] `bun bd test test/js/bun/json5/json5.test.ts` — 317 tests
- [x] `bun bd test test/js/bun/json5/json5-test-suite.test.ts` — 113
tests from the official JSON5 test suite
- [x] `bun bd test test/js/bun/resolve/json5/json5.test.js` — .json5
module resolution

closes #3175

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-26 10:52:35 -08:00
Jarred Sumner
e8b2455f11 feat: add Bun.JSONL.parse() for streaming newline-delimited JSON parsing (#26356)
Adds a built-in JSONL parser implemented in C++ using JavaScriptCore's
optimized JSON parser.

## API

### `Bun.JSONL.parse(input)`
Parse a complete JSONL string or `Uint8Array` and return an array of all
parsed values. Throws on invalid input.

```ts
const results = Bun.JSONL.parse('{"a":1}\n{"b":2}\n');
// [{ a: 1 }, { b: 2 }]
```

### `Bun.JSONL.parseChunk(input, start?, end?)`
Parse as many complete values as possible, returning `{ values, read,
done, error }`. Designed for streaming use cases where input arrives
incrementally.

```ts
const result = Bun.JSONL.parseChunk('{"id":1}\n{"id":2}\n{"id":3');
result.values; // [{ id: 1 }, { id: 2 }]
result.read;   // 17
result.done;   // false
result.error;  // null
```

## Implementation Details

- C++ implementation in `BunObject.cpp` using JSC's `streamingJSONParse`
- ASCII fast path: zero-copy `StringView` for pure ASCII input
- Non-ASCII: uses `fromUTF8ReplacingInvalidSequences` with
`utf16_length_from_utf8` size check to prevent overflow
- UTF-8 BOM automatically skipped for `Uint8Array` input
- Pre-built `Structure` with fixed property offsets for fast result
object creation
- `Symbol.toStringTag = "JSONL"` on the namespace object
- `parseChunk` returns errors in `error` property instead of throwing,
preserving partial results
- Comprehensive boundary checks on start/end parameters

## Tests

234 tests covering:
- Complete and partial/streaming input scenarios
- Error handling and recovery
- UTF-8 multi-byte characters and BOM handling
- start/end boundary security (exhaustive combinations, clamping, OOB
prevention)
- 4 GB input rejection (both ASCII and non-ASCII paths)
- Edge cases (empty input, single values, whitespace, special numbers)

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-23 00:23:25 -08:00
robobun
7076a49bb1 feat(archive): add TypeScript types, docs, and files() benchmark (#25922)
## Summary

- Add comprehensive TypeScript type definitions for `Bun.Archive` in
`bun.d.ts`
  - `ArchiveInput` and `ArchiveCompression` types
- Full JSDoc documentation with examples for all methods (`from`,
`write`, `extract`, `blob`, `bytes`, `files`)
- Add documentation page at `docs/runtime/archive.mdx`
  - Quickstart examples
  - Creating and extracting archives
  - `files()` method with glob filtering
  - Compression support
  - Full API reference section
- Add Archive to docs sidebar under "Data & Storage"
- Add `files()` benchmark comparing `Bun.Archive.files()` vs node-tar
- Shows ~7x speedup for reading archive contents into memory (59µs vs
434µs)

## Test plan

- [x] TypeScript types compile correctly
- [x] Documentation renders properly in Mintlify format
- [x] Benchmark runs successfully and shows performance comparison
- [x] Verified `files()` method works correctly with both Bun.Archive
and node-tar

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
2026-01-09 19:00:19 -08:00
Lydia Hallie
dce7a02f4d Docs: Minor fixes and improvements (#25284)
This PR addresses several issues opened for the docs:

- Add callout for SQLite caching behavior between prepare() and query()
- Fix SQLite types and fix deprecated exec to run
- Fix Secrets API example
- Update SolidStart guide
- Add bun upgrade guide
- Prefer `process.versions.bun` over `typeof Bun` for detection
- Document complete `bunx` flags
- Improve Nitro preset documentation for Nuxt

Fixes #23165, #24424, #24294, #25175, #18433, #16804, #22967, #22527,
#10560, #14744
2025-12-01 13:32:08 -08:00
Michael H
4450d738fa docs: more consistency + minor updates (#24764)
Co-authored-by: RiskyMH <git@riskymh.dev>
2025-11-21 14:06:19 -08:00
Michael H
87eca6bbc7 docs: re-apply many recent changes that somehow aren't present (#24719)
lots of recent changes aren't present, so this reapplies them
2025-11-16 19:23:01 +11:00
Caio Borghi
fff47f0267 docs: update EdgeDB references to Gel rebrand (#24487)
## Summary
EdgeDB has rebranded to Gel. This PR comprehensively updates all
documentation to reflect the rebrand.

## Changes Made

### Documentation & Branding
- **Guide title**: "Use EdgeDB with Bun" → "Use Gel with Bun"
- **File renamed**: `docs/guides/ecosystem/edgedb.mdx` → `gel.mdx`
- **Description**: Added "(formerly EdgeDB)" note
- **All path references**: Updated from `/guides/ecosystem/edgedb` to
`/guides/ecosystem/gel`

### CLI Commands
- `edgedb project init` → `gel project init`
- `edgedb` → `gel` (REPL)
- `edgedb migration create` → `gel migration create`
- `edgedb migrate` → `gel migrate`

### npm Packages
- `edgedb` → `gel`
- `@edgedb/generate` → `@gel/generate`

### Installation & Documentation URLs
- Installation link: `docs.geldata.com/learn/installation` (functional)
- Documentation reference: `docs.geldata.com/` (operational)
- Installation scripts: Verified working (`https://www.geldata.com/sh`
and `ps1`)
- Added Homebrew option: `brew install geldata/tap/gel-cli`

### Code Examples
- Updated all imports: `import { createClient } from "gel"`
- Updated codegen commands: `bunx @gel/generate`

## Verified
All commands verified against official Gel documentation at
https://docs.geldata.com/

Fixes #17721

---------

Co-authored-by: Lydia Hallie <lydiajuliettehallie@gmail.com>
2025-11-12 14:18:59 -08:00
Lydia Hallie
6b70b71895 Add TanStack Start guide (#24572)
Adds a guide on how to build and deploy TanStack Start with Bun
2025-11-10 17:38:48 -08:00
Lydia Hallie
35a57ff008 Add Upstash guide with Bun Redis (#24493) 2025-11-07 18:32:19 -08:00
Lydia Hallie
1896c75d78 Add deploy guides for AWS Lambda, Google Run, DigitalOcean (#24414)
Adds deployment guides for Bun apps on AWS Lambda, Google Cloud Run, and
DigitalOcean using a custom `Dockerfile`

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-11-07 15:03:36 -08:00
Lydia Hallie
1606a9f24e Replace old docs with new docs repo (#24201) 2025-11-05 11:14:21 -08:00