Commit Graph

13 Commits

Author SHA1 Message Date
Dylan Conway
b59c77a6e7 feat: add native JSON5 parser (Bun.JSON5) (#26439)
## Summary

- Adds `Bun.JSON5.parse()` and `Bun.JSON5.stringify()` as built-in APIs
- Adds `.json5` file support in the module resolver and bundler
- Parser uses a scanner/parser split architecture with a labeled switch
pattern (like the YAML parser) — the scanner produces typed tokens, the
parser never touches source bytes directly
- 430+ tests covering the official JSON5 test suite, escape sequences,
numbers, comments, whitespace (including all Unicode whitespace types),
unquoted/reserved-word keys, unicode identifiers, deeply nested
structures, garbage input, error messages, and stringify behavior

<img width="659" height="610" alt="Screenshot 2026-01-25 at 12 19 57 AM"
src="https://github.com/user-attachments/assets/e300125a-f197-4cad-90ed-e867b6232a01"
/>

## Test plan

- [x] `bun bd test test/js/bun/json5/json5.test.ts` — 317 tests
- [x] `bun bd test test/js/bun/json5/json5-test-suite.test.ts` — 113
tests from the official JSON5 test suite
- [x] `bun bd test test/js/bun/resolve/json5/json5.test.js` — .json5
module resolution

closes #3175

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-26 10:52:35 -08:00
robobun
b135c207ed fix(yaml): remove YAML 1.1 legacy boolean values for YAML 1.2 compliance (#25537)
## Summary

- Remove YAML 1.1 legacy boolean values (`yes/no/on/off/y/Y`) that are
not part of the YAML 1.2 Core Schema
- Keep YAML 1.2 Core Schema compliant values: `true/True/TRUE`,
`false/False/FALSE`, `null/Null/NULL`, `0x` hex, `0o` octal
- Add comprehensive roundtrip tests for YAML 1.2 compliance

**Removed (now parsed as strings):**
- `yes`, `Yes`, `YES` (were `true`)
- `no`, `No`, `NO` (were `false`)
- `on`, `On`, `ON` (were `true`)
- `off`, `Off`, `OFF` (were `false`)
- `y`, `Y` (were `true`)

This fixes a common pain point where GitHub Actions workflow files with
`on:` keys would have the key parsed as boolean `true` instead of the
string `"on"`.

## YAML 1.2 Core Schema Specification

From [YAML 1.2.2 Section 10.3.2 Tag
Resolution](https://yaml.org/spec/1.2.2/#1032-tag-resolution):

| Regular expression | Resolved to tag |
|-------------------|-----------------|
| `null \| Null \| NULL \| ~` | tag:yaml.org,2002:null |
| `/* Empty */` | tag:yaml.org,2002:null |
| `true \| True \| TRUE \| false \| False \| FALSE` |
tag:yaml.org,2002:bool |
| `[-+]? [0-9]+` | tag:yaml.org,2002:int (Base 10) |
| `0o [0-7]+` | tag:yaml.org,2002:int (Base 8) |
| `0x [0-9a-fA-F]+` | tag:yaml.org,2002:int (Base 16) |
| `[-+]? ( \. [0-9]+ \| [0-9]+ ( \. [0-9]* )? ) ( [eE] [-+]? [0-9]+ )?`
| tag:yaml.org,2002:float |
| `[-+]? ( \.inf \| \.Inf \| \.INF )` | tag:yaml.org,2002:float
(Infinity) |
| `\.nan \| \.NaN \| \.NAN` | tag:yaml.org,2002:float (Not a number) |

Note: `yes`, `no`, `on`, `off`, `y`, `n` are **not** in the YAML 1.2
Core Schema boolean list. These were removed from YAML 1.1 as noted in
[YAML 1.2 Section 1.2](https://yaml.org/spec/1.2.2/#12-yaml-history):

> The YAML 1.2 specification was published in 2009. Its primary focus
was making YAML a strict superset of JSON. **It also removed many of the
problematic implicit typing recommendations.**

## Test plan

- [x] Updated existing YAML tests to reflect YAML 1.2 Core Schema
behavior
- [x] Added roundtrip tests (stringify → parse) for YAML 1.2 compliance
- [x] Verified tests fail with system Bun (YAML 1.1 behavior) and pass
with debug build (YAML 1.2)
- [x] Run `bun bd test test/js/bun/yaml/yaml.test.ts`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
2025-12-16 14:29:39 -08:00
robobun
a2d8b75962 fix(yaml): quote strings ending with colons (#25443)
## Summary
- Fixes strings ending with colons (e.g., `"tin:"`) not being quoted in
YAML.stringify output
- This caused YAML.parse to fail with "Unexpected token" when parsing
the output back

## Test plan
- Added regression tests in `test/regression/issue/25439.test.ts`
- Verified round-trip works for various strings ending with colons
- Ran existing YAML tests to ensure no regressions

Fixes #25439

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
2025-12-09 18:20:26 -08:00
Dylan Conway
0480d55a67 fix(YAML): handle exponential merge keys (#24729)
### What does this PR do?
Fixes ENG-21490
### How did you verify your code works?
Added a test that would previously fail due to timeout. It also confirms
the parsed result is correct.

---------

Co-authored-by: taylor.fish <contact@taylor.fish>
2025-11-19 21:20:55 -08:00
Dylan Conway
5908bfbfc6 fix(YAML.stringify): number-like strings prefixed with 0 (#24731)
### What does this PR do?
Ensures strings that would parse as a number with leading zeroes aren't
emitted without quotes.

fixes #23691

### How did you verify your code works?
Added a test
2025-11-14 17:43:36 -08:00
robobun
ebc0cfeacd fix(yaml): double-quoted strings with '...' incorrectly trigger document end error (#23491)
### What does this PR do?

Fixes #23489

The YAML parser was incorrectly treating `...` inside double-quoted
strings as document end markers, causing parse errors for strings
containing ellipsis, particularly affecting internationalized text.

### Example of the bug:
```yaml
balance: "👛 لا تمتلك محفظة... !"
```

This would fail with: `error: Unexpected document end`

### Root cause:

The bug was introduced in commit fcbd57ac48 which attempted to optimize
document marker detection by using `self.line_indent == .none` instead
of tracking newlines with a local flag. However, this check was
incomplete - it didn't track whether we had just processed a newline
character.

### The fix:

Restored the `nl` (newline) flag pattern from the single-quoted scanner
and combined it with the `line_indent` check. Document markers `...` and
`---` are now only recognized when **all** of these conditions are met:

1. We're after a newline (`nl == true`)
2. We're at column 0 (`self.line_indent == .none`)
3. Followed by whitespace or EOF

This allows `...` to appear freely in double-quoted strings while still
correctly recognizing actual document end markers at the start of lines.

### How did you verify your code works?

1. Reproduced the original issue from #23489
2. Applied the fix and verified all test cases pass:
   - Original Arabic text with emoji: `"👛 لا تمتلك محفظة... !"`
   - Various `...` positions: start, middle, end
   - Both single and double quotes
   - Multiline strings with indented `...` (issue #22392)
3. Created regression test in `test/regression/issue/23489.test.ts`
4. Verified existing YAML tests still pass (514 pass, up from 513)

cc @dylan-conway for review

---------

Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com>
2025-10-20 14:19:22 -07:00
Dylan Conway
01924e9993 fix(YAML.stringify): check for more indicators at the beginning of strings (#23516)
### What does this PR do?
Makes sure strings are doubled quoted when they start with flow
indicators and `:`.

Fixes #23502
### How did you verify your code works?
Added tests for each indicator in flow and block context
2025-10-11 20:51:22 -07:00
Dylan Conway
dd08a707e2 update yaml-test-suite test generator script (#23277)
### What does this PR do?
Adds `expect().toBe()` checks for anchors/aliases. Also adds git commit
the tests were translated from.
### How did you verify your code works?
Manually
2025-10-05 18:58:26 -07:00
Dylan Conway
fcbd57ac48 Bring Bun.YAML to 90% passing yaml-test-suite (#23265)
### What does this PR do?
Fixes bugs in the parser bringing it to 90% passing the official
[yaml-test-suite](https://github.com/yaml/yaml-test-suite) (362/400
passing tests)

Still missing from our parser: |- and |+ (about 5%), and cyclic
references.

Translates the yaml-test-suite to our tests.

fixes #22659
fixes #22392
fixes #22286
### How did you verify your code works?
Added tests for yaml-test-suite and each of the linked issues

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-05 17:23:59 -07:00
robobun
fcd628424a Fix YAML.parse to throw SyntaxError instead of BuildMessage (#22924)
YAML.parse now throws SyntaxError for invalid syntax matching JSON.parse
behavior

---------

Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com>
2025-09-24 16:29:05 -07:00
Dylan Conway
fcaff77ed7 Implement Bun.YAML.stringify (#22183)
### What does this PR do?
This PR adds `Bun.YAML.stringify`. The stringifier will double quote
strings only when necessary (looks for keywords, numbers, or containing
non-printable or escaped characters). Anchors and aliases are detected
by object equality, and anchor name is chosen from property name, array
item, or the root collection.
```js
import { YAML } from "bun"

YAML.stringify(null) // null
YAML.stringify("hello YAML"); // "hello YAML"
YAML.stringify("123.456"); // "\"123.456\""

// anchors and aliases
const userInfo = { name: "bun" };
const obj = { user1: { userInfo }, user2: { userInfo } };
YAML.stringify(obj, null, 2);
// # output
// user1: 
//   userInfo: 
//     &userInfo
//     name: bun
// user2: 
//   userInfo: 
//     *userInfo

// will handle cycles
const obj = {};
obj.cycle = obj;
YAML.stringify(obj, null, 2);
// # output
// &root
// cycle:
//   *root

// default no space
const obj = { one: { two: "three" } };
YAML.stringify(obj);
// # output
// {one: {two: three}}
```

### How did you verify your code works?
Added tests for basic use and edgecases

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- New Features
- Added YAML.stringify to the YAML API, producing YAML from JavaScript
values with quoting, anchors, and indentation support.

- Improvements
- YAML.parse now accepts a wider range of inputs, including Buffer,
ArrayBuffer, TypedArrays, DataView, Blob/File, and SharedArrayBuffer,
with better error propagation and stack protection.

- Tests
- Extensive new tests for YAML.parse and YAML.stringify across data
types, edge cases, anchors/aliases, deep nesting, and round-trip
scenarios.

- Chores
- Added a YAML stringify benchmark script covering multiple libraries
and data shapes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-31 18:27:51 -07:00
Dylan Conway
a7586212eb fix(yaml): parsing strings that look like numbers (#22102)
### What does this PR do?
fixes parsing strings like `"1e18495d9d7f6b41135e5ee828ef538dc94f9be4"`

### How did you verify your code works?
added a test.
2025-08-24 14:06:39 -07:00
Dylan Conway
8fad98ffdb Add Bun.YAML.parse and YAML imports (#22073)
### What does this PR do?
This PR adds builtin YAML parsing with `Bun.YAML.parse`
```js
import { YAML } from "bun";
const items = YAML.parse("- item1");
console.log(items); // [ "item1" ]
```

Also YAML imports work just like JSON and TOML imports
```js
import pkg from "./package.yaml"
console.log({ pkg }); // { pkg: { name: "pkg", version: "1.1.1" } }
```
### How did you verify your code works?
Added some tests for YAML imports and parsed values.

---------

Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
2025-08-23 06:55:30 -07:00