7 Commits

Author SHA1 Message Date
Jarred Sumner
cd4d98338c some benchmarks 2026-01-29 12:38:05 -08:00
SUZUKI Sosuke
44df912d37 Add Bun.wrapAnsi() for text wrapping with ANSI escape code preservation (#26061)
## Summary

Adds `Bun.wrapAnsi()`, a native implementation of the popular
[wrap-ansi](https://www.npmjs.com/package/wrap-ansi) npm package for
wrapping text with ANSI escape codes.

## API

```typescript
Bun.wrapAnsi(string: string, columns: number, options?: WrapAnsiOptions): string

interface WrapAnsiOptions {
  hard?: boolean;              // default: false - Break words longer than columns
  wordWrap?: boolean;          // default: true - Wrap at word boundaries
  trim?: boolean;              // default: true - Trim leading/trailing whitespace
  ambiguousIsNarrow?: boolean; // default: true - Treat ambiguous-width chars as narrow
}
```

## Features

- Wraps text to fit within specified column width
- Preserves ANSI escape codes (SGR colors/styles)
- Supports OSC 8 hyperlinks
- Respects Unicode display widths (full-width characters, emoji)
- Normalizes `\r\n` to `\n`

## Implementation Details

The implementation closes and reopens ANSI codes around line breaks for
robust terminal compatibility. This differs slightly from the npm
package in edge cases but produces visually equivalent output.

### Behavioral Differences from npm wrap-ansi

1. **ANSI code preservation**: Bun always maintains complete ANSI escape
sequences. The npm version can output malformed codes (missing ESC
character) in certain edge cases with `wordWrap: false, trim: false`.

2. **Newline ANSI handling**: Bun closes and reopens ANSI codes around
newlines for robustness. The npm version sometimes keeps them spanning
across newlines. The visual output is equivalent.

## Tests

- 27 custom tests covering basic functionality, ANSI codes, Unicode, and
options
- 23 tests ported from the npm package (MIT licensed, credited in file
header)
- All 50 tests pass

## Benchmark

<!-- Benchmark results will be added -->
```
$ cd /Users/sosuke/code/bun/bench && ../build/release/bun snippets/wrap-ansi.js
clk: ~3.82 GHz
cpu: Apple M4 Max
runtime: bun 1.3.7 (arm64-darwin)

benchmark                    avg (min … max) p75   p99    (min … top 1%)
-------------------------------------------- -------------------------------
Short text (45 chars) - npm    25.81 µs/iter  21.71 µs  █
                      (16.79 µs … 447.38 µs) 110.96 µs ▆█▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
Short text (45 chars) - Bun   685.55 ns/iter 667.00 ns    █
                       (459.00 ns … 2.16 ms)   1.42 µs ▁▁▁█▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁

summary
  Short text (45 chars) - Bun
   37.65x faster than Short text (45 chars) - npm

-------------------------------------------- -------------------------------
Medium text (810 chars) - npm 568.12 µs/iter 578.00 µs  ▄▅█▆▆▃
                     (525.25 µs … 944.71 µs) 700.75 µs ▄██████▆▅▄▃▃▂▂▂▁▁▁▁▁▁
Medium text (810 chars) - Bun  11.22 µs/iter  11.28 µs                     █
                       (11.04 µs … 11.46 µs)  11.33 µs █▁▁▁██▁█▁▁▁▁█▁█▁▁█▁▁█

summary
  Medium text (810 chars) - Bun
   50.62x faster than Medium text (810 chars) - npm

-------------------------------------------- -------------------------------
Long text (8100 chars) - npm    7.66 ms/iter   7.76 ms     ▂▂▅█   ▅
                         (7.31 ms … 8.10 ms)   8.06 ms ▃▃▄▃█████▇▇███▃▆▆▆▄▁▃
Long text (8100 chars) - Bun  112.14 µs/iter 113.50 µs        █
                     (102.50 µs … 146.04 µs) 124.92 µs ▁▁▁▁▁▁██▇▅█▃▂▂▂▂▁▁▁▁▁

summary
  Long text (8100 chars) - Bun
   68.27x faster than Long text (8100 chars) - npm

-------------------------------------------- -------------------------------
Colored short - npm            28.46 µs/iter  28.56 µs              █
                       (27.90 µs … 29.34 µs)  28.93 µs ▆▁▆▁▁▆▁▁▆▆▆▁▆█▁▁▁▁▁▁▆
Colored short - Bun           861.64 ns/iter 867.54 ns         ▂  ▇█▄▂
                     (839.68 ns … 891.12 ns) 882.04 ns ▃▅▄▅▆▆▇▆██▇████▆▃▅▅▅▂

summary
  Colored short - Bun
   33.03x faster than Colored short - npm

-------------------------------------------- -------------------------------
Colored medium - npm          557.84 µs/iter 562.63 µs      ▂▃█▄
                     (508.08 µs … 911.92 µs) 637.96 µs ▁▁▁▂▄█████▅▂▂▁▁▁▁▁▁▁▁
Colored medium - Bun           14.91 µs/iter  14.94 µs ██  ████ ██ █      ██
                       (14.77 µs … 15.17 µs)  15.06 µs ██▁▁████▁██▁█▁▁▁▁▁▁██

summary
  Colored medium - Bun
   37.41x faster than Colored medium - npm

-------------------------------------------- -------------------------------
Colored long - npm              7.84 ms/iter   7.90 ms       █  ▅
                         (7.53 ms … 8.38 ms)   8.19 ms ▂▂▂▄▃▆██▇██▇▃▂▃▃▃▄▆▂▂
Colored long - Bun            176.73 µs/iter 175.42 µs       █
                       (162.50 µs … 1.37 ms) 204.46 µs ▁▁▂▄▇██▅▂▂▂▁▁▁▁▁▁▁▁▁▁

summary
  Colored long - Bun
   44.37x faster than Colored long - npm

-------------------------------------------- -------------------------------
Hard wrap long - npm            8.05 ms/iter   8.12 ms       ▃ ▇█
                         (7.67 ms … 8.53 ms)   8.50 ms ▄▁▁▁▃▄█████▄▃▂▆▄▃▂▂▂▂
Hard wrap long - Bun          111.85 µs/iter 112.33 µs         ▇█
                     (101.42 µs … 145.42 µs) 123.88 µs ▁▁▁▁▁▁▁████▄▃▂▂▂▁▁▁▁▁

summary
  Hard wrap long - Bun
   72.01x faster than Hard wrap long - npm

-------------------------------------------- -------------------------------
Hard wrap colored - npm         8.82 ms/iter   8.92 ms   ▆ ██
                         (8.55 ms … 9.47 ms)   9.32 ms ▆▆████▆▆▄▆█▄▆▄▄▁▃▁▃▄▃
Hard wrap colored - Bun       174.38 µs/iter 175.54 µs   █ ▂
                     (165.75 µs … 210.25 µs) 199.50 µs ▁▃█▆███▃▂▃▂▂▂▂▂▁▁▁▁▁▁

summary
  Hard wrap colored - Bun
   50.56x faster than Hard wrap colored - npm

-------------------------------------------- -------------------------------
Japanese (full-width) - npm    51.00 µs/iter  52.67 µs    █▂   █▄
                      (40.71 µs … 344.88 µs)  66.13 µs ▁▁▃██▄▃▅██▇▄▃▄▃▂▂▁▁▁▁
Japanese (full-width) - Bun     7.46 µs/iter   7.46 µs       █
                        (6.50 µs … 34.92 µs)   9.38 µs ▁▁▁▁▁██▆▂▁▂▁▁▁▁▁▁▁▁▁▁

summary
  Japanese (full-width) - Bun
   6.84x faster than Japanese (full-width) - npm

-------------------------------------------- -------------------------------
Emoji text - npm              173.63 µs/iter 222.17 µs   █
                     (129.42 µs … 527.25 µs) 249.58 µs ▁▃█▆▃▃▃▁▁▁▁▁▁▁▂▄▆▄▂▂▁
Emoji text - Bun                9.42 µs/iter   9.47 µs           ██
                         (9.32 µs … 9.52 µs)   9.50 µs █▁▁███▁▁█▁██▁▁▁▁██▁▁█

summary
  Emoji text - Bun
   18.44x faster than Emoji text - npm

-------------------------------------------- -------------------------------
Hyperlink (OSC 8) - npm       208.00 µs/iter 254.25 µs   █
                     (169.58 µs … 542.17 µs) 281.00 µs ▁▇█▃▃▂▂▂▁▁▁▁▁▁▁▃▃▅▃▂▁
Hyperlink (OSC 8) - Bun         6.00 µs/iter   6.06 µs      █           ▄
                         (5.88 µs … 6.11 µs)   6.10 µs ▅▅▅▁▅█▅▁▅▁█▁▁▅▅▅▅█▅▁█

summary
  Hyperlink (OSC 8) - Bun
   34.69x faster than Hyperlink (OSC 8) - npm

-------------------------------------------- -------------------------------
No trim long - npm              8.32 ms/iter   8.38 ms  █▇
                        (7.61 ms … 13.67 ms)  11.74 ms ▃████▄▂▃▂▂▃▁▁▁▁▁▁▁▁▁▂
No trim long - Bun             93.92 µs/iter  94.42 µs           █▂
                      (82.75 µs … 162.38 µs) 103.83 µs ▁▁▁▁▁▁▁▁▄███▄▃▂▂▁▁▁▁▁

summary
  No trim long - Bun
   88.62x faster than No trim long - npm
```

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-16 16:12:23 -08:00
robobun
70fa6af355 feat: add Bun.Archive API for creating and extracting tarballs (#25665)
## Summary

- Adds new `Bun.Archive` API for working with tar archives
- `Bun.Archive.from(data)` - Create archive from object, Blob,
TypedArray, or ArrayBuffer
- `Bun.Archive.write(path, data, compress?)` - Write archive to disk
(async)
- `archive.extract(path)` - Extract to directory, returns
`Promise<number>` (file count)
- `archive.blob(compress?)` - Get archive as Blob (async)
- `archive.bytes(compress?)` - Get archive as Uint8Array (async)

Key implementation details:
- Uses existing libarchive bindings for tarball creation/extraction via
`extractToDisk`
- Uses libdeflate for gzip compression
- Immediate byte copying for GC safety (no JSValue protection, no
`hasPendingActivity`)
- Async operations run on worker pool threads with proper VM reference
handling
- Growing memory buffer via `archive_write_open2` callbacks for
efficient tarball creation

## Test plan

- [x] 65 comprehensive tests covering:
  - Normal operations (create, extract, blob, bytes, write)
  - GC safety (unreferenced archives, mutation isolation)  
  - Error handling (invalid args, corrupted data, I/O errors)
- Edge cases (large files, many files, special characters, path
normalization)
  - Concurrent operations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com>
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
2026-01-09 00:33:35 -08:00
Jarred Sumner
98cee5a57e Improve Bun.stringWidth accuracy and robustness (#25447)
This PR significantly improves `Bun.stringWidth` to handle a wider
variety of Unicode characters and escape sequences correctly.

## Zero-width character handling

Added support for many previously unhandled zero-width characters:
- Soft hyphen (U+00AD)
- Word joiner and invisible operators (U+2060-U+2064)
- Lone surrogates (U+D800-U+DFFF)
- Arabic formatting characters (U+0600-U+0605, U+06DD, U+070F, U+08E2)
- Indic script combining marks (Devanagari through Malayalam)
- Thai and Lao combining marks
- Combining Diacritical Marks Extended and Supplement
- Tag characters (U+E0000-U+E007F)

## ANSI escape sequence handling

### CSI sequences
- Now properly handles ALL CSI final bytes (0x40-0x7E), not just `m`
- This means cursor movement (A/B/C/D), erase (J/K), scroll (S/T), and
other CSI commands are now correctly excluded from width calculation

### OSC sequences
- Added support for OSC sequences (ESC ] ... BEL/ST)
- OSC 8 hyperlinks are now properly handled
- Supports both BEL (0x07) and ST (ESC \) terminators

### ESC ESC fix
- Fixed state machine bug where `ESC ESC` would incorrectly reset state
- Now correctly handles consecutive ESC characters

## Emoji handling

Added proper grapheme-aware emoji width calculation:
- Flag emoji (regional indicator pairs) → width 2
- Skin tone modifiers → width 2
- ZWJ sequences (family, professions, etc.) → width 2
- Keycap sequences → width 2
- Variation selectors (VS15 for text, VS16 for emoji presentation)
- Uses ICU's `UCHAR_EMOJI` property for accurate emoji detection

## Test coverage

Added comprehensive test suite with **94 tests** covering:
- All zero-width character categories
- All CSI final bytes
- OSC sequences with various terminators
- Emoji edge cases (flags, skin tones, ZWJ, keycaps, variation
selectors)
- East Asian width (CJK, fullwidth, halfwidth katakana)
- Indic and Thai script combining marks
- Fuzzer-like stress tests for robustness

## Breaking changes

This is a behavior change - `stringWidth` will return different values
for some inputs. However, the new values are more accurate
representations of terminal display width:

| Input | Old | New | Why |
|-------|-----|-----|-----|
| Flag emoji 🇺🇸 | 1 | 2 | Flags display as 2 cells |
| Skin tone 👋🏽 | 4 | 2 | Emoji + modifier = 1 grapheme |
| ZWJ family 👨‍👩‍👧 | 8 | 2 | ZWJ sequence = 1 grapheme |
| Word joiner U+2060 | 1 | 0 | Invisible character |
| OSC 8 hyperlinks | counted URL | just visible text | URLs are
invisible |
| Cursor movement ESC[5A | counted | 0 | Control sequence |

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude Bot <claude-bot@bun.sh>
2025-12-10 16:17:57 -08:00
Jarred Sumner
4fa69773a3 Introduce Bun.stripANSI (#21801)
### What does this PR do?

Introduce `Bun.stripANSI`, a SIMD-accelerated drop-in replacement for
the popular `"strip-ansi"` package.

`Bun.stripANSI` performs >10x faster and fixes several bugs in
`strip-ansi`, like [this long-standing
one](https://github.com/chalk/strip-ansi/issues/43).

### How did you verify your code works?

There are tests that check the output of `strip-ansi` matches
`Bun.stripANSI`. For cases where `strip-ansi`'s behavior is incorrect,
the expected value is manually provided.

---------

Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: taylor.fish <contact@taylor.fish>
2025-08-14 22:42:05 -07:00
Meghan Denny
93a89e5866 meta: update bun.locks with bun 1.2 (#16867)
Co-authored-by: nektro <5464072+nektro@users.noreply.github.com>
2025-01-29 01:47:43 -08:00
Meghan Denny
5971406183 meta: migrate bun to the text lockfile (#16462)
Co-authored-by: nektro <5464072+nektro@users.noreply.github.com>
2025-01-17 22:37:26 +00:00