Jarred Sumner
86d4d87beb
feat(unicode): migrate grapheme breaking to uucode with GB9c support ( #26376 )
...
## Summary
Replace Bun's outdated grapheme breaking implementation with [Ghostty's
approach](https://github.com/ghostty-org/ghostty/tree/main/src/unicode)
using the [uucode](https://github.com/jacobsandlund/uucode) library.
This adds proper **GB9c (Indic Conjunct Break)** support — Devanagari
and other Indic script conjuncts now correctly form single grapheme
clusters.
## Motivation
The previous implementation used a `GraphemeBoundaryClass` enum with
only 12 values and a 2-bit `BreakState` (just `extended_pictographic`
and `regional_indicator` flags). It had no support for Unicode's GB9c
rule, meaning Indic conjunct sequences (consonant + virama + consonant)
were incorrectly split into multiple grapheme clusters.
## Architecture
### Runtime (zero uucode dependency, two table lookups)
```
codepoint → [3-level LUT] → GraphemeBreakNoControl enum (u5, 17 values)
(state, gb1, gb2) → [8KB precomputed array] → (break_result, new_state)
```
The full grapheme break algorithm (GB6-GB13, GB9c, GB11, GB999) runs
only at **comptime** to populate the 8KB decision array. At runtime it's
pure table lookups.
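As a rough illustration of how a `(state, gb1, gb2)` triple can address an 8KB array: a 3-bit state plus two 5-bit break classes is 13 bits, and 2^13 = 8192 one-byte entries. The sketch below assumes a particular bit packing and a one-byte entry layout; `decisionIndex`, `packEntry`, and the placeholder `slowRule` are hypothetical names, not the code in `grapheme.zig`.

```typescript
// Assumed layout: 3 bits of state, then two 5-bit break classes.
const STATE_BITS = 3;
const CLASS_BITS = 5;
const TABLE_SIZE = 1 << (STATE_BITS + 2 * CLASS_BITS); // 8192 entries

// Pack (state, gb1, gb2) into a single table index.
function decisionIndex(state: number, gb1: number, gb2: number): number {
  return (state << (2 * CLASS_BITS)) | (gb1 << CLASS_BITS) | gb2;
}

// Assumed entry layout: bit 3 = "break here", bits 0-2 = next state.
function packEntry(shouldBreak: boolean, nextState: number): number {
  return (shouldBreak ? 0b1000 : 0) | nextState;
}

// Placeholder for the full rule engine (NOT the real grapheme rules):
// break unless both classes are equal, and always reset the state.
function slowRule(state: number, gb1: number, gb2: number) {
  return { shouldBreak: gb1 !== gb2, nextState: 0 };
}

// Fill the table once, up front; later queries never call slowRule again.
const decisions = new Uint8Array(TABLE_SIZE);
for (let state = 0; state < 1 << STATE_BITS; state++) {
  for (let gb1 = 0; gb1 < 1 << CLASS_BITS; gb1++) {
    for (let gb2 = 0; gb2 < 1 << CLASS_BITS; gb2++) {
      const r = slowRule(state, gb1, gb2);
      decisions[decisionIndex(state, gb1, gb2)] = packEntry(r.shouldBreak, r.nextState);
    }
  }
}

// Runtime path: one array read, then unpack.
function lookupDecision(state: number, gb1: number, gb2: number) {
  const entry = decisions[decisionIndex(state, gb1, gb2)];
  return { shouldBreak: (entry & 0b1000) !== 0, nextState: entry & 0b111 };
}
```

In Zig the fill loop would run at comptime, so only the finished array ends up in the binary.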
### File Layout
```
src/deps/uucode/ ← Vendored library (MIT, build-time only)
src/unicode/uucode/ ← Build-time integration
├── uucode_config.zig ← What Unicode properties to generate
├── grapheme_gen.zig ← Generator: queries uucode → writes tables
├── lut.zig ← 3-level lookup table generator
└── CLAUDE.md ← Maintenance docs
src/string/immutable/ ← Runtime (no uucode dependency)
├── grapheme.zig ← Grapheme break API + comptime decisions
├── grapheme_tables.zig ← Pre-generated tables (committed, ~91KB source)
└── visible.zig ← Width calculation (2 lines changed)
scripts/update-uucode.sh ← Update vendored uucode + regenerate
```
### Key Types
| Type | Size | Values |
|------|------|--------|
| `GraphemeBreakNoControl` | u5 | 17 (adds
`indic_conjunct_break_{consonant,linker,extend}`, `emoji_modifier_base`,
`zwnj`, etc.) |
| `BreakState` | u3 | 5 (`default`, `regional_indicator`,
`extended_pictographic`, `indic_conjunct_break_consonant`,
`indic_conjunct_break_linker`) |
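To make the two Indic `BreakState` values concrete, here is a deliberately reduced model of GB9c with only four break classes. It is a sketch under simplifying assumptions (no emoji, regional indicator, or Hangul rules), and the names `Cls`, `State`, `nextState`, and `countClusters` are hypothetical rather than taken from the Bun source.

```typescript
type Cls = "consonant" | "linker" | "extend" | "other";
type State = "default" | "consonant" | "linker";

// Track how far along a "consonant (extend|linker)* linker" prefix we are.
function nextState(state: State, cls: Cls): State {
  switch (cls) {
    case "consonant":
      return "consonant"; // a consonant always starts a fresh prefix
    case "linker":
      return state === "default" ? "default" : "linker";
    case "extend":
      return state; // extends (including ZWJ) keep the prefix alive
    default:
      return "default";
  }
}

// Count grapheme clusters in a sequence of classes.
// GB9 keeps extend/linker attached to what precedes them; GB9c additionally
// keeps a consonant attached when a linker has been seen since the last consonant.
function countClusters(classes: Cls[], gb9c: boolean = true): number {
  let clusters = 0;
  let state: State = "default";
  let first = true;
  for (const cls of classes) {
    const joinedByGb9 = cls === "extend" || cls === "linker";
    const joinedByGb9c = gb9c && cls === "consonant" && state === "linker";
    if (first || !(joinedByGb9 || joinedByGb9c)) clusters++;
    state = nextState(state, cls);
    first = false;
  }
  return clusters;
}

// Ka + Virama + Ssa: one cluster with GB9c, two without it.
const kaViramaSsa: Cls[] = ["consonant", "linker", "consonant"];
console.log(countClusters(kaViramaSsa), countClusters(kaViramaSsa, false));
```

Passing `gb9c = false` models a rule set without GB9c, where the second consonant starts a new cluster.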
### Binary Size
The tables store only the `GraphemeBreakNoControl` enum per codepoint
(not width or emoji properties, which visible.zig handles separately):
- stage1: 8192 × u16 = **16KB** (maps the codepoint's high bits → stage2 offset)
- stage2: 27392 × u8 = **27KB** (maps to stage3 index; max value is 16)
- stage3: 17 × u5 = **~17 bytes** (one per enum value)
- Precomputed decisions: **8KB**
- **Total: ~51KB** (vs previous ~70KB+)
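The stage sizes are consistent with 256-entry blocks: 8192 × 256 covers every 21-bit value, and 27392 = 107 × 256. Below is a minimal sketch of building and querying such a three-stage table, assuming that block size and using a toy classifier in place of the generated Unicode data. `classify`, `buildTables`, and `lookupClass` are hypothetical names, not the contents of `lut.zig`.

```typescript
const BLOCK = 256;                    // assumed: low 8 bits index within a block
const STAGE1_LEN = (1 << 21) / BLOCK; // 8192 blocks cover every 21-bit value

// Toy classifier standing in for real Unicode data (illustration only).
function classify(cp: number): number {
  if (cp >= 0x0915 && cp <= 0x0939) return 1; // Devanagari consonants KA..HA
  if (cp === 0x094d) return 2;                // Devanagari virama
  return 0;                                   // everything else
}

function buildTables(classOf: (cp: number) => number) {
  const stage3: number[] = [];             // distinct class values
  const stage2: number[] = [];             // deduplicated blocks of stage3 indices
  const stage1 = new Uint16Array(STAGE1_LEN);
  const seen = new Map<string, number>();  // block contents -> stage2 offset
  for (let hi = 0; hi < STAGE1_LEN; hi++) {
    const block: number[] = [];
    for (let lo = 0; lo < BLOCK; lo++) {
      const value = classOf(hi * BLOCK + lo);
      let idx = stage3.indexOf(value);
      if (idx === -1) idx = stage3.push(value) - 1;
      block.push(idx);
    }
    // Identical blocks share one copy in stage2.
    const key = block.join(",");
    let offset = seen.get(key);
    if (offset === undefined) {
      offset = stage2.length;
      stage2.push(...block);
      seen.set(key, offset);
    }
    stage1[hi] = offset;
  }
  return { stage1, stage2: Uint8Array.from(stage2), stage3 };
}

// Runtime lookup: three array reads, no branching on the codepoint value.
function lookupClass(t: ReturnType<typeof buildTables>, cp: number): number {
  return t.stage3[t.stage2[t.stage1[cp >> 8] + (cp & 0xff)]];
}

const tables = buildTables(classify);
console.log(tables.stage2.length); // two distinct blocks in this toy data set
```

Deduplicating identical blocks is what keeps stage2 small: most 256-codepoint ranges share the same all-default block.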
## How to Regenerate Tables
```bash
# After updating src/deps/uucode/:
./scripts/update-uucode.sh
# Or manually:
vendor/zig/zig build generate-grapheme-tables
```
Normal builds never run the generator — they use the committed
`grapheme_tables.zig`.
## Testing
```bash
bun bd test test/js/bun/util/stringWidth.test.ts
```
New test cases verify Devanagari conjuncts (GB9c):
- `क्ष` (Ka+Virama+Ssa) → single cluster, width 2
- `क्ष` (Ka+Virama+ZWJ+Ssa) → single cluster, width 2
- `क्क्क` (Ka+Virama+Ka+Virama+Ka) → single cluster, width 3
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-23 00:07:06 -08:00
pfg
05d0475c6c
Update to zig 0.15.2 ( #24204 )
...
Fixes ENG-21287
Build times, from `bun run build && echo '//' >> src/main.zig && time
bun run build`
|Platform|0.14.1|0.15.2|Speedup|
|-|-|-|-|
|macos debug asan|126.90s|106.27s|1.19x|
|macos debug noasan|60.62s|50.85s|1.19x|
|linux debug asan|292.77s|241.45s|1.21x|
|linux debug noasan|146.58s|130.94s|1.12x|
|linux debug use_llvm=false|n/a|78.27s|1.87x (vs 0.14.1 noasan)|
|windows debug asan|177.13s|142.55s|1.24x|
Runtime performance:
- `next build` memory usage may have gone up by 5%; otherwise it seems the
same. Some code with writers may have gotten slower, especially one
instance of a counting writer and a few instances of unbuffered writers
that now have vtable overhead.
- File size reduced by 800 KB (from 100.2 MB to 99.4 MB)
Improvements:
- `@export` hack is no longer needed for watch
- The native x86_64 backend makes Linux builds faster. To use it, set
`use_llvm` to false and `no_link_obj` to false. Also set
`ASAN_OPTIONS=detect_leaks=0`; otherwise the output is spammed with tens
of thousands of lines of debug info errors. Debugging may require the zig
lldb fork.
- zig test-obj, which we will be able to use for zig unit tests
Still an issue:
- false 'dependency loop' errors remain in watch mode
- watch mode crashes observed
Follow-up:
- [ ] search `comptime Writer: type` and `comptime W: type` and remove
- [ ] remove format_mode in our zig fork
- [ ] remove deprecated.zig autoFormatLabelFallback
- [ ] remove deprecated.zig autoFormatLabel
- [ ] remove deprecated.BufferedWriter and BufferedReader
- [ ] remove override_no_export_cpp_apis as it is no longer needed
- [ ] css Parser(W) -> Parser, and remove all the comptime writer: type
params
- [ ] remove deprecated writer fully
Files that add lines:
```
649 src/deprecated.zig
167 scripts/pack-codegen-for-zig-team.ts
54 scripts/cleartrace-impl.js
46 scripts/cleartrace.ts
43 src/windows.zig
18 src/fs.zig
17 src/bun.js/ConsoleObject.zig
16 src/output.zig
12 src/bun.js/test/debug.zig
12 src/bun.js/node/node_fs.zig
8 src/env_loader.zig
7 src/css/printer.zig
7 src/cli/init_command.zig
7 src/bun.js/node.zig
6 src/string/escapeRegExp.zig
6 src/install/PnpmMatcher.zig
5 src/bun.js/webcore/Blob.zig
4 src/crash_handler.zig
4 src/bun.zig
3 src/install/lockfile/bun.lock.zig
3 src/cli/update_interactive_command.zig
3 src/cli/pack_command.zig
3 build.zig
2 src/Progress.zig
2 src/install/lockfile/lockfile_json_stringify_for_debugging.zig
2 src/css/small_list.zig
2 src/bun.js/webcore/prompt.zig
1 test/internal/ban-words.test.ts
1 test/internal/ban-limits.json
1 src/watcher/WatcherTrace.zig
1 src/transpiler.zig
1 src/shell/builtin/cp.zig
1 src/js_printer.zig
1 src/io/PipeReader.zig
1 src/install/bin.zig
1 src/css/selectors/selector.zig
1 src/cli/run_command.zig
1 src/bun.js/RuntimeTranspilerStore.zig
1 src/bun.js/bindings/JSRef.zig
1 src/bake/DevServer.zig
```
Files that remove lines:
```
-1 src/test/recover.zig
-1 src/sql/postgres/SocketMonitor.zig
-1 src/sql/mysql/MySQLRequestQueue.zig
-1 src/sourcemap/CodeCoverage.zig
-1 src/css/values/color_js.zig
-1 src/compile_target.zig
-1 src/bundler/linker_context/convertStmtsForChunk.zig
-1 src/bundler/bundle_v2.zig
-1 src/bun.js/webcore/blob/read_file.zig
-1 src/ast/base.zig
-2 src/sql/postgres/protocol/ArrayList.zig
-2 src/shell/builtin/mkdir.zig
-2 src/install/PackageManager/patchPackage.zig
-2 src/install/PackageManager/PackageManagerDirectories.zig
-2 src/fmt.zig
-2 src/css/declaration.zig
-2 src/css/css_parser.zig
-2 src/collections/baby_list.zig
-2 src/bun.js/bindings/ZigStackFrame.zig
-2 src/ast/E.zig
-3 src/StandaloneModuleGraph.zig
-3 src/deps/picohttp.zig
-3 src/deps/libuv.zig
-3 src/btjs.zig
-4 src/threading/Futex.zig
-4 src/shell/builtin/touch.zig
-4 src/meta.zig
-4 src/install/lockfile.zig
-4 src/css/selectors/parser.zig
-5 src/shell/interpreter.zig
-5 src/css/error.zig
-5 src/bun.js/web_worker.zig
-5 src/bun.js.zig
-6 src/cli/test_command.zig
-6 src/bun.js/VirtualMachine.zig
-6 src/bun.js/uuid.zig
-6 src/bun.js/bindings/JSValue.zig
-9 src/bun.js/test/pretty_format.zig
-9 src/bun.js/api/BunObject.zig
-14 src/install/install_binding.zig
-14 src/fd.zig
-14 src/bun.js/node/path.zig
-14 scripts/pack-codegen-for-zig-team.sh
-17 src/bun.js/test/diff_format.zig
```
`git diff --numstat origin/main...HEAD | awk '{ print ($1-$2)"\t"$3 }' |
sort -rn`
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com >
Co-authored-by: Meghan Denny <meghan@bun.com >
Co-authored-by: taylor.fish <contact@taylor.fish >
2025-11-10 14:38:26 -08:00
Meghan Denny
0eb470fd88
zig: handle termination exception from promise fulfillment/rejection ( #23285 )
2025-10-14 19:48:25 -07:00
Jarred Sumner
ed9353f95e
gitignore the sources text files ( #22408 )
...
### What does this PR do?
### How did you verify your code works?
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-04 14:59:35 -07:00
taylor.fish
437e15bae5
Replace catch bun.outOfMemory() with safer alternatives ( #22141 )
...
Replace `catch bun.outOfMemory()`, which can accidentally catch
non-OOM-related errors, with either `bun.handleOom` or a manual `catch
|err| switch (err)`.
(For internal tracking: fixes STAB-1070)
---------
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com >
2025-08-26 12:50:25 -07:00
Meghan Denny
5b972fa2b4
zig: ban not using .true and .false for js boolean literals ( #21329 )
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Meghan Denny <meghan@bun.com >
2025-08-20 16:16:11 -07:00
pfg
7dd85f9dd4
fix toBeCloseTo missing incrementExpectCallCounter ( #21871 )
...
Fixes #11367. Also enforces that all expect functions use
`incrementExpectCallCounter`, and migrates two that were incrementing
`active_test_expectation_counter` manually.
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-14 17:02:58 -07:00
taylor.fish
e4beddb839
Reduce false negatives in ban-words.test.ts for undefined struct fields ( #21748 )
...
`ban-words.test.ts` attempts to detect places where a struct field is
given a default value of `undefined`, but it fails to detect cases like
the following:
```zig
foo: *Foo align(1) = undefined,
bar: [16 * 64]Bar = undefined,
baz: Baz(u8, true) = undefined,
```
This PR updates the check to detect more occurrences, while still
avoiding (as far as I can tell) the inclusion of any false positives.
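For illustration only, a line-oriented pattern along these lines would catch the three cases above. This is a hypothetical sketch, not the check that `ban-words.test.ts` actually uses; the names `undefinedFieldDefault` and `hasUndefinedFieldDefault` are made up here.

```typescript
// Hypothetical pattern: `name: <anything without "="> = undefined,` on one line.
// The trailing comma distinguishes a struct field from a `var x: T = undefined;` statement.
const undefinedFieldDefault =
  /^\s*[A-Za-z_][A-Za-z0-9_]*\s*:\s*[^=\n]+=\s*undefined\s*,/m;

function hasUndefinedFieldDefault(source: string): boolean {
  return undefinedFieldDefault.test(source);
}

const samples = [
  "foo: *Foo align(1) = undefined,",
  "bar: [16 * 64]Bar = undefined,",
  "baz: Baz(u8, true) = undefined,",
  "var x: u8 = undefined;", // local variable, not a field default
];
for (const line of samples) {
  console.log(hasUndefinedFieldDefault(line), line);
}
```

A pattern this loose trades precision for coverage, so it may still miss or over-match unusual declarations, such as a type expression that itself contains `=`.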
(For internal tracking: fixes STAB-971)
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-11 13:32:05 -07:00
pfg
89eb48047f
Auto reduce banned word count ( #20929 )
...
Co-authored-by: pfgithub <6010774+pfgithub@users.noreply.github.com >
2025-07-25 18:13:43 -07:00
pfg
83760fc446
Sort imports in all files ( #21119 )
...
Co-authored-by: taylor.fish <contact@taylor.fish >
2025-07-21 13:26:47 -07:00
taylor.fish
a1c0f74037
Simplify/fix threading utilities ( #21089 )
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-07-18 22:02:36 -07:00
Jarred Sumner
fdec7fc6e3
Simpler version of #20813 ( #21102 )
...
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Jarred Sumner <Jarred-Sumner@users.noreply.github.com >
2025-07-16 00:14:33 -07:00
Meghan Denny
875604a42b
safety: a lot more exception checker progress ( #20956 )
2025-07-16 00:11:19 -07:00
Michael H
20db4b636e
implement bun pm pkg ( #21046 )
2025-07-15 22:14:00 -07:00
Jarred Sumner
89aae0bdc0
Add flag to disable sql auto pipelining ( #21067 )
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-07-15 01:13:35 -07:00
Jarred Sumner
8cf4df296d
Delete the merge conflict factory
...
There are many situations where using `catch unreachable` is a reasonable or sometimes necessary decision. This rule causes many, many merge conflicts.
2025-07-14 03:53:50 -07:00
Jarred Sumner
3f283680dd
Buffer stderr and stdout in bun:test reporting ( #21023 )
2025-07-14 00:55:35 -07:00
Michael H
8898c4c455
Vscode test runner support ( #20645 )
2025-07-13 21:57:44 -07:00
Meghan Denny
6c5b863530
safety: a lot more exception checker progress ( #20817 )
2025-07-10 15:34:51 -07:00
Dylan Conway
6e92f0b9cb
make number smaller
2025-07-09 19:22:57 -07:00
Dylan Conway
f24e8cb98a
implement "nodeLinker": "isolated" in bun install ( #20440 )
...
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com >
2025-07-09 00:19:57 -07:00
Jarred Sumner
454316ffc3
Implement "node:module"'s findSourceMap and SourceMap class ( #20863 )
...
Co-authored-by: Claude <claude@anthropic.ai >
Co-authored-by: Claude <noreply@anthropic.com >
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
2025-07-07 23:08:12 -07:00
Ciro Spaciari
75902e6a21
fix(s3) fix loading http endpoint from env ( #20876 )
2025-07-07 19:24:32 -07:00
Michael H
764e20ee19
implement bun pm version ( #20706 )
2025-07-02 18:54:47 -07:00
190n
172aecb02e
[publish images] Upgrade self-reported Node.js version from 22.6.0 to 24.3.0 (v2) ( #20772 )
...
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com >
Co-authored-by: Claude <noreply@anthropic.com >
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
Co-authored-by: Meghan Denny <meghan@bun.sh >
Co-authored-by: Ashcon Partovi <ashcon@partovi.net >
Co-authored-by: pfg <pfg@pfg.pw >
Co-authored-by: pfgithub <6010774+pfgithub@users.noreply.github.com >
Co-authored-by: Ciro Spaciari <ciro.spaciari@gmail.com >
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
2025-07-02 12:06:08 -07:00
Ben Grant
ea57037567
Revert "Upgrade self-reported Node.js version from 22.6.0 to 24.3.0 ( #20659 ) [publish images]"
...
This reverts commit 80309e4d59. It breaks the Windows CI.
2025-07-02 09:40:32 -07:00
Jarred Sumner
80309e4d59
Upgrade self-reported Node.js version from 22.6.0 to 24.3.0 ( #20659 ) [publish images]
...
Co-authored-by: Claude <noreply@anthropic.com >
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
Co-authored-by: Meghan Denny <meghan@bun.sh >
Co-authored-by: Ashcon Partovi <ashcon@partovi.net >
Co-authored-by: pfg <pfg@pfg.pw >
Co-authored-by: pfgithub <6010774+pfgithub@users.noreply.github.com >
Co-authored-by: Ciro Spaciari <ciro.spaciari@gmail.com >
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
Co-authored-by: Ben Grant <ben@bun.sh >
2025-07-02 00:03:05 -07:00
Ciro Spaciari
964f2a8941
fix(fetch/s3) Handle backpressure when upload large bodies ( #20481 )
...
Co-authored-by: cirospaciari <6379399+cirospaciari@users.noreply.github.com >
2025-06-27 20:52:46 -07:00
Jarred Sumner
1d48f91b5e
Enable ReadableStream as stdin for Bun.spawn ( #20582 )
...
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
Co-authored-by: Cursor Agent <cursoragent@cursor.com >
Co-authored-by: jarred <jarred@bun.sh >
Co-authored-by: pfg <pfg@pfg.pw >
2025-06-27 19:42:03 -07:00
Jarred Sumner
3c1a1b5634
Implement per-message deflate support in WebSocket client ( #20613 )
...
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
2025-06-27 00:04:42 -07:00
Meghan Denny
b87cf4f247
zig: handle when coerceToInt32 and coerceToInt64 throw ( #20655 )
2025-06-25 20:44:13 -07:00
Meghan Denny
f9712ce309
make node:buffer,zlib,stream,fs exception checker clear ( #20494 )
2025-06-25 18:36:08 -07:00
Jarred Sumner
ba126fb330
Make node-gyp faster ( #20556 )
2025-06-24 21:07:01 -07:00
Michael H
282dda62c8
Add --import as alias to --preload for nodejs compat ( #20523 )
2025-06-20 19:58:54 -07:00
Jarred Sumner
633f4f593d
Ahead-of-time frontend bundling support for HTML imports & bun build ( #20511 )
...
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com >
2025-06-20 19:55:52 -07:00
190n
346e97dde2
fix bugs found by exception scope verification ( #20285 )
...
Co-authored-by: 190n <7763597+190n@users.noreply.github.com >
2025-06-18 23:08:19 -07:00
Dylan Conway
3773ceeb7e
Remove PreparePatchPackageInstall ( #20457 )
2025-06-17 17:23:29 -07:00
Jarred Sumner
162a9b66d8
bun pm view -> bun info (#20419 )
...
Co-authored-by: Cursor Agent <cursoragent@cursor.com >
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
2025-06-17 13:31:16 -07:00
Jarred Sumner
978540902c
bun init CLAUDE.md (#20443 )
2025-06-17 13:00:28 -07:00
pfg
c44515eaaf
Support --unhandled-rejections flag and rejectionHandled event ( #19874 )
...
Co-authored-by: pfgithub <6010774+pfgithub@users.noreply.github.com >
2025-06-13 19:05:05 -07:00
Meghan Denny
62794850fa
zig: node:zlib: tidy ( #20362 )
...
Co-authored-by: nektro <5464072+nektro@users.noreply.github.com >
2025-06-13 19:43:35 +02:00
Meghan Denny
dedd433cbf
zig: prefer .jsUndefined() over .undefined for JSValue ( #20332 )
2025-06-12 13:18:46 -07:00
Jarred Sumner
6ebad50543
Introduce ahead of time bundling for HTML imports with bun build ( #20265 )
...
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com >
Co-authored-by: dylan-conway <35280289+dylan-conway@users.noreply.github.com >
2025-06-10 21:26:00 -07:00
Jarred Sumner
8750f0b884
Add FileRoute for serving files ( #20198 )
...
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com >
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
Co-authored-by: Ciro Spaciari <ciro.spaciari@gmail.com >
2025-06-10 19:41:21 -07:00
Jarred Sumner
27abb51561
Add fallback for ADDRCONFIG like Chrome's, avoid glibc UDP port 0 hangs ( #19753 )
...
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
2025-06-05 20:20:00 -07:00
Dylan Conway
ce8767cdc8
add HTTPParser ( #20049 )
...
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com >
2025-05-31 16:21:08 -07:00
Dylan Conway
5910504aeb
bun pm audit -> bun audit (#19944 )
2025-05-27 19:52:18 -07:00
Jarred Sumner
3ea6133c46
CI: Remove unused top-level decls in formatter in zig ( #19879 )
...
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
2025-05-23 22:49:48 -07:00
Jarred Sumner
9d1eace981
Add bun pm view command ( #19841 )
...
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
2025-05-22 23:51:31 -07:00
Jarred Sumner
4ca83be84f
Add Zstd decompression to HTTP client ( #19800 )
...
Co-authored-by: Jarred-Sumner <709451+Jarred-Sumner@users.noreply.github.com >
Co-authored-by: Dylan Conway <35280289+dylan-conway@users.noreply.github.com >
2025-05-20 23:26:47 -07:00