mirror of
https://github.com/oven-sh/bun
synced 2026-02-12 11:59:00 +00:00
## Summary

`Bun.stringWidth` was incorrectly treating Thai SARA AA (U+0E32), SARA AM (U+0E33), and their Lao equivalents (U+0EB2, U+0EB3) as zero-width characters.

## Root Cause

In `src/string/immutable/visible.zig`, the range check for Thai/Lao combining marks was too broad:

- Thai: `0xe31 <= cp <= 0xe3a` included U+0E32 and U+0E33
- Lao: `0xeb1 <= cp <= 0xebc` included U+0EB2 and U+0EB3

According to Unicode (UCD Grapheme_Break property), these are **spacing vowels** (Grapheme_Base), not combining marks.

## Changes

- **`src/string/immutable/visible.zig`**: Exclude U+0E32, U+0E33, U+0EB2, U+0EB3 from zero-width ranges
- **`test/js/bun/util/stringWidth.test.ts`**: Add tests for Thai and Lao spacing vowels

## Before/After

| Character | Before | After |
|-----------|--------|-------|
| `\u0E32` (SARA AA) | 0 | 1 |
| `\u0E33` (SARA AM) | 0 | 1 |
| `คำ` (common Thai word) | 1 | 2 |
| `\u0EB2` (Lao AA) | 0 | 1 |
| `\u0EB3` (Lao AM) | 0 | 1 |

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
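
For illustration, the narrowed range check described under Root Cause might look roughly like the sketch below. It is written in TypeScript rather than Zig, the function name `isThaiLaoZeroWidth` is hypothetical, and it covers only the two ranges quoted in this description. It is not the actual code in `visible.zig`.

```typescript
// Hypothetical sketch of the corrected check; not Bun's actual implementation.
// It uses only the two ranges and four exclusions quoted in the description.
function isThaiLaoZeroWidth(cp: number): boolean {
  // Spacing vowels occupy a column, so they must not be counted as zero-width.
  if (cp === 0x0e32 || cp === 0x0e33 || cp === 0x0eb2 || cp === 0x0eb3) {
    return false;
  }
  // The remaining code points in the original ranges stay zero-width.
  const thai = cp >= 0x0e31 && cp <= 0x0e3a;
  const lao = cp >= 0x0eb1 && cp <= 0x0ebc;
  return thai || lao;
}
```

Under this sketch, `isThaiLaoZeroWidth(0x0e33)` returns `false`, so SARA AM contributes a column, while `isThaiLaoZeroWidth(0x0e31)` still returns `true`.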