robobun 0e98e44714 fix(fs): harden path length validation for multi-byte UTF-8 strings (#27495)
## Summary

- Fix `PathLike.fromBunString` to validate path length using
`str.utf8ByteLength()` instead of `str.length()` (UTF-16 code units),
since paths are stored as UTF-8 in a fixed-size `PathBuffer`
- Add defensive bounds check in `sliceZWithForceCopy` before `@memcpy`
to guard against oversized slices
- Add tests covering multi-byte UTF-8 paths (CJK, accented, emoji
characters) that exceed the buffer capacity
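
The defensive check in the second bullet can be pictured with a rough TypeScript sketch. This is an illustration only, not the Zig code from the PR: `copyPathZ` and the `RangeError` are hypothetical, and JavaScript typed arrays are already bounds-checked, whereas Zig's `@memcpy` is not. The point is only the order of operations: compare the slice length against the buffer capacity, reserving one byte for the NUL terminator, before any bytes are copied.

```typescript
// Hypothetical stand-in for the fixed-size path buffer (4096 bytes, the Linux size cited in this PR).
const MAX_PATH_BYTES = 4096;

// Copies `bytes` into `buf` and appends a NUL terminator,
// rejecting input that would not fit instead of writing past the end.
function copyPathZ(buf: Uint8Array, bytes: Uint8Array): Uint8Array {
  // One byte of the buffer is reserved for the trailing NUL.
  if (bytes.length >= buf.length) {
    throw new RangeError(
      `path needs ${bytes.length + 1} bytes, buffer holds ${buf.length}`
    );
  }
  buf.set(bytes, 0);
  buf[bytes.length] = 0;
  // Return a view over the copied bytes, excluding the terminator.
  return buf.subarray(0, bytes.length);
}

const pathBuffer = new Uint8Array(MAX_PATH_BYTES);
```

With this shape, a 6000-byte slice is rejected up front rather than relying on every caller to have validated the length correctly.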

## Background

`PathLike.fromBunString` validated path length using `str.length()`,
which returns UTF-16 code units. However, the path is ultimately stored
as UTF-8 in a fixed-size `PathBuffer` (`[MAX_PATH_BYTES]u8`, 4096 bytes
on Linux). Multi-byte UTF-8 characters (CJK, accented, emoji) use 2-4
bytes per character, so a string that passes the UTF-16 length check
could exceed the buffer capacity after UTF-8 encoding. For example, 2000
CJK characters (U+4E00) = 2000 UTF-16 code units (passes the 4096 check)
but 6000 UTF-8 bytes (overflows the buffer).
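
The arithmetic above can be checked with a short TypeScript sketch. The helper `utf8ByteLength` below is a hypothetical re-implementation written for illustration, not Bun's `str.utf8ByteLength()`. It walks the UTF-16 code units of a string and counts how many bytes the UTF-8 encoding would need.

```typescript
// Hypothetical helper: number of bytes `s` occupies once encoded as UTF-8.
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit < 0x80) {
      bytes += 1; // ASCII
    } else if (unit < 0x800) {
      bytes += 2; // e.g. U+00E9 (é)
    } else if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < s.length) {
      const next = s.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4; // surrogate pair, e.g. U+1F600 (😀)
        i++; // skip the low surrogate
      } else {
        bytes += 3; // unpaired surrogate, counted as a 3-byte sequence
      }
    } else {
      bytes += 3; // rest of the BMP, e.g. U+4E00 (一)
    }
  }
  return bytes;
}

const cjkPath = "\u4e00".repeat(2000);
console.log(cjkPath.length); // 2000 UTF-16 code units
console.log(utf8ByteLength(cjkPath)); // 6000 UTF-8 bytes
```

Note that the 4-byte emoji case takes two UTF-16 code units per character, so the gap between the two measures is largest for 3-byte characters such as CJK, where one code unit becomes three bytes.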

## Test plan

- [x] `bun bd test test/js/node/fs/fs-path-length.test.ts` — 8/8 tests
pass
- [x] `USE_SYSTEM_BUN=1 bun test test/js/node/fs/fs-path-length.test.ts`
— crashes with segfault (confirms the issue exists pre-fix)
- [x] Tests cover sync APIs (openSync, readFileSync, statSync,
realpathSync), async APIs (promises.readFile, promises.stat), 2-byte
(é), 3-byte (一), and 4-byte (😀) UTF-8 characters

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Bot <claude-bot@bun.sh>
Co-authored-by: Claude <noreply@anthropic.com>
2026-02-28 23:37:43 -08:00