Fix bun.String.toOwnedSliceReturningAllASCII (#23925)

`bun.String.toOwnedSliceReturningAllASCII` is supposed to return a
boolean indicating whether or not the string is entirely composed of
ASCII characters. However, the current implementation frequently
produces incorrect results:

* If the string is a `ZigString`, it always returns true, even though
`ZigString`s can be UTF-16 or Latin-1.
* If the string is a `StaticZigString`, it always returns false, even
though `StaticZigString`s can be all ASCII.
* If the string is a 16-bit `WTFStringImpl`, it always returns false,
even though 16-bit `WTFString`s can be all ASCII.
* If the string is empty, it always returns false, even though empty
strings are valid ASCII strings.
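
To illustrate the intended semantics in the cases above, here is a minimal sketch of an all-ASCII check that looks at the code units themselves rather than at the string's tag or storage width. `isAllAscii` is a hypothetical helper written for this illustration; it is not the function changed by this commit and is not part of `bun.String`.

```zig
const std = @import("std");

/// Hypothetical helper (not part of bun): returns true when every code unit
/// is below 0x80. `T` is the code unit type: u8 for Latin-1/UTF-8 data,
/// u16 for UTF-16 data.
fn isAllAscii(comptime T: type, units: []const T) bool {
    for (units) |unit| {
        if (unit >= 0x80) return false;
    }
    // An empty slice never enters the loop, so it counts as all ASCII.
    return true;
}

test "all-ASCII check does not depend on storage width" {
    // Empty strings are valid ASCII strings.
    try std.testing.expect(isAllAscii(u8, ""));
    try std.testing.expect(isAllAscii(u8, "hello"));
    // 0xE9 is Latin-1 'é', which is not ASCII.
    try std.testing.expect(!isAllAscii(u8, "caf\xe9"));

    // 16-bit storage can still hold only ASCII code units.
    const wide_ascii = [_]u16{ 'h', 'i' };
    try std.testing.expect(isAllAscii(u16, &wide_ascii));

    const wide_non_ascii = [_]u16{ 'h', 0x00e9 };
    try std.testing.expect(!isAllAscii(u16, &wide_non_ascii));
}
```

The point of the sketch is that the answer comes from the contents, so a 16-bit or static string holding only ASCII reports true and an 8-bit Latin-1 string with a high byte reports false.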

`toOwnedSliceReturningAllASCII` is currently used in two places, both of
which assume its answer is accurate:

* `bun.webcore.Blob.fromJSWithoutDeferGC`
* `bun.api.ServerConfig.fromJS`

(For internal tracking: fixes ENG-21249)
taylor.fish
2025-10-21 18:42:39 -07:00
committed by GitHub
parent 06eea5213a
commit d846e9a1e7
4 changed files with 55 additions and 33 deletions

@@ -10,6 +10,19 @@ pub const Encoding = enum {
    utf16,
};

pub const AsciiStatus = enum {
    unknown,
    all_ascii,
    non_ascii,

    pub fn fromBool(is_all_ascii: ?bool) AsciiStatus {
        return if (is_all_ascii orelse return .unknown)
            .all_ascii
        else
            .non_ascii;
    }
};

/// Returned by classification functions that do not discriminate between utf8 and ascii.
pub const EncodingNonAscii = enum {
    utf8,