robobun
9fd5b20aa3
feat: Add WebKit text codec support for 24 additional encodings (#21835)
...
## Summary
This PR integrates WebKit's text codec implementations into Bun's
TextDecoder, adding support for 24 additional character encodings beyond
the native UTF-8, UTF-16, and Latin1.
Fixes https://github.com/oven-sh/bun/issues/11564
## What's New
### Supported Encodings (24 total)
- **11 single-byte encodings**: IBM866, ISO-8859-3/6/7/8/8-I, KOI8-U,
windows-874/1253/1255/1257
- **7 CJK encodings**: Big5, EUC-JP, ISO-2022-JP, Shift_JIS, EUC-KR,
GBK, GB18030
- **2 special encodings**: x-user-defined, replacement
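
As an illustration of what the `x-user-defined` encoding does, here is a minimal sketch based on the WHATWG Encoding Standard's description of that decoder: ASCII bytes pass through unchanged, and bytes 0x80 to 0xFF map into the Private Use Area starting at U+F780. The function name `decodeXUserDefined` is hypothetical and this is not the code added by the PR.

```javascript
// Hypothetical illustration of x-user-defined decoding (not Bun's implementation).
function decodeXUserDefined(bytes) {
  let out = "";
  for (const b of bytes) {
    // Bytes below 0x80 are ASCII; the rest map to U+F780..U+F7FF.
    out += String.fromCharCode(b < 0x80 ? b : 0xf780 + (b - 0x80));
  }
  return out;
}

const decoded = decodeXUserDefined(new Uint8Array([0x41, 0x80, 0xff]));
console.log(decoded === "A\uf780\uf7ff"); // true
```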
### Implementation Details
- Integrated WebKit's text codec C++ implementations
- Generated static encoding tables from WHATWG spec (no ICU dependency)
- Created C++ wrapper for Zig/C++ interop
- All encoding aliases are supported (e.g., `sjis` → `shift_jis`)
- Proper whitespace trimming for encoding labels
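
The alias and whitespace handling above can be sketched roughly as follows. This is an assumption-laden illustration of WHATWG-style label resolution, not the wrapper code from this PR: the `LABELS` table lists only a few labels, and `resolveLabel` is a hypothetical helper name.

```javascript
// Hypothetical sketch of encoding label resolution (not Bun's actual code).
// Only a small subset of the WHATWG label table is listed for illustration.
const LABELS = new Map([
  ["sjis", "shift_jis"],
  ["shift_jis", "shift_jis"],
  ["shift-jis", "shift_jis"],
  ["x-sjis", "shift_jis"],
  ["greek", "iso-8859-7"],
  ["iso-8859-7", "iso-8859-7"],
  ["utf8", "utf-8"],
  ["utf-8", "utf-8"],
]);

function resolveLabel(label) {
  // Strip leading and trailing ASCII whitespace only: tab, LF, FF, CR, space.
  const trimmed = String(label).replace(/^[\t\n\f\r ]+|[\t\n\f\r ]+$/g, "");
  // Lowercase A-Z only, so the match is ASCII case-insensitive.
  const lowered = trimmed.replace(/[A-Z]/g, (c) => c.toLowerCase());
  return LABELS.get(lowered) ?? null;
}

console.log(resolveLabel(" SJIS\n")); // "shift_jis"
console.log(resolveLabel("\u00a0sjis")); // null (NBSP is not ASCII whitespace)
```

Trimming only ASCII whitespace, rather than calling `String.prototype.trim()`, keeps labels padded with other Unicode spaces invalid, which is the behavior the encoding-label Web Platform Tests check.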
## Testing
- ✅ Added comprehensive tests for all supported encodings
- ✅ Passes Web Platform Tests for single-byte decoders
- ✅ Passes Web Platform Tests for encoding labels
- ✅ All 2,207 tests pass
## Test Output
```
bun test v1.2.19 (9feaab47)
2207 pass
0 fail
5012 expect() calls
Ran 2207 tests across 1 file. [899.00ms]
```
## Not Included
The following encodings were not added due to ICU data loading
constraints:
- ISO-8859-2, 4, 5, 10, 13, 14, 15, 16
- Windows-1250, 1251, 1254, 1256, 1258
- KOI8-R, macintosh, x-mac-cyrillic
## Example Usage
```javascript
// CJK encodings
const decoder = new TextDecoder("shift_jis");
const bytes = new Uint8Array([0x82, 0xb1, 0x82, 0xf1]);
console.log(decoder.decode(bytes)); // "こん"
// Single-byte encodings
const greekDecoder = new TextDecoder("iso-8859-7");
const greekBytes = new Uint8Array([0xC3, 0xe5, 0xe9, 0xdc]);
console.log(greekDecoder.decode(greekBytes)); // "Γειά"
```
🤖 Generated with [Claude Code](https://claude.ai/code)
---------
Co-authored-by: Claude <claude@anthropic.ai>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-14 22:58:25 -07:00
190n
e23491391b
bun run prettier (#19807)
...
Co-authored-by: 190n <7763597+190n@users.noreply.github.com>
2025-05-20 20:01:38 -07:00
Braden Everson
67b64c3334
Update TextDecoder's constructor to Handle Undefined (#19708)
...
Co-authored-by: Dylan Conway <35280289+dylan-conway@users.noreply.github.com>
2025-05-19 16:44:57 -07:00
pfg
a7b46ebbfe
fastGet can throw (#19506)
...
Co-authored-by: Meghan Denny <meghan@bun.sh>
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
2025-05-14 22:14:20 -07:00
Jarred Sumner
14b439a115
Fix formatters not running in CI + delete unnecessary files (#19433)
2025-05-08 23:22:16 -07:00
pfg
f18a6d7be7
test-whatwg-encoding-custom-textdecoder-api-invalid-label.js (#19430)
2025-05-02 04:04:44 -07:00
ALBIN BABU VARGHESE
fbbc16fec6
Fixed TextDecoder fatal option showing invalid arg when giving 0 or 1 (#19378)
...
Co-authored-by: Albin <albinbabuvarghese@gmail.com>
2025-04-30 14:47:58 -07:00
Dylan Conway
87279392cf
fix 9395 (#14815)
2024-10-25 19:58:45 -07:00
Jarred Sumner
cd6785771e
run prettier and add back format action (#13722)
2024-09-03 21:32:52 -07:00
Dylan Conway
5bd344281f
fix(TextEncoder): domjit crash in encode (#13320)
...
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
2024-08-15 03:35:58 -07:00
Jarred Sumner
3a245dd248
upgrade webkit (#13192)
...
Co-authored-by: Dylan Conway <dylan.conway567@gmail.com>
Co-authored-by: Zack Radisic <zack@theradisic.com>
2024-08-12 23:17:17 -07:00
Dylan Conway
9302b42919
revert 84c91bf7e1 (#13214)
2024-08-09 19:28:08 -07:00
Ashcon Partovi
84c91bf7e1
Revert TextDecoderStream until next release (#13151)
2024-08-07 12:34:04 -07:00
Dylan Conway
9f7c6e34cb
Add TextDecoderStream, TextEncoderStream, and TextDecoder.decode("", { stream: true }) (#13115)
2024-08-07 02:36:29 -07:00
Dylan Conway
6303af3ce0
fix(TextDecoder): decoding sequences starting with 192 or 193 (#13043)
2024-08-02 23:01:34 -07:00
Ciro Spaciari
1ba57351b0
fix(Bun.serve) fix mimetype with utf16 (#11695)
...
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
2024-06-08 22:34:06 -07:00
Jarred Sumner
4512a04820
Add missing code to TextDecoder "Invalid byte sequence" error (#9700)
...
* Fix missing `ERR_ENCODING_INVALID_ENCODED_DATA` code in TextDecoder
* Update text-decoder.test.js
---------
Co-authored-by: Jarred Sumner <709451+Jarred-Sumner@users.noreply.github.com>
2024-03-28 22:06:40 -07:00
Jarred Sumner
47e7e004b1
Remove @known-failing-on-windows for tests which are no longer failing on windows
2024-01-24 21:03:32 -08:00
Jarred Sumner
e848c3f226
Get Bun.write tests to pass on Windows and bun:sqlite tests to pass (#8393)
...
* Move ReadFile and WriteFile to separate file
* Use libuv for Bun.write()
* Update windows_event_loop.zig
* build
* Get bun-write tests to pass. Implement Bun.write with two files.
* Update
* Update
* Update failing test list
* update
* More
* More
* More
* More
* Mark the rest
* ok
* oops
* Update bun-write.test.js
* Update blob.zig
---------
Co-authored-by: Jarred Sumner <709451+Jarred-Sumner@users.noreply.github.com>
Co-authored-by: Dave Caruso <me@paperdave.net>
Co-authored-by: Georgijs Vilums <georgijs.vilums@gmail.com>
2024-01-23 20:03:56 -08:00
dave caruso
072f2f15ea
ci: run windows tests and also run them concurrently (#7758)
2024-01-12 17:02:20 -08:00
WingLim
476fa4deda
feat(encoding): support BOM detection with test passed (#6074)
2023-10-03 10:28:59 -07:00
Jarred Sumner
abfc10afeb
Revert "feat(encoding): support BOM detection (#5550)"
...
This reverts commit 5f66b4e729.
This caused test failures in text-encoder. cc @WingLim
2023-09-21 07:10:07 -07:00
Jarred Sumner
01d2cb5d98
Prettier
2023-09-21 00:51:48 -07:00
WingLim
5f66b4e729
feat(encoding): support BOM detection (#5550)
...
* fix(encoding): export `getIgnoreBOM`
* feat(encoding): support ignoreBOM
* fix(encoding): not replace BOM to 0xFFFD
* chore: use strict equal
2023-09-20 18:44:05 -07:00
WingLim
a098c6e5f6
feat(encoding): TextDecoder support undefined (#5387)
...
* feat(encoding): TextDecoder support undefined
* chore: format test file
2023-09-16 22:41:52 -07:00
Dylan Conway
70a5cfe908
fix text decode trim (#4495)
...
* remove trim
* separate function
* a test
* trim when `stream` is true
---------
Co-authored-by: Jarred Sumner <jarred@jarredsumner.com>
2023-09-05 17:53:31 -07:00
Jarred Sumner
ef89f03de6
Update text-decoder.test.js
2023-07-20 15:26:06 -07:00
Julian
c383c6cd81
Pass constructor arguments to TextDecoder (#3692)
...
* Make TextDecoder constructor use options parameter
The constructor now actually sets TextDecoder properties using the
options parameter.
* Defer decoder allocation to end of constructor
* Verify types of TextDecoder options
* TextDecoder throw TypeError on failure
* Tidying
2023-07-20 14:50:54 -07:00
Dylan Conway
a9c41c67e6
Fix several bugs (#2418)
...
* utf16 codepoint with replacement character
* Fix test failure with `TextEncoder("ascii")`
* Add missing type
* Fix Response.prototype.bodyUsed and Request.prototype.bodyUsed
* Fix bug with scrypt error not clearing
* Update server.zig
* oopsie
2023-03-18 00:55:05 -07:00
Ashcon Partovi
f7e4eb8369
Reorganize tests (#2332)
2023-03-07 12:22:34 -08:00