Commit 60e5e69

Move UTF-8 support to a future proposal
1 parent 8f2e8dc commit 60e5e69

1 file changed, +17 -17 lines changed

proposals/js-string-builtins/Overview.md

@@ -157,14 +157,6 @@ For this purpose, `WebAssembly.validate()` is extended to take a list of builtin
 
 If a user wishes to polyfill these imports for some reason, or is running on a system without a builtin, these imports may be provided as normal through instantiation.
 
-## UTF8/WTF8 support
-
-As stated above in 'goals for builtins', builtins are intended to just wrap existing primitives and not invent new functionality.
-
-JS Strings are semantically a sequence of 16-bit code units (referred to as char codes in method naming), and there are no builtin operations on them to acquire a UTF-8 or WTF-8 view. This makes it difficult to write Wasm builtins for these encodings without introducing significant new logic to them.
-
-There is the Encoding API for `TextEncoder`/`TextDecoder` which can be used for UTF-8 support. However, this is technically a separate spec from JS and may not be available on all JS engines (in practice it's available widely). This proposal exposes UTF-8 data conversions using this API under separate `wasm:text-encoder` `wasm:text-decoder` interfaces which are available when the host implements these interfaces.
-
 ## String constants
 
 String constants may be defined in JS and made available to Wasm through a variety of means.
@@ -551,7 +543,19 @@ function compare(
 }
 ```
 
-## Encoding API
+## Future extensions
+
+There are several extensions we can make in the future as need arrives.
+
+### UTF8/WTF8 support
+
+As stated above in 'goals for builtins', builtins are intended to just wrap existing primitives and not invent new functionality.
+
+JS Strings are semantically a sequence of 16-bit code units (referred to as char codes in method naming), and there are no builtin operations on them to acquire a UTF-8 or WTF-8 view. This makes it difficult to write Wasm builtins for these encodings without introducing significant new logic to them.
+
+There is the Encoding API for `TextEncoder`/`TextDecoder` which can be used for UTF-8 support. However, this is technically a separate spec from JS and may not be available on all JS engines (in practice it's available widely). This proposal exposes UTF-8 data conversions using this API under separate `wasm:text-encoder` `wasm:text-decoder` interfaces which are available when the host implements these interfaces.
+
+### Encoding API
 
 The following is an initial set of function builtins for the [`TextEncoder`](https://encoding.spec.whatwg.org/#interface-textencoder) and the [`TextDecoder`](https://encoding.spec.whatwg.org/#interface-textdecoder) interfaces. These builtins are exposed under `wasm:text-encoder` and `wasm:text-decoder`, respectively.
 

@@ -607,7 +611,7 @@ function trap() {
 }
 ```
 
-### "wasm:text-decoder" "decodeStringFromUTF8Array"
+#### "wasm:text-decoder" "decodeStringFromUTF8Array"
 
 ```
 /// Decode the specified range of an i8 array using UTF-8 into a string.
@@ -656,7 +660,7 @@ func decodeStringFromUTF8Array(
 }
 ```
 
-### "wasm:text-encoder" "measureStringAsUTF8"
+#### "wasm:text-encoder" "measureStringAsUTF8"
 
 ```
 /// Returns the number of bytes string would take when encoded as UTF-8.
@@ -684,7 +688,7 @@ func measureStringAsUTF8(
 }
 ```
 
-### "wasm:text-encoder" "encodeStringIntoUTF8Array"
+#### "wasm:text-encoder" "encodeStringIntoUTF8Array"
 
 ```
 /// Encode a string into a pre-allocated mutable i8 array at `start` index using
@@ -731,7 +735,7 @@ func encodeStringIntoUTF8Array(
 }
 ```
 
-### "wasm:text-encoder" "encodeStringToUTF8Array"
+#### "wasm:text-encoder" "encodeStringToUTF8Array"
 
 ```
 /// Encode a string into a new mutable i8 array using UTF-8.
@@ -760,10 +764,6 @@ func encodeStringToUTF8Array(
 }
 ```
 
-## Future extensions
-
-There are several extensions we can make in the future as need arrives.
-
 ### Binding memory to builtins
 
 It may be useful to have a builtin that operates on a specific Wasm memory. For JS strings, this could allow us to encode a JS string directly into linear memory.
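
Note for readers of this diff: the relocated "UTF8/WTF8 support" text contrasts the 16-bit code unit view of JS strings with UTF-8 conversion through the Encoding API. The following is a minimal JavaScript sketch of that contrast, assuming a host that provides `TextEncoder` and `TextDecoder`. The helper names `measureAsUTF8` and `roundTripUTF8` are hypothetical and are not the proposal's builtins.

```javascript
// Hypothetical helpers for illustration only; not part of the proposal.
// Assumes the host implements the Encoding API (TextEncoder/TextDecoder).

// Number of bytes `str` occupies when encoded as UTF-8.
function measureAsUTF8(str) {
  return new TextEncoder().encode(str).length;
}

// Encode a JS string to UTF-8 bytes, then decode the bytes back to a string.
function roundTripUTF8(str) {
  const bytes = new TextEncoder().encode(str);
  return new TextDecoder("utf-8").decode(bytes);
}

const sample = "h\u00e9llo";             // "héllo"
const codeUnits = sample.length;         // counts 16-bit code units
const utf8Bytes = measureAsUTF8(sample); // "é" needs two bytes in UTF-8

// A lone surrogate is a valid JS string code unit but has no UTF-8 encoding;
// TextEncoder substitutes U+FFFD, so the round trip does not preserve it.
const loneSurrogate = roundTripUTF8("\ud800");
```

The lone surrogate case is the gap between UTF-8 and WTF-8 that the moved section alludes to: a plain `TextEncoder` round trip is lossy for strings that are not well-formed UTF-16.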
