Single and double quotes come from ancient times of language creation when the need for multiline strings was not taken into account. Backticks appeared much later and thus are more versatile.
-Backticks also allow us to specify a "template function" before the first backtick. The syntax is: <code>func`string`</code>. The function `func` is called automatically, receives the string and embedded expressions and can process them. This is called "tagged templates". This feature makes it easier to implement custom templating, but is rarely used in practice. You can read more about it in the [manual](mdn:/JavaScript/Reference/Template_literals#Tagged_templates).
+Backticks also allow us to specify a "template function" before the first backtick. The syntax is: <code>func`string`</code>. The function `func` is called automatically, receives the string and embedded expressions and can process them. This is called "tagged templates". This feature makes it easier to implement custom templating, but is rarely used in practice. You can read more about it in the [manual](mdn:/JavaScript/Reference/Template_literals#Tagged_templates).
## Special characters
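Review note on the tagged-template paragraph in the hunk above: a minimal sketch of what "receives the string and embedded expressions" means in practice. The tag name `highlight` and the bracket formatting are hypothetical, chosen only for illustration, and `console.log` is used instead of the article's `alert` so the snippet also runs outside a browser.

```javascript
// Hypothetical tag function: wraps every embedded value in square brackets.
// `strings` holds the literal chunks, `values` the results of the ${...} expressions.
function highlight(strings, ...values) {
  // `strings` always has exactly one more element than `values`.
  return strings.reduce(
    (result, chunk, i) =>
      result + chunk + (i < values.length ? `[${values[i]}]` : ""),
    ""
  );
}

const user = "Ann";
const count = 3;

// The tag is written directly before the opening backtick, without parentheses.
const message = highlight`Hello, ${user}! You have ${count} new messages.`;

console.log(message); // Hello, [Ann]! You have [3] new messages.
```

The tag receives the literal chunks and the already-evaluated expressions separately, which is what lets it escape or format the values before they are joined back into one string.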
@@ -86,16 +86,16 @@ Here's the full list:
|`\\`|Backslash|
|`\t`|Tab|
|`\b`, `\f`, `\v`| Backspace, Form Feed, Vertical Tab -- kept for compatibility, not used nowadays. |
-|`\xXX`|Unicode character with the given hexadecimal unicode`XX`, e.g. `'\x7A'` is the same as `'z'`.|
-|`\u{X…XXXXXX}` (1 to 6 hex characters)|A unicode symbol with the given UTF-32 encoding. Some rare characters are encoded with two unicode symbols, taking 4 bytes. This way we can insert long codes. |
+|`\xXX`|Unicode character with the given hexadecimal Unicode`XX`, e.g. `'\x7A'` is the same as `'z'`.|
+|`\u{X…XXXXXX}` (1 to 6 hex characters)|A Unicode symbol with the given UTF-32 encoding. Some rare characters are encoded with two Unicode symbols, taking 4 bytes. This way we can insert long codes. |
-alert( "\u{20331}" ); // 佫, a rare Chinese hieroglyph (long unicode)
-alert( "\u{1F60D}" ); // 😍, a smiling face symbol (another long unicode)
+alert( "\u{20331}" ); // 佫, a rare Chinese hieroglyph (long Unicode)
+alert( "\u{1F60D}" ); // 😍, a smiling face symbol (another long Unicode)
```
All special characters start with a backslash character `\`. It is also called an "escape character".
@@ -499,7 +499,7 @@ All strings are encoded using [UTF-16](https://en.wikipedia.org/wiki/UTF-16). Th
alert( String.fromCodePoint(90) ); // Z
```
-We can also add unicode characters by their codes using `\u` followed by the hex code:
+We can also add Unicode characters by their codes using `\u` followed by the hex code:
```js run
// 90 is 5a in hexadecimal system
@@ -608,7 +608,7 @@ In many languages there are symbols that are composed of the base character with
For instance, the letter `a` can be the base character for: `àáâäãåā`. Most common "composite" character have their own code in the UTF-16 table. But not all of them, because there are too many possible combinations.
-To support arbitrary compositions, UTF-16 allows us to use several unicode characters: the base character followed by one or many "mark" characters that "decorate" it.
+To support arbitrary compositions, UTF-16 allows us to use several Unicode characters: the base character followed by one or many "mark" characters that "decorate" it.
For instance, if we have `S` followed by the special "dot above" character (code `\u0307`), it is shown as Ṡ.
@@ -626,7 +626,7 @@ For example:
alert( 'S\u0307\u0323' ); // Ṩ
```
-This provides great flexibility, but also an interesting problem: two characters may visually look the same, but be represented with different unicode compositions.
+This provides great flexibility, but also an interesting problem: two characters may visually look the same, but be represented with different Unicode compositions.
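Review note on the last changed sentence: a small sketch of the problem it describes, reusing the `S\u0307\u0323` example from this hunk. It assumes the standard `String.prototype.normalize()` method (default form NFC) as the way to compare such strings; the variable names are illustrative only.

```javascript
// The same visible character, written with the two marks in a different order.
const dotAboveFirst = "S\u0307\u0323"; // S + dot above + dot below
const dotBelowFirst = "S\u0323\u0307"; // S + dot below + dot above

// Both render as Ṩ, but the underlying code unit sequences differ.
console.log(dotAboveFirst === dotBelowFirst); // false

// normalize() brings both to the same canonical form, so they compare equal.
console.log(dotAboveFirst.normalize() === dotBelowFirst.normalize()); // true

// For this particular combination Unicode defines a single precomposed character.
console.log(dotAboveFirst.normalize().length); // 1
console.log(dotAboveFirst.normalize() === "\u1e68"); // true
```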
4-binary/02-text-decoder/article.md (+1 -1)
@@ -12,7 +12,7 @@ let decoder = new TextDecoder([label], [options]);
- **`label`** -- the encoding, `utf-8` by default, but `big5`, `windows-1251` and many other are also supported.
- **`options`** -- optional object:
  - **`fatal`** -- boolean, if `true` then throw an exception for invalid (non-decodable) characters, otherwise (default) replace them with character `\uFFFD`.
-  - **`ignoreBOM`** -- boolean, if `true` then ignore BOM (an optional byte-order unicode mark), rarely needed.
+  - **`ignoreBOM`** -- boolean, if `true` then ignore BOM (an optional byte-order Unicode mark), rarely needed.
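Review note on the option list in this hunk: a short sketch of how `fatal` and `ignoreBOM` behave, assuming an environment with a global `TextDecoder` (modern browsers, current Node.js). The byte arrays are made-up sample data. One point worth checking against the wording above: per the Encoding Standard, `ignoreBOM: true` means the decoder does not treat the BOM specially, so it stays in the output as `\uFEFF`.

```javascript
// "Hi" followed by 0xFF, a byte that can never occur in valid UTF-8.
const invalidBytes = new Uint8Array([72, 105, 0xff]);

// Default (fatal: false): the invalid byte is replaced with U+FFFD.
const lenient = new TextDecoder("utf-8").decode(invalidBytes);
console.log(lenient === "Hi\uFFFD"); // true

// fatal: true makes the same input throw instead.
let threw = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(invalidBytes);
} catch (err) {
  threw = true;
}
console.log(threw); // true

// UTF-8 byte-order mark (EF BB BF) followed by "Hi".
const bytesWithBom = new Uint8Array([0xef, 0xbb, 0xbf, 72, 105]);

// Default: the BOM is stripped from the decoded string.
const stripped = new TextDecoder().decode(bytesWithBom);
console.log(stripped.length); // 2

// ignoreBOM: true leaves the BOM in the result as U+FEFF.
const kept = new TextDecoder("utf-8", { ignoreBOM: true }).decode(bytesWithBom);
console.log(kept.length); // 3
```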
9-regular-expressions/03-regexp-unicode/article.md (+4 -4)
@@ -4,9 +4,9 @@ JavaScript uses [Unicode encoding](https://en.wikipedia.org/wiki/Unicode) for st
That range is not big enough to encode all possible characters, that's why some rare characters are encoded with 4 bytes, for instance like `𝒳` (mathematical X) or `😄` (a smile), some hieroglyphs and so on.
-There's a unicode property `Script` (a writing system), that may have a value: `Cyrillic`, `Greek`, `Arabic`, `Han` (Chinese) and so on, [here's the full list](https://en.wikipedia.org/wiki/Script_(Unicode)).
+There's a Unicode property `Script` (a writing system), that may have a value: `Cyrillic`, `Greek`, `Arabic`, `Han` (Chinese) and so on, [here's the full list](https://en.wikipedia.org/wiki/Script_(Unicode)).
To look for characters in a given writing system we should use `pattern:Script=<value>`, e.g. for Cyrillic letters: `pattern:\p{sc=Cyrillic}`, for Chinese hieroglyphs: `pattern:\p{sc=Han}`, and so on:
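Review note: the example that follows this sentence in the article is cut off in this view, so here is a minimal sketch of the `\p{sc=…}` syntax it describes. The sample string and variable names are made up for illustration; the `u` flag is required for `\p{…}` to work.

```javascript
// Sample text mixing three writing systems (illustrative only).
const text = "Hello Привет 你好";

// \p{sc=Cyrillic} matches characters whose Script property is Cyrillic.
const cyrillic = text.match(/\p{sc=Cyrillic}/gu);
console.log(cyrillic.join("")); // Привет

// \p{sc=Han} matches Chinese characters.
const han = text.match(/\p{sc=Han}/gu);
console.log(han.join("")); // 你好

// The long form Script=<value> is equivalent to the short form sc=<value>.
const hanLongForm = text.match(/\p{Script=Han}/gu);
console.log(hanLongForm.join("")); // 你好
```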
9-regular-expressions/08-regexp-character-sets-and-ranges/article.md (+4 -4)
@@ -57,16 +57,16 @@ For instance:
- **\d** -- is the same as `pattern:[0-9]`,
- **\w** -- is the same as `pattern:[a-zA-Z0-9_]`,
-- **\s** -- is the same as `pattern:[\t\n\v\f\r ]`, plus few other rare unicode space characters.
+- **\s** -- is the same as `pattern:[\t\n\v\f\r ]`, plus few other rare Unicode space characters.
```
### Example: multi-language \w
As the character class `pattern:\w` is a shorthand for `pattern:[a-zA-Z0-9_]`, it can't find Chinese hieroglyphs, Cyrillic letters, etc.
-We can write a more universal pattern, that looks for wordly characters in any language. That's easy with unicode properties: `pattern:[\p{Alpha}\p{M}\p{Nd}\p{Pc}\p{Join_C}]`.
+We can write a more universal pattern, that looks for wordly characters in any language. That's easy with Unicode properties: `pattern:[\p{Alpha}\p{M}\p{Nd}\p{Pc}\p{Join_C}]`.
-Let's decipher it. Similar to `pattern:\w`, we're making a set of our own that includes characters with following unicode properties:
+Let's decipher it. Similar to `pattern:\w`, we're making a set of our own that includes characters with following Unicode properties:
- `Alphabetic` (`Alpha`) - for letters,
- `Mark` (`M`) - for accents,
@@ -85,7 +85,7 @@ let str = `Hi 你好 12`;
alert( str.match(regexp) ); // H,i,你,好,1,2
```
-Of course, we can edit this pattern: add unicode properties or remove them. Unicode properties are covered in more details in the article <info:regexp-unicode>.
+Of course, we can edit this pattern: add Unicode properties or remove them. Unicode properties are covered in more details in the article <info:regexp-unicode>.
```warn header="Unicode properties aren't supported in Edge and Firefox"
Unicode properties `pattern:p{…}` are not yet implemented in Edge and Firefox. If we really need them, we can use library [XRegExp](http://xregexp.com/).