@@ -48,13 +48,12 @@ the object, edit the `JSVAR` shell variable in `make.sh` and run the script.
 The utilities functions are contained in `cputils.js`, which assumes that the
 appropriate codepage scripts were loaded.

-The script will manipulate `module.exports` if available (e.g. in a CommonJS
-`require` context). This is not always desirable. To prevent the behavior,
-define `DO_NOT_EXPORT_CODEPAGE`.
+The script will manipulate `module.exports` if available. This is not always
+desirable. To prevent the behavior, define `DO_NOT_EXPORT_CODEPAGE`.

 ## Usage

-Most codepages are indexed by number. To get the unicode character for a given
+Most codepages are indexed by number. To get the Unicode character for a given
 codepoint, use the `dec` property:

 ```js
@@ -79,8 +78,8 @@ var sbuf = cptable.utils.encode(65001, sushi);
 `cptable.utils.encode(CP, data, ofmt)` accepts a String or Array of characters
 and returns a representation controlled by `ofmt`:

-- Default output is a Buffer (or Array) of bytes (integers between 0 and 255).
-- If `ofmt == 'str'`, return a String where `o.charCodeAt(i)` is the `i`-th byte
+- Default output is a Buffer (or Array) of bytes (integers between 0 and 255)
+- If `ofmt == 'str'`, return a binary String (byte `i` is `o.charCodeAt(i)`)
 - If `ofmt == 'arr'`, return an Array of bytes

 `cptable.utils.decode(CP, data)` accepts a byte String or Array of numbers or
@@ -112,7 +111,7 @@ $ bash make.sh path_to_manifest output_file_name JSVAR
 where

 - `JSVAR` is the name of the exported variable (generally `cptable`)
-- `output_file_name` is the output file (e.g. `cpexcel.js`, `cptable.js`)
+- `output_file_name` is the output file (`cpexcel.js`, `cptable.js`, ...)
 - `path_to_manifest` is the path to the manifest file.

 The manifest file is expected to be a CSV with 3 columns:
@@ -138,10 +137,10 @@ the JS source is `codepage.md`, so building is as simple as `voc codepage.md`.

 ## Generated Codepages

-The complete list of hardcoded codepages can be found in the file `pages.csv`.
+The complete list of codepages can be found in the file `pages.csv`.

-Some codepages are easier to implement algorithmically. Since these are
-hardcoded in `utils`, there is no corresponding entry (they are "magic").
+Some codepages are easier to implement algorithmically. Since those character
+tables are not generated, there is no corresponding entry (they are "magic").

 | CP# | Source | Description |
 | --------:| :-----------:| :-----------------------------------------------------|
@@ -222,7 +221,7 @@ hardcoded in `utils`, there is no corresponding entry (they are "magic").
 | `12001` | magic | Unicode UTF-32, big endian byte order |
 | `20000` | Windows 7 | CNS Taiwan (Chinese Traditional) |
 | `20001` | Windows 7 | TCA Taiwan |
-| `20002` | Windows 7 | Eten Taiwan (Chinese Traditional) |
+| `20002` | Windows 7 | ETEN Taiwan (Chinese Traditional) |
 | `20003` | Windows 7 | IBM5550 Taiwan |
 | `20004` | Windows 7 | TeleText Taiwan |
 | `20005` | Windows 7 | Wang Taiwan |
@@ -299,21 +298,21 @@ hardcoded in `utils`, there is no corresponding entry (they are "magic").
 | `65001` | magic | Unicode (UTF-8) |

 `unicode.org` refers to the Unicode Consortium Public Mappings, a database of
-various mappings between unicode characters and respective character sets. The
+various mappings between Unicode characters and respective character sets. The
 tables are processed by a few scripts in the build process.

 `IBM` refers to the IBM coded character set database. Even though IBM uses a
 different numbering scheme from Windows, the IBM numbers are used when there is
-no conflict. The tables are manually generated from the symbol PDFs.
+no conflict. The tables are manually generated from the symbol manifests.

 `Windows 7` refers to direct inspection of Windows 7 machines using .NET class
 `System.Text.Encoding`. The enclosed `MakeEncoding.cs` C# program brute-forces
-code pages. MakeEncoding.cs deviates from unicode.org in some cases. When they
+code pages. `MakeEncoding.cs` deviates from unicode.org in some cases. When they
 map a given code to different characters, unicode.org value is used. When
-unicode.org does not prescribe a value, MakeEncoding.cs value is used.
+unicode.org does not prescribe a value, `MakeEncoding.cs` value is used.

 `NLS` refers to the National Language Support files supplied in various versions
-of Windows. In older versions of Windows (e.g. Windows 98) these files followed
+of Windows. In older versions of Windows (like Windows 98) these files followed
 the name pattern `CP_#.NLS`, but newer versions use the name pattern `C_#.NLS`.

 ## Testing