whatwg · annevk · Apr 23, 2025 · Apr 11, 2025 · Apr 14, 2025 · Apr 16, 2025
diff --git a/encoding.bs b/encoding.bs
@@ -10,6 +10,18 @@ Markup Shorthands: css off
 Translate IDs: dictdef-textdecoderoptions textdecoderoptions,dictdef-textdecodeoptions textdecodeoptions,index section-index
 </pre>
 
+<pre class=biblio>
+{
+ "ISO8859-1": {
+  "href": "https://www.iso.org/standard/28245.html",
+  "title": "Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1",
+  "publisher": "International Organization for Standardization (ISO)",
+  "status": "Published",
+  "date": "April 1998"
+ }
+}
+</pre>
+
 <link rel=stylesheet href=visualization-colors.css>
 
 
@@ -568,7 +580,10 @@ prescribes, as that is necessary to be compatible with deployed content.
   <tr><td>"<code>windows-1251</code>"
   <tr><td>"<code>x-cp1251</code>"
   <tr>
-   <td rowspan=17><a>windows-1252</a>
+   <td rowspan=17>
+    <a>windows-1252</a>
+    <p class=note>See <a href="#note-latin1-ascii">below</a> for the relationship to historical
+    "Latin1" and "ASCII" concepts.
    <td>"<code>ansi_x3.4-1968</code>"
   <tr><td>"<code>ascii</code>"
   <tr><td>"<code>cp1252</code>"
@@ -732,6 +747,30 @@ part of the ISO 8859 series. In particular, the necessity of the inclusion of <a
 and <a>ISO-8859-16</a> is doubtful for the purpose of supporting existing content, but there are no
 plans to remove these.</p>
 
+<div class=note id=note-latin1-ascii>
+ <p>The <a>windows-1252</a> <a for=/>encoding</a> has various <a for=encoding>labels</a>, such
+ as "<code>latin1</code>", "<code>iso-8859-1</code>", and "<code>ascii</code>", which have
+ historically been confusing for developers. On the web, and in any software that seeks to be
+ web-compatible by implementing the Encoding Standard, these are synonyms: "<code>latin1</code>" and
+ "<code>ascii</code>" are just labels for <a>windows-1252</a>, and any software following this
+ standard will, for example, decode 0x80 as U+20AC (€) when asked for the Latin1 or ASCII decoding
+ of that byte.
+
+ <p>Software that does not follow the Encoding Standard does not always give the same answers. The
+ root of this is that the original document that specified Latin1 (ISO/IEC 8859-1) did not provide
+ any mappings for bytes in the inclusive ranges 0x00–0x1F and 0x7F–0x9F. Similarly, the original
+ documents that specified ASCII (ISO/IEC 646, among others) did not provide any mappings for bytes
+ in the inclusive range 0x80–0xFF. This means different software has chosen different code point
+ mappings for those bytes when asked to use Latin1 or ASCII encodings. Web browsers and
+ browser-compatible software have chosen to map those bytes according to <a>windows-1252</a>, which
+ is a superset of both, and this was codified in the Encoding Standard. Other software throws
+ errors, uses <a>isomorphic decoding</a>, or applies other mappings. [[ISO8859-1]] [[ISO646]]
+
+ <p>As such, implementers and developers need to be careful whenever they use libraries that
+ expose APIs in terms of "Latin1" or "ASCII". Such libraries might not give answers in line with
+ the Encoding Standard if they have chosen other behaviors for the bytes that were left undefined
+ in the original specifications.
+</div>
 
 <h3 id=output-encodings>Output encodings</h3>
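
An illustration of the divergence the new note describes, offered as a minimal sketch and not as part of the patch: Python's standard codecs are one example of software that does not follow the Encoding Standard for these names. The helper `try_decode` and the choice of byte 0x80 are assumptions made for the example. Python's `cp1252` codec is used here as a stand-in for the standard's windows-1252 mapping of that particular byte.

```python
# Byte 0x80 is left undefined by ISO/IEC 8859-1 and lies outside 7-bit ASCII.
data = b"\x80"


def try_decode(raw: bytes, codec: str) -> str:
    """Hypothetical helper: decode `raw`, or return a marker if the codec rejects it."""
    try:
        return raw.decode(codec)
    except UnicodeDecodeError:
        return "<error>"


# Compare three Python codecs whose names resemble Encoding Standard labels.
results = {codec: try_decode(data, codec) for codec in ("cp1252", "latin-1", "ascii")}

for codec, text in results.items():
    shown = text if text == "<error>" else [f"U+{ord(ch):04X}" for ch in text]
    print(codec, shown)
```

In this sketch only `cp1252` agrees with the Encoding Standard's answer of U+20AC for 0x80. Python's `latin-1` codec maps the byte to U+0080, and its `ascii` codec rejects it.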