Commit e7e105f

[3.14] gh-86155: Fix data loss after unclosed script or style tag in HTMLParser (GH-22658) (GH-133844)
When calling .close() the HTMLParser should flush all remaining content, even when that content is in an unclosed script or style tag.

(cherry picked from commit 53383e9)

Co-authored-by: Waylan Limberg <[email protected]>
1 parent 856e590 commit e7e105f
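
The behavior described in the commit message can be checked with a small script. This is a minimal sketch, not part of the commit: `EventCollector` and `parse_unclosed` are hypothetical helper names, and the sample body is an arbitrary string. On an interpreter that includes this fix, the text after the unclosed tag should come back through `handle_data`; on one without it, no data event is expected.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: records parser callbacks as tuples."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("starttag", tag, attrs))

    def handle_data(self, data):
        self.events.append(("data", data))


def parse_unclosed(tag, body):
    """Feed '<tag>body' with no end tag, then close() the parser."""
    parser = EventCollector()
    parser.feed(f"<{tag}>{body}")
    parser.close()  # end of input: any buffered text should be flushed here
    return parser.events


body = "var x = 1 < 2;"
events = parse_unclosed("script", body)

# Join the data events, since a parser may deliver text in several chunks.
text = "".join(e[1] for e in events if e[0] == "data")
print(events[0])
print(repr(text))  # the body with this fix applied; may be empty without it
```

Passing `"style"` instead of `"script"` exercises the other element named in the commit message.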

3 files changed: +13 -1 lines changed

Lib/html/parser.py

Lines changed: 1 addition & 1 deletion
@@ -260,7 +260,7 @@ def goahead(self, end):
             else:
                 assert 0, "interesting.search() lied"
         # end while
-        if end and i < n and not self.cdata_elem:
+        if end and i < n:
             if self.convert_charrefs and not self.cdata_elem:
                 self.handle_data(unescape(rawdata[i:n]))
             else:
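
The one-line change above only alters the guard that decides whether leftover input is flushed at end of input. The sketch below models that guard in isolation. The function names are hypothetical and the arguments mirror the variables in `goahead` (`end`, `i`, `n`, `self.cdata_elem`); it is not the parser's own code.

```python
def flushes_tail_before(end, i, n, cdata_elem):
    """Old guard: leftover text is skipped while inside a script/style element."""
    return bool(end and i < n and not cdata_elem)


def flushes_tail_after(end, i, n, cdata_elem):
    """New guard: leftover text is flushed whenever the input has ended."""
    return bool(end and i < n)


# Inside an unclosed <script> with unread text remaining (i < n) at close():
print(flushes_tail_before(True, 8, 22, "script"))  # False
print(flushes_tail_after(True, 8, 22, "script"))   # True
```

The inner `if self.convert_charrefs and not self.cdata_elem` check is unchanged, so per the diff's context lines the `unescape` call is still skipped while `cdata_elem` is set.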

Lib/test/test_htmlparser.py

Lines changed: 10 additions & 0 deletions
@@ -317,6 +317,16 @@ def get_events(self):
                                 ("endtag", element_lower)],
                             collector=Collector(convert_charrefs=False))
 
+    def test_EOF_in_cdata(self):
+        content = """<!-- not a comment --> &not-an-entity-ref;
+                  <a href="" /> </p><p> <span></span></style>
+                  '</script' + '>'"""
+        s = f'<script>{content}'
+        self._run_check(s, [
+            ("starttag", 'script', []),
+            ("data", content)
+        ])
+
     def test_comments(self):
         html = ("<!-- I'm a valid comment -->"
                 '<!--me too!-->'
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+:meth:`html.parser.HTMLParser.close` no longer loses data when the
+``<script>`` tag is not closed. Patch by Waylan Limberg.
