Commit 94962cb

facelessuser authored and waylan committed
Fix HTML parse with empty lines (#537)
If the opening and closing tags were not both found in the first block, additional blocks were evaluated without the context of the previous blocks. The algorithm needs to evaluate a buffer that contains the left bracket, so feed in all items to get the right bracket, then adjust data_index to be relative to the last block. Fixes #452.
1 parent facfd66 commit 94962cb

3 files changed: 29 additions, 1 deletion

markdown/preprocessors.py

Lines changed: 7 additions & 1 deletion
@@ -258,7 +258,13 @@ def run(self, lines):
 else:
     items.append(block)

-    right_tag, data_index = self._get_right_tag(left_tag, 0, block)
+    # Need to evaluate all items so we can calculate relative to the left index.
+    right_tag, data_index = self._get_right_tag(left_tag, left_index, ''.join(items))
+    # Adjust data_index: relative to items -> relative to last block
+    prev_block_length = 0
+    for item in items[:-1]:
+        prev_block_length += len(item)
+    data_index -= prev_block_length

     if self._equal_tags(left_tag, right_tag):
         # if find closing tag
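
The index adjustment in this hunk can be tried in isolation. The sketch below is not the library's code: `index_relative_to_last_block` is a hypothetical helper, the two sample blocks are made up to resemble input split on a blank line, and a plain string search stands in for `_get_right_tag`. It only illustrates converting an index into the joined buffer back into an index into the last block.

```python
def index_relative_to_last_block(items, joined_index):
    """Convert an index into ''.join(items) into an index into items[-1].

    Hypothetical helper mirroring the adjustment made in the hunk above.
    """
    prev_block_length = sum(len(item) for item in items[:-1])
    return joined_index - prev_block_length


# Made-up blocks, roughly as they might look after splitting on a blank line.
items = ["<div>\nDiv with a blank line\n", "in the middle.\n</div>\n"]
joined = "".join(items)

# Stand-in for the real tag search: the index just past the closing tag,
# measured against the joined buffer.
joined_index = joined.index("</div>") + len("</div>")

# The same position, measured against the last block only.
local_index = index_relative_to_last_block(items, joined_index)
```

Slicing the last block with `items[-1][:local_index]` then yields the text up to and including the closing tag, which is the kind of position the caller needs when it works on the last block alone.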

tests/misc/html.html

Lines changed: 11 additions & 0 deletions
@@ -8,6 +8,17 @@ <h1>Block level html</h1>
 Html with various attributes.
 </div>

+<div>
+<div>
+Div with a blank line
+
+in the middle.
+</div>
+<div>
+This gets treated as HTML.
+</div>
+</div>
+
 <p>And of course <script>blah</script>.</p>
 <p><a href="script&gt;stuff&lt;/script">this <script>link</a></p>
 <p>Some funky <x\]> inline stuff with markdown escaping syntax.</p>

tests/misc/html.txt

Lines changed: 11 additions & 0 deletions
@@ -11,6 +11,17 @@ Now some <arbitrary>arbitrary tags</arbitrary>.
 Html with various attributes.
 </div>

+<div>
+<div>
+Div with a blank line
+
+in the middle.
+</div>
+<div>
+This gets treated as HTML.
+</div>
+</div>
+
 And of course <script>blah</script>.

 [this <script>link](<script>stuff</script>)
