Fix HTML parse with empty lines #537

facelessuser · 2017-01-24T06:25:54Z

If both the open and close were not found in the first block, additional
blocks were evaluated without the context of previous blocks. The
algorithm needs to evaluate a buffer with the left bracket present. So
feed in all items and get the right bracket, then adjust the data_index
to be relative to the last block. Ref: #452

facelessuser · 2017-01-24T06:28:16Z

Wasn't planning on fixing anything else, but I overlooked this as one that could be fixed pre-3.0. I don't know about the HTML parsing algorithm as a whole (it seems to take some shortcuts), but I was at least able to identify why it couldn't resolve the right bracket across a gap.

If both the open and close were not found in the first block, additional blocks were evaluated without the context of previous blocks. The algorithm needs to evaluate a buffer with the left bracket present. So feed in all items and get the right bracket, then adjust the data_index to be relative to the last block.
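As a rough illustration of the index adjustment described above (not the code from this PR): the sketch below joins every block seen so far, searches the joined buffer for the right bracket, and then converts the match back to an offset within the last block. The function name `find_right_bracket`, the `"\n\n"` separator, and the plain `str.find` search are all assumptions made for the example; only the idea of a `data_index` relative to the last block comes from the description.

```python
def find_right_bracket(blocks, left_index):
    """Hypothetical helper, not Python-Markdown's API.

    `blocks` holds every block seen so far; `left_index` is the position of
    the left bracket in the joined buffer. Returns the index of the next '>'
    relative to the LAST block, or None if no '>' is found.
    """
    # Assumption: blocks were split on blank lines, so rejoin with "\n\n".
    buffer = "\n\n".join(blocks)

    # Search with the left bracket in context, starting just after it.
    absolute = buffer.find(">", left_index + 1)
    if absolute == -1:
        return None

    # Offset at which the last block starts inside the joined buffer.
    last_start = len(buffer) - len(blocks[-1])

    # A negative result would mean the match sits in an earlier block.
    return absolute - last_start


blocks = ["<div", "class='x'>text"]
data_index = find_right_bracket(blocks, 0)
print(blocks[-1][data_index])
```

The subtraction at the end is the part that matters: the search runs over the whole buffer, but the caller continues working from the last block, so the index has to be expressed in that block's coordinates.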

facelessuser · 2017-01-24T13:59:09Z

Personal opinion on this: for 3.0, I would rewrite this to not use recursion and to track the depth as state, so that when it is called on subsequent blocks it knows how many openings it still has to resolve. I don't recall whether this is already on your roadmap. The current algorithm gets slower the more blocks it has to process. For every block in which it doesn't find the end tag it is looking for, it has to process i + 1 blocks. So if you had to process 3 blocks to find your close tag, you actually processed 1 + 2 + 3 = 6 blocks (re-iterating blocks you already processed). I think this is a fine fix for now, since that is how the algorithm is designed, but it is not ideal for the future. I wanted to fix the current algorithm, but I wasn't willing to tackle a rewrite at this time. Maybe if I'm more ambitious in the future, or if you haven't gotten to it first :).
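To make the cost argument concrete, here is a minimal sketch of both sides. `rescan_cost` just reproduces the 1 + 2 + 3 arithmetic, and `find_close_block` shows one way a non-recursive, depth-tracking scan could look. Both names and the simplified `TAG` regex are hypothetical and are not taken from Python-Markdown; the regex ignores comments, self-closing tags, and tags split across blocks.

```python
import re

# Simplified tag matcher (an assumption for this sketch, not a real HTML parser).
TAG = re.compile(r"<(/?)(\w+)[^>]*>")


def rescan_cost(n):
    """Blocks touched when every attempt re-reads all blocks seen so far."""
    return sum(range(1, n + 1))


def find_close_block(blocks, tag):
    """Return the index of the block that closes the first `tag`, or None.

    Each block is visited exactly once; `depth` carries the number of
    still-unresolved openings from one block to the next.
    """
    depth = 0
    for i, block in enumerate(blocks):
        for match in TAG.finditer(block):
            if match.group(2) != tag:
                continue
            # A leading "/" marks a closing tag; anything else opens one.
            depth += -1 if match.group(1) else 1
            if depth == 0:
                return i
    return None


print(rescan_cost(3))
print(find_close_block(["<div>", "<div>inner</div>", "</div>"], "div"))
```

With state carried between blocks, finding a close tag in the third block touches 3 blocks rather than 6, and the gap grows quadratically as more blocks are involved.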

waylan merged commit 94962cb into Python-Markdown:master Jan 24, 2017
