Skip to content

Properly cleanup XML parser contexts upon completion #2239

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

vkrivopalov
Copy link

Description

It is currently possible that the XML parsing context is not properly
cleaned up if a parsed XML document is malformed.

This fix makes sure that the context is taken care of.

Testing

  1. Create a file books_bad.xml with malformed XML:
    <?xml version="1.0"?> <catalog> <book id="bk101"> <author>Gambardella, Matthew</author> <title>XML Developer's Guide</title> <genre>Computer</genre> <price>44.95</price> <publish_date>2000-10-01</publish_date> <description>An in-depth look at creating applications with XML.</description> </book> <book id="bk102"> <author>Ralls, Kim</author> <title>Midnight Rain</title> <genre>Fantasy</genre> <price>5.95</price> <publish_date>2000-12-16</publish_date> <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description> </book> <book id="bk103"> <author>Corets, Eva</author> <title>Maeve Ascendant</title> <genre>Fantasy</genre> <price>5.95</price> <publish_date>2000-11-17</publish_date> <description>After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society.</description> </book> <book id="bk104"> <author>Corets, Eva</author> <title>Oberon's Legacy</title> <genre>Fantasy</genre> <price>5.95</price> <publish_date>2001-03-10</publish_date> <description>In post-apocalypse England, the mysterious agent known only as Oberon helps to create a new life for the inhabitants of London. Sequel to Maeve Ascendant.</description> </book> <book id="bk105"> <author>Corets, Eva</author> <title>The Sundered Grail</title> <genre>Fantasy</genre> <price>5.95</price> <publish_date>2001-09-10</publish_date> <description>The two daughters of Maeve, half-sisters, battle one another for control of England. Sequel to Oberon's Legacy.</description> </book> <book id="bk106"> <author>Randall, Cynthia</author> <title>Lover Birds</title> <genre>Romance</genre> <price>4.95</price> <publish_date>2000-09-02</publish_date> <description>When Carla meets Paul at an ornithology conference, tempers fly as feathers get ruffled.</description> </book> <book id="bk107"> <author>Thurman, Paula</author> <title>Splish Splash</title> <genre>Romance</genre> <price>4.95</price> <publish_date>2000-11-02</publish_date> <description>A deep sea diver finds true love twenty thousand leagues beneath the sea.</description> </book> <book id="bk108"> <author>Knorr, Stefan</author> <title>Creepy Crawlies</title> <genre>Horror</genre> <price>4.95</price> <publish_date>2000-12-06</publish_date> <description>An anthology of horror stories about roaches, centipedes, scorpions and other insects.</description> </book> <book id="bk109"> <author>Kress, Peter</author> <title>Paradox Lost</title> <genre>Science Fiction</genre> <price>6.95</price> <publish_date>2000-11-02</publish_date> <description>After an inadvertant trip through a Heisenberg Uncertainty Device, James Salway discovers the problems of being quantum.</description> </book> <book id="bk110"> <author>O'Brien, Tim</author> <title>Microsoft .NET: The Programming Bible</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-09</publish_date> <description>Microsoft's .NET initiative is explored in detail in this deep programmer's reference.</description> </book> <book id="bk111"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> <book id="bk112"> <author>Galos, Mike</author> <title>Visual Studio 7: A Comprehensive Guide</title> <genre>Computer</genre> <price>49.95</price> <publish_date>2001-04-16</publish_date> <description>Microsoft Visual Studio 7 is explored in depth, looking at how Visual Basic, Visual C++, C#, and ASP+ are integrated into a comprehensive development environment. <book id="bk113"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> <book id="bk114"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> <book id="bk115"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> <book id="bk116"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> <book id="bk117"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> <book id="bk118"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> <book id="bk119"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> <book id="bk120"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> <book id="bk121"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> <book id="bk122"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> <book id="bk123"> <author>O'Brien, Tim</author> <title>MSXML3: A Comprehensive Guide</title> <genre>Computer</genre> <price>36.95</price> <publish_date>2000-12-01</publish_date> <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description> </book> </catalog>

(The closing tag is missing for the element with id bk112)

  1. Run Nginx with ModSecurity under Valgrind, e.g.:
valgrind --leak-check=full --trace-children=yes --track-origins=yes -v --log-file=/tmp/valg.log nginx
  1. Send request with the file as a body:
curl -X POST --header "Content-Type: text/xml" --data-binary @books_bad.xml "http://localhost/"
  1. Stop Nginx and observe Valgrind log.

Before the fix, it contains the following info on leaked blocks:

10878 ==18047== 91,783 (752 direct, 91,031 indirect) bytes in 1 blocks are definitely lost in loss record 750 of 754                              
10879 ==18047==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
10880 ==18047==    by 0x5DC397E: xmlNewParserCtxt (in /usr/lib/x86_64-linux-gnu/libxml2.so.2.9.3)
10881 ==18047==    by 0x5DD8CF9: xmlCreatePushParserCtxt (in /usr/lib/x86_64-linux-gnu/libxml2.so.2.9.3)
10882 ==18047==    by 0x63F932: xml_process_chunk (msc_xml.c:87)
10883 ==18047==    by 0x63A772: modsecurity_request_body_store (msc_reqbody.c:355)
10884 ==18047==    by 0x665E72: read_request_body (apache2_io.c:298) 
10885 ==18047==    by 0x62E986: hook_request_late (mod_security2.c:1042)
10886 ==18047==    by 0x627570: modsecProcessRequestBody (api.c:470)
10887 ==18047==    by 0x5ED402: ngx_http_modsecurity_prevention_thread_func (ngx_http_modsecurity.c:690)
10888 ==18047==    by 0x51B55A: ngx_thread_pool_cycle (ngx_thread_pool.c:342)
10889 ==18047==    by 0x50456B9: start_thread (pthread_create.c:333)
10890 ==18047==    by 0x782F41C: clone (clone.S:109)

After the fix, this log disappears.

It is currently possible that the XML parsing context is not properly
cleaned up if a parsed XML document is malformed.

This fix makes sure that the context is taken care of.

Signed-off-by: Vladimir Krivopalov <[email protected]>
@zimmerle zimmerle self-requested a review February 4, 2020 13:59
@zimmerle zimmerle self-assigned this Feb 4, 2020
@zimmerle zimmerle added the 2.x Related to ModSecurity version 2.x label Feb 4, 2020
@zimmerle
Copy link
Contributor

zimmerle commented Feb 4, 2020

Thank you for the contribution @argenet.

@marcstern
Copy link

We're using this in prod from Feb. in dozens of environments

@martinhsv
Copy link
Contributor

martinhsv commented Dec 31, 2021

Hi @argenet,

I don't see any problems with this pull request as it stands.

However, one thing I'm slightly wary of here is that we have two different pointers:

  • msr->xml->parsing_ctx->myDoc
  • msr->xml->doc

... that in principal can and will point to the same memory.

This code frees the memory pointed to by the first pointer without NULLing out the second one.

With the current code I don't see any code path where msr->xml->doc is non-NULL and your new cleanup code gets executed, so no problem I suppose.

I guess I'm just a little worried that with unknown future evolution of the code that this may sow the seed for a problem.

Is it perhaps worth adding a couple of lines of code after your xmlFreeDoc to check if the two pointers are equal, and if so, set msr->xml->doc to NULL?

Thoughts?

@martinhsv martinhsv self-assigned this Jan 22, 2022
@martinhsv
Copy link
Contributor

I tried this today with v2.9.5 + Apache and was not able to reproduce it.

I see that the steps to reproduce in the description mention running nginx. ModSecurity v2 with nginx is definitely not recommended.

Are there any known cases where there is a problem in ModSecurity v2 with either Apache or IIS?

Copy link

@marcstern marcstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running in prod on 60+ servers for many months

@martinhsv
Copy link
Contributor

All right. I was able to verify this with a slightly different use case.

Thanks for the contribution.

@martinhsv martinhsv merged commit fc8e558 into owasp-modsecurity:v2/master Jun 8, 2022
@marcstern
Copy link

There's a problem in that code: in case msr->xml->doc is equal to msr->xml->parsing_ctx->myDoc, the buffer is freed twice.
The correct code (which is running for long in our prod, sorry to have missed that difference) is
if (msr->xml->parsing_ctx->myDoc) {
xmlFreeDoc(msr->xml->parsing_ctx->myDoc);
if (msr->xml->parsing_ctx->myDoc == msr->xml->doc) msr->xml->doc = NULL;
}

@martinhsv
Copy link
Contributor

Hi @marcstern ,

It sounds like you're referencing the issue that I mentioned in my Dec. 31 comment.

Although that aspect is not dealt with in this PR, I did add that in the very next v2 commit: #2760

Or have I misunderstood you?

@marcstern
Copy link

Indeed, it's correct in master. I had a mismatch in my branches I guess. Sorry for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.x Related to ModSecurity version 2.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants