-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python: Modernize the Loop Variable Capture query #19165
Open
joefarebrother
wants to merge
11
commits into
github:main
Choose a base branch
from
joefarebrother:python-qual-loop-var-capture
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
11 commits
Select commit
Hold shift + click to select a range
efdb4a6
Use global dataflow for loop variable capture
joefarebrother 08b4281
Update query message and remove field case
joefarebrother 5b7200a
Use flow path in alerts
joefarebrother 11830bf
Move to separate folder
joefarebrother 2d6476a
Update names and alert message
joefarebrother c37809a
Reduce scope of allowImplicitRead to avoid cartesian product.
joefarebrother adfe89f
Update test output
joefarebrother 9fb1c31
Update tests to inline expectations
joefarebrother b580550
Cleanups
joefarebrother de7e611
Rewrite documentation
joefarebrother e08072d
Fix qhelp formatting
joefarebrother File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file was deleted.
Oops, something went wrong.
This file was deleted.
Oops, something went wrong.
This file was deleted.
Oops, something went wrong.
47 changes: 47 additions & 0 deletions
47
python/ql/src/Variables/LoopVariableCapture/LoopVariableCapture.qhelp
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
<!DOCTYPE qhelp PUBLIC | ||
"-//Semmle//qhelp//EN" | ||
"qhelp.dtd"> | ||
<qhelp> | ||
|
||
<overview> | ||
<p> | ||
In Python, a nested function or lambda expression that captures a variable from its surrounding scope is a <em>late-binding</em> closure, | ||
meaning that the value of the variable is determined when the closure is called, not when it is created. | ||
</p> | ||
<p> | ||
Care must be taken when the captured variable is a loop variable. If the closure is called after the loop ends, it will use the value of the variable on the last iteration of the loop, rather than the value at the iteration at which it was created. | ||
</p> | ||
|
||
</overview> | ||
<recommendation> | ||
<p> | ||
Ensure that closures that capture loop variables aren't used outside of a single iteration of the loop. | ||
To capture the value of a loop variable at the time the closure is created, use a default parameter, or <code>functools.partial</code>. | ||
</p> | ||
|
||
</recommendation> | ||
<example> | ||
<p> | ||
In the following (BAD) example, a `tasks` list is created, but each task captures the loop variable <code>i</code>, and reads the same value when run. | ||
</p> | ||
<sample src="examples/bad.py" /> | ||
<p> | ||
In the following (GOOD) example, each closure has an `i` default parameter, shadowing the outer <code>i</code> variable, the default value of which is determined as the value of the loop variable <code>i</code> at the time the closure is created. | ||
</p> | ||
<sample src="examples/good.py" /> | ||
<p> | ||
In the following (GOOD) example, <code>functools.partial</code> is used to partially evaluate the lambda expression with the value of <code>i</code>. | ||
</p> | ||
<sample src="examples/good2.py" /> | ||
|
||
|
||
|
||
</example> | ||
<references> | ||
<li>The Hitchhiker's Guide to Python: <a href="http://docs.python-guide.org/en/latest/writing/gotchas/#late-binding-closures">Late Binding Closures</a>.</li> | ||
<li>Python Language Reference: <a href="https://docs.python.org/reference/executionmodel.html#naming-and-binding">Naming and binding</a>.</li> | ||
<li>Stack Overflow: <a href="https://stackoverflow.com/questions/3431676/creating-functions-or-lambdas-in-a-loop-or-comprehension">Creating functions (or lambdas) in a loop (or comprehension)</a>.</li> | ||
<li>Python Language Reference: <a href="https://docs.python.org/3/library/functools.html#functools.partial">functools.partial</a>.</li> | ||
|
||
</references> | ||
</qhelp> |
25 changes: 25 additions & 0 deletions
25
python/ql/src/Variables/LoopVariableCapture/LoopVariableCapture.ql
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
/** | ||
* @name Loop variable capture | ||
* @description Capture of a loop variable is not the same as capturing the value of a loop variable, and may be erroneous. | ||
* @kind path-problem | ||
* @tags correctness | ||
* quality | ||
* @problem.severity error | ||
* @sub-severity low | ||
* @precision high | ||
* @id py/loop-variable-capture | ||
*/ | ||
|
||
import python | ||
import LoopVariableCaptureQuery | ||
import EscapingCaptureFlow::PathGraph | ||
|
||
from | ||
CallableExpr capturing, AstNode loop, Variable var, string descr, | ||
EscapingCaptureFlow::PathNode source, EscapingCaptureFlow::PathNode sink | ||
where | ||
escapingCapture(capturing, loop, var, source, sink) and | ||
if capturing instanceof Lambda then descr = "lambda" else descr = "function" | ||
select capturing, source, sink, | ||
"This " + descr + " captures the loop variable $@, and may escape the loop by being stored at $@.", | ||
loop, var.getId(), sink, "this location" | ||
File renamed without changes.
81 changes: 81 additions & 0 deletions
81
python/ql/src/Variables/LoopVariableCapture/LoopVariableCaptureQuery.qll
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,81 @@ | ||
/** Definitions for reasoning about loop variable capture issues. */ | ||
|
||
import python | ||
import semmle.python.dataflow.new.DataFlow | ||
|
||
/** A looping construct. */ | ||
abstract class Loop extends AstNode { | ||
/** | ||
* Gets a loop variable of this loop. | ||
* For example, `x` and `y` in `for x,y in pairs: print(x+y)` | ||
*/ | ||
abstract Variable getALoopVariable(); | ||
} | ||
|
||
/** A `for` loop. */ | ||
private class ForLoop extends Loop, For { | ||
override Variable getALoopVariable() { | ||
this.getTarget() = result.getAnAccess().getParentNode*() and | ||
result.getScope() = this.getScope() | ||
} | ||
} | ||
|
||
/** Holds if the callable `capturing` captures the variable `var` from the loop `loop`. */ | ||
predicate capturesLoopVariable(CallableExpr capturing, Loop loop, Variable var) { | ||
var.getAnAccess().getScope() = capturing.getInnerScope() and | ||
capturing.getParentNode+() = loop and | ||
var = loop.getALoopVariable() | ||
} | ||
|
||
/** Dataflow configuration for reasoning about callables that capture a loop variable and then may escape from the loop. */ | ||
module EscapingCaptureFlowConfig implements DataFlow::ConfigSig { | ||
predicate isSource(DataFlow::Node node) { capturesLoopVariable(node.asExpr(), _, _) } | ||
|
||
predicate isSink(DataFlow::Node node) { | ||
// Stored in a dict/list. | ||
exists(Assign assign, Subscript sub | | ||
sub = assign.getATarget() and node.asExpr() = assign.getValue() | ||
) | ||
or | ||
// Stored in a list. | ||
exists(DataFlow::MethodCallNode mc | mc.calls(_, "append") and node = mc.getArg(0)) | ||
or | ||
// Used in a yield statement, likely included in a collection. | ||
// The element of comprehension expressions desugar to involve a yield statement internally. | ||
exists(Yield y | node.asExpr() = y.getValue()) | ||
// Checks for storing in a field leads to false positives, so are omitted. | ||
} | ||
|
||
predicate isBarrierOut(DataFlow::Node node) { isSink(node) } | ||
|
||
predicate isBarrier(DataFlow::Node node) { | ||
// Incorrect virtual dispatch to __call__ methods is a source of FPs. | ||
exists(Function call | | ||
call.getName() = "__call__" and | ||
call.getArg(0) = node.(DataFlow::ParameterNode).getParameter() | ||
) | ||
} | ||
|
||
predicate allowImplicitRead(DataFlow::Node node, DataFlow::ContentSet cs) { | ||
isSink(node) and | ||
( | ||
cs instanceof DataFlow::TupleElementContent or | ||
cs instanceof DataFlow::ListElementContent or | ||
cs instanceof DataFlow::SetElementContent or | ||
cs instanceof DataFlow::DictionaryElementAnyContent | ||
) | ||
} | ||
} | ||
|
||
/** Dataflow for reasoning about callables that capture a loop variable and then escape from the loop. */ | ||
module EscapingCaptureFlow = DataFlow::Global<EscapingCaptureFlowConfig>; | ||
|
||
/** Holds if `capturing` is a callable that captures the variable `var` of the loop `loop`, and then may escape the loop via a flow path from `source` to `sink`. */ | ||
predicate escapingCapture( | ||
CallableExpr capturing, Loop loop, Variable var, EscapingCaptureFlow::PathNode source, | ||
EscapingCaptureFlow::PathNode sink | ||
) { | ||
capturesLoopVariable(capturing, loop, var) and | ||
capturing = source.getNode().asExpr() and | ||
EscapingCaptureFlow::flowPath(source, sink) | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# BAD: The loop variable `i` is captured. | ||
tasks = [] | ||
for i in range(5): | ||
tasks.append(lambda: print(i)) | ||
|
||
# This will print `4,4,4,4,4`, rather than `0,1,2,3,4` as likely intended. | ||
for t in tasks: | ||
t() |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# GOOD: A default parameter is used, so the variable `i` is not being captured. | ||
tasks = [] | ||
for i in range(5): | ||
tasks.append(lambda i=i: print(i)) | ||
|
||
# This will print `0,1,2,3,4``. | ||
for t in tasks: | ||
t() |
9 changes: 9 additions & 0 deletions
9
python/ql/src/Variables/LoopVariableCapture/examples/good2.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
import functools | ||
# GOOD: A default parameter is used, so the variable `i` is not being captured. | ||
tasks = [] | ||
for i in range(5): | ||
tasks.append(functools.partial(lambda i: print(i), i)) | ||
|
||
# This will print `0,1,2,3,4``. | ||
for t in tasks: | ||
t() |
8 changes: 0 additions & 8 deletions
8
python/ql/test/query-tests/Variables/capture/LoopVariableCapture.expected
This file was deleted.
Oops, something went wrong.
1 change: 0 additions & 1 deletion
1
python/ql/test/query-tests/Variables/capture/LoopVariableCapture.qlref
This file was deleted.
Oops, something went wrong.
Empty file.
19 changes: 19 additions & 0 deletions
19
python/ql/test/query-tests/Variables/capture/LoopVariableCaptureTest.ql
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
import python | ||
import Variables.LoopVariableCapture.LoopVariableCaptureQuery | ||
import utils.test.InlineExpectationsTest | ||
|
||
module MethodArgTest implements TestSig { | ||
string getARelevantTag() { result = "capturedVar" } | ||
|
||
predicate hasActualResult(Location location, string element, string tag, string value) { | ||
exists(CallableExpr capturing, Variable var | | ||
escapingCapture(capturing, _, var, _, _) and | ||
element = capturing.toString() and | ||
location = capturing.getLocation() and | ||
tag = "capturedVar" and | ||
value = var.getId() | ||
) | ||
} | ||
} | ||
|
||
import MakeTest<MethodArgTest> |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Check warning
Code scanning / CodeQL
Alert message style violation Warning