Add possible-forgotten-f-prefix checker #4787
This checks whether the text between `{}`s in a string is a variable. The variable the string is assigned to is also checked for a `format()` call. If no such call exists, the string should probably be an f-string and a message is emitted. This closes pylint-dev#2507
Pull Request Test Coverage Report for Build 1144158204
💛 - Coveralls |
Thank you for working on this checker, much appreciated !
For future contributors stumbling across this PR: we would like to add a check for code like the following:

param = "string"
"This is a string" # good
"This is a {param}" # bad
f"This is a {param}" # good
"This is a calculation: 1 + 1" # good
"This is a calculation: {1 + 1}" # bad
f"This is a calculation: {1 + 1}" # good
"This is a nice string" # good
"This is a {'nice' + param}" # bad
f"This is a {'nice' + param}" # good

The code in this PR is (almost) able to test the first three cases by looking for regex matches in between `{}`.

TLDR: How to evaluate the code in between `{}`?
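For illustration only, here is a minimal sketch of the regex idea described above. It is not the code from this PR: `FIELD_RE` and `brace_fields` are hypothetical names, and the sketch assumes that doubled braces (`{{`, `}}`) are escapes that should be ignored.

```python
import re
from typing import List

# Hypothetical pattern: anything between one "{" and the next "}".
FIELD_RE = re.compile(r"\{([^{}]+)\}")


def brace_fields(text: str) -> List[str]:
    """Return the raw text found between single pairs of braces."""
    # "{{" and "}}" are literal braces in format strings, so drop them first.
    stripped = text.replace("{{", "").replace("}}", "")
    return FIELD_RE.findall(stripped)


print(brace_fields("This is a string"))                # []
print(brace_fields("This is a {param}"))               # ['param']
print(brace_fields("This is a calculation: {1 + 1}"))  # ['1 + 1']
```

This only finds candidate text; deciding whether that text is a variable or an expression is a separate step.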
Sorry to pop out of nowhere, but this PR caught my interest. Have you read the f-string PEP? There are two interesting sections, Code equivalence and Expression evaluation, that show how the f-string is treated internally, what's its representation with I think that expressions should be evaluated as done in the PEP, i.e. with If accepted, this should address every problem related to valid and invalid expressions inside curly braces.
@Crissal1995 This is indeed interesting, thanks for the tip! I have not tested this, so please forgive me if my questions are irrelevant, but:
Excuse me, but I don't understand this question. If you run PyLint, you are supposed to be in your own execution environment, so you know your secrets and have no interest to
This is indeed a good question; however, I think it should work the way PyLint already does. There is no need to execute all the code up to your expression; rather, your variable needs to be in the scope of local or global variables.
When I got a If I try to access an object variable, like |
As I said, I'm not quite sure how CIs and secrets behave, but I am imagining a situation where an automatically run CI prints or sends an organization's secrets as a result of evaluating f-strings. I'm not sure if this is actually a feasible line of attack, but we should be 100% certain of this before merging such behaviour into the tool.
This seems like the right way forward. If I can find the time I might start working on this implementation, but I am more than happy for somebody else to start working on this.
I think we should do this check, as Pylint tries to inform users of errors before they run their code. Relying on errors at runtime is therefore not truly in the spirit of pylint.
I was thinking the same. Let's say a public open source project uses pylint on GitHub and has a release job. In a job in its CI, someone changes a string to
With
In fact, when you
>>> import ast
>>> f1 = lambda x: ast.parse("(" + x + ")", "<fstring>", "eval")
>>> f2 = lambda x: compile(x, "<fstring>", "eval")
>>> eval(f1("1"))
TypeError: eval() arg 1 must be a string, bytes or code object
>>> eval(f2("1"))
1
>>> x = "MY SECRET"
>>> y = "print(x)"
>>> eval(f1(y))
TypeError: eval() arg 1 must be a string, bytes or code object
>>> eval(f2(y))
MY SECRET
But since Python is a dynamically typed language, this turns out to be a difficult task, if not an impossible one. For example, when you
I tried something with ast.parse, but I failed to find a test case where the returned parsed AST is not an expression. I think the way it's parsed raises a syntax error if it isn't one anyway, so we probably don't need to check that it's an ast.Expression. Other than that, checking that a valid Python expression contains a variable from the global or local scope seems complicated in terms of performance, so we can probably skip it. What we probably need to do is check that the string is not used with format later on, because the number of false positives will be huge if we don't.
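To make the `ast.parse` point concrete, here is a small sketch (an illustration under assumptions, not pylint or astroid code). `parse_field` and `names_used` are hypothetical helpers. The text is parsed the way the PEP describes, wrapped in parentheses and in `"eval"` mode, and nothing is ever executed.

```python
import ast


def parse_field(source):
    """Return the parsed expression, or None if `source` is not an expression."""
    try:
        # Same call shape as the PEP's code equivalence; parsing only, no eval().
        return ast.parse("(" + source + ")", "<fstring>", "eval")
    except SyntaxError:
        return None


def names_used(tree):
    """Collect every bare name referenced inside a parsed expression."""
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


print(parse_field("x = 1"))                # None: a statement, not an expression
print(names_used(parse_field("print(x)"))) # names 'print' and 'x'; print is never called
```

Comparing the collected names against the names known in the enclosing scope is the part described above as potentially expensive; the sketch does not attempt it.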
I was also toying around with |
possible-f-string-as-string checker → possible-forgotten-f-prefix checker
I am not sure if we are already doing this, but having checks work for some code and not for other code, without clearly showing that to the user, is not something I am really in favour of. When I run the linter in VS Code I expect all messages I haven't disabled to be checked.
I never did, but this might be an option. |
I think this is a conversation about whether we would rather have false positives or false negatives, and about probabilities. Pylint is known to have annoying false positives, so we're trying to lower their number (almost 75% of issues are about false positives). But no one complains about a false negative :) So as a maintainer I'd rather have fewer false positives. Now, we can consider that creating a string for format is rather rare and that the message can be disabled (like you did in pylint, there are maybe 4 occurrences for 20 ksloc...), so false positives are OK in this case. To go into more detail, I think the probability of creating a format string is higher for a non-local string (it's a string that is going to be reused), and imo most of the false positives will be on such strings. That's why I wanted to exclude strings that are not used locally from the check. Do you have an opinion @cdce8p ?
In general, I agree with @Pierre-Sassoulas here. We should do our best to prevent as many
Regarding the specifics of f-strings, I'm not sure there is a good way to determine if it should be one or not. Thus this will likely lead to more false-positives than helpful warnings. @Pierre-Sassoulas mentioned assignments and I agree: Those should definitely be excluded. It's common to see code like this:

TEMPLATE_STR = "Some string with a placeholder: {}"
print(TEMPLATE_STR.format("Hello World"))

In the end, I don't know if we should move forward with it. In case you would still like to, it might be better to create a new extension for it.
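One way to picture the "is it used with format later on" exclusion is the sketch below. It uses the standard `ast` module rather than astroid, `unformatted_templates` is a hypothetical name, and it only handles the simplest case (a plain name assigned a string constant, with `.format` called directly on that name in the same source).

```python
import ast


def unformatted_templates(source):
    """Map name -> line for brace-containing strings never used with .format()."""
    candidates = {}
    formatted = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Assign)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
            and "{" in node.value.value
        ):
            # Remember every simple name bound to a brace-containing string.
            for target in node.targets:
                if isinstance(target, ast.Name):
                    candidates[target.id] = node.lineno
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "format"
            and isinstance(node.func.value, ast.Name)
        ):
            # NAME.format(...) marks NAME as a deliberate template.
            formatted.add(node.func.value.id)
    return {n: line for n, line in candidates.items() if n not in formatted}


sample = (
    'TEMPLATE_STR = "Some string with a placeholder: {}"\n'
    'print(TEMPLATE_STR.format("Hello World"))\n'
    'greeting = "Hello {name}"\n'
    'print(greeting)\n'
)
print(unformatted_templates(sample))  # {'greeting': 3}
```

Templates that are imported elsewhere and formatted there would still be reported by this sketch, which is the non-local false-positive case discussed above.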
A new extension could also include the |
I think |
While I agree For this PR to move forward the following tasks remain:
|
The performance benefit convinced me, I'm for keeping Thank you for opening the issue in astroid and making a checklist. It seems there is still a lot of work on this one so let's not block 2.11 because of it. |
Ok, this would indeed be an argument to keep it enabled by default. I would nevertheless recommend to move it to the --
The documentation for pyupgrade mentions some good arguments for when it should not be used: https://github.com/asottile/pyupgrade#f-strings

note: pyupgrade is intentionally timid and will not create an f-string if it would make the expression longer or if the substitution parameters are anything but simple names or dotted names (as this can decrease readability).

I wasn't aware that pyupgrade has already implemented
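As a rough picture of what "simple names or dotted names" could mean in practice, here is a sketch. It is an assumption for illustration, not pyupgrade's implementation; `SIMPLE_OR_DOTTED` and `is_simple_or_dotted_name` are hypothetical, and only ASCII identifiers are considered.

```python
import re

# Hypothetical pattern: identifier, optionally followed by ".identifier" parts.
SIMPLE_OR_DOTTED = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def is_simple_or_dotted_name(field):
    """True for 'param' or 'obj.attr'; False for calls, indexing, arithmetic."""
    return SIMPLE_OR_DOTTED.fullmatch(field) is not None


print(is_simple_or_dotted_name("param"))     # True
print(is_simple_or_dotted_name("obj.attr"))  # True
print(is_simple_or_dotted_name("1 + 1"))     # False
```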
Based on discussion in pylint-dev#4787
I'm going to close this PR (sadly). The logic I worked on worked (sort of), but there were just too many false positives and edge cases for me to be comfortable merging it. I think pylint-dev/astroid#1156 would need to be fixed for me to ever start working on this again.
Reason for this PR being a draft
Ideally I would also add a check for the following code:
Can we check that `1 + 1` is a calculation without calling `eval`?
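One possible answer, sketched with the standard `ast` module: parse the text in `"eval"` mode and look at the type of the top-level node. `is_calculation` is a hypothetical helper, and treating "calculation" as "binary operation" is an assumption made only for this example.

```python
import ast


def is_calculation(source):
    """True if `source` parses as a binary operation such as `1 + 1`."""
    try:
        tree = ast.parse(source, mode="eval")  # parses only, never executes
    except SyntaxError:
        return False
    return isinstance(tree.body, ast.BinOp)


print(is_calculation("1 + 1"))           # True
print(is_calculation("'nice' + param"))  # True
print(is_calculation("param"))           # False
```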