-
Notifications
You must be signed in to change notification settings - Fork 496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added a KeywordDetector plugin #76
Conversation
Added _remove_password_secrets_if_line_reported_already Because is often the case that e.g. SUPER_SECRET_VALUE = 'c3VwZXIgbG9uZyBzdHJ' is reported both by the PasswordDetector and another plugin. To minimize diff size, we will simply not report findings from the PasswordDetector if another plugin reports a secret on the same line.
detect_secrets/plugins/password.py
Outdated
@@ -0,0 +1,68 @@ | |||
""" | |||
This code was extracted in part from | |||
https://github.com/PyCQA/bandit. Using similar heuristic logic, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greetings @ericwb and @viraptor, you both seem like the primary maintainers of Bandit.
This code is largely borrowed from plugins/general_hardcoded_password.py and I've tried my best to reference it as such per the APACHE license of Bandit. Can you please give this your blessing, or let me know if it's not to your liking?
Cheers,
Kevin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks 👌 to me. Thank you very much for checking!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
detect_secrets/plugins/password.py
Outdated
) | ||
|
||
|
||
class PasswordDetector(BasePlugin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KeywordDetector
?
detect_secrets/plugins/password.py
Outdated
|
||
|
||
class PasswordDetector(BasePlugin): | ||
"""This checks for private keys by determining whether the blacklisted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops. copy pasta?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed
self.data[filename].update(results) | ||
self.data[filename].update(file_results) | ||
|
||
def _remove_password_secrets_if_line_reported_already( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of removing, why don't we change our baseline schema to display types in list form? This way, we can preserve all the intelligence that we gather on a particular flagged secret.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it would break all the existing baseline's we made (like when we changed the high entropy string types). I'm not sure how hard that would be to implement though, I can look into that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good point. Though, arguably a step in the right direction?
Thankfully, we built things for doing better version bumps, since the last time. We would need to update the merge_baseline function, and the pre-commit hook should get all active clients slowly upgrading.
The only gotchas I see are:
- Server needs to be updated (and handle stale clients through detected baseline versions)
- We need to handle this case in audit.
Perhaps the fix is merely to implement the dedup logic in auditing (as it's the only place that I can find that uses the type
output from the baseline)?
e.g.
def to_list(value):
try:
iter(value)
return value
except ValueError:
return [value]
...
plugin = initialize.from_secret_type(
to_list(secret['type']),
plugin_settings,
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: Instead of removing, why don't we change our baseline schema to display types in list form?
This is gonna lead to some super duper ugly code b/c a lot of the code was built around how a single plugin generates a PotentialSecret
. I think it is okay to write, but I'm just mentioning b/c it'll get pretty ugly. 😁
e.g. we would still want to yield
each individual plugin in _results_accumulator
, and so after changing the type
attribute of a PotentialSecret
to be a list, I would do what I'm doing now, except in addition to removing, I would also add the type to the secret reported by another plugin:
for secret in file_results:
if secret.type == 'Password':
password_secrets.append(secret)
else:
line_numbers_of_other_plugins.add(secret.lineno)
for password_secret in password_secrets:
if password_secret.lineno in line_numbers_of_other_plugins:
del file_results[password_secret]
+ for secret in file_results:
+ if password_secret.lineno == secret.lineno:
+ secret.type.append('Password')
If you can give that your blessing I'm happy to make the types into a list 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you said Perhaps the fix is merely to implement the dedup logic in auditing
, did you mean you don't mind having all the Password
ones separate from the others in the baseline? We might double our baseline sizes but I think I'm leaning towards that a little :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Stating for the record, discussed in person yesterday, that we will just take the baseline size increase, and punt of de-duping in the audit functionality until later.)
…lugins report that same line, change PotentialSecret .type back to a single string
Maybe add a note to the README that this at least attempts to block these now? Otherwise looks good to me. |
Good point! |
…e_list new metabolizer json modify
Our version of https://github.com/PyCQA/bandit/blob/master/bandit/plugins/general_hardcoded_password.py#L24