Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing taint flow #19153

Open
Cheap-Cheer opened this issue Mar 30, 2025 · 4 comments
Open

Missing taint flow #19153

Cheap-Cheer opened this issue Mar 30, 2025 · 4 comments
Labels
Python question Further information is requested

Comments

@Cheap-Cheer
Copy link

Description of the false positive

Code samples or links to source code

URL to the alert on GitHub code scanning (optional)

I use the following query analysis code:


import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.dataflow.new.RemoteFlowSources
import semmle.python.Concepts
import semmle.python.ApiGraphs
module BackwardDataFlowConfiguration implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    source instanceof DataFlow::ExprNode
    and source.asExpr() instanceof Name
    and source.getLocation().getFile().getRelativePath() = "lollms/server/endpoints/lollms_personalities_infos.py"
    and exists(Name n | 
      n = source.asExpr() and n.getId() = "category")

    and source.getLocation().getStartLine() = 330

    }

  predicate isSink(DataFlow::Node sink) {
    sink instanceof DataFlow::Node
  }
}

module BackwardDataFlow = TaintTracking::Global<BackwardDataFlowConfiguration>;


from DataFlow::Node begin, DataFlow::Node end
where BackwardDataFlow::flow(begin, end)
select 
begin, 
"$@,$@"
,
begin.getLocation(),
"begin location"
,
end.getLocation(),
"end location"

this is the results:
Image
Starting from the category variable I specified, I tried to find all the nodes that it could potentially flow to. However, I only ended up with six results, which are marked in the figure above. Apparently, the package_full_path in the row where the sixth point is located is also a node that category could flow to, but the results don't reflect this. Why is that? Thank you for your answer!

@jketema jketema added question Further information is requested Python and removed false-positive labels Mar 31, 2025
@jketema jketema changed the title Can not Missing taint flow Mar 31, 2025
@jketema
Copy link
Contributor

jketema commented Mar 31, 2025

Hi @Cheap-Cheer

Thanks for your report. I've asked to CodeQL Python to take a look.

@jketema
Copy link
Contributor

jketema commented Mar 31, 2025

Hi @Cheap-Cheer

We only consider / to be a taint step from the right-hand side to the expression when the left-hand side can be determined through type tracking to be a Path.

The team has investigated this specific issue before for the LoLLMs repository. They concluded that in this specific case it is fairly complex to determine that personalities_zoo_path is in fact a Path, though might theoretically be possible. There are also similar simpler instances that a simpler tweak may cover (when a type annotation says that a function argument is a Path), but this likely wouldn't cover the specific case here.

@Cheap-Cheer
Copy link
Author

@jketema Thank you for your thoughtful answer! I really appreciate it!

@jketema
Copy link
Contributor

jketema commented Mar 31, 2025

You're welcome. To work around the issue, your best option would be to define an isAdditionalFlowStep predicate as part of BackwardDataFlowConfiguration that handles specific cases like these, or that just always propagates the right-hand side of a /. Note that the latter might give you many false positives in other places.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Python question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants