codeql js taint tracking - write "recursive" additional taint step #19098

DSimsek000 · 2025-03-24T10:41:41Z

I have the following code:

function source() {
    fn0();
    fn1();
}

function fn0() {
    const a = process.env.SECRET;
    sink(a);
}

function fn1() {
    fn0();
}

In this example everything coming from the env is tainted. However, I do not want to find a taint path that starts at fn0 (where we have the env propread), but from source.

I came up with the following config that uses isAdditionalFlowStep to check whether the current function has a call to a function where an env propread is made:

module Config implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    exists(Function f |
      f.getName() = "source" and
      (
        source.asExpr() = f.getAParameter()
        or
        source.(DataFlow::FunctionNode).getFunction() = f
      )
    )
  }

  predicate isSink(DataFlow::Node node) {
    exists(DataFlow::CallNode cn |
      cn.getAnArgument() = node and
      cn.getCalleeName() = "sink"
    )
  }

  predicate isAdditionalFlowStep(DataFlow::Node fromNode, DataFlow::Node toNode) {
    exists(DataFlow::FunctionNode fnEnv, DataFlow::PropRead propRead |
      propRead = toNode and
      propRead.getPropertyName() = "SECRET" and
      propRead.asExpr().getEnclosingFunction() = fnEnv.getFunction() and
      fromNode.(DataFlow::InvokeNode).getACallee() = fnEnv.getFunction()
    )
    or
    toNode.(DataFlow::InvokeNode).getEnclosingFunction() =
      fromNode.(DataFlow::FunctionNode).getFunction()
  }
}

This finds the taint path source -> fn0, but not source->fn1->fn0.
I am not sure how to restructure the predicate to introduce these taint steps in a recursive manner.

The text was updated successfully, but these errors were encountered:

mbg · 2025-03-24T11:50:32Z

Hi @DSimsek000 👋🏻

It sounds like you want to identify functions as sources which (transitively) call a function that reads an environment variable?

I'd suggest starting with a predicate to identify functions which access environment variables (based on what you have in your isAdditionalFlowStep):

predicate readsFromEnvironment(Function f, PropAccess p) {
  p.getPropertyName() = "SECRET" and
  p.getEnclosingFunction() = f
}

You can write a predicate to check if one function calls another:

Function getACaller(Function callee) {
  exists(DataFlow::InvokeNode node |
    node.getACallee() = callee and
    node.getEnclosingFunction() = result
  )
}

And then find functions which call such a function:

from Function caller, Function callee
where
  readsFromEnvironment(callee, _) and
  caller = getACaller+(callee)
select caller, callee

This gets you two results (source -> fn0 and fn1 -> fn0). Note that source -> fn0 covers both the path directly to fn0 as well as via fn1.

If you further want to see every path from a function that (transitively) calls another that reads from environment variables, then you could look at CallGraph.qll for inspiration.

DSimsek000 · 2025-03-24T14:16:54Z

Hey @mbg, thanks for the suggestion! That definitely helps, and I also just noticed the bug in my isAdditionalFlowStep predicate that I shared above. I forgot to add the taint propagator from CallNode to FunctionNode:

toNode.(DataFlow::FunctionNode).getFunction() = fromNode.(DataFlow::InvokeNode).getACallee()

I think thats why source -> fn1 -> fn0 was not shown.

Here is the fixed version that "preserves" this information about the call graph in a path query. I am not sure if this is the best way to solve the problem, but a quick test showed that it seems to work in the example above.

predicate isAdditionalFlowStep(DataFlow::Node fromNode, DataFlow::Node toNode) {
     (
      exists(DataFlow::FunctionNode fnEnv, DataFlow::PropRead propRead |
        propRead = toNode and
        propRead.getPropertyName() = "SECRET" and
        propRead.asExpr().getEnclosingFunction() = fnEnv.getFunction() and
        fromNode.(DataFlow::InvokeNode).getACallee() = fnEnv.getFunction()
      )
      or
      toNode.(DataFlow::InvokeNode).getEnclosingFunction() =
        fromNode.(DataFlow::FunctionNode).getFunction()
      or
      toNode.(DataFlow::FunctionNode).getFunction() = fromNode.(DataFlow::InvokeNode).getACallee()
    )
  }

DSimsek000 added the question Further information is requested label Mar 24, 2025

mbg self-assigned this Mar 24, 2025

mbg added the javascript Pull requests that update Javascript code label Mar 24, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

codeql js taint tracking - write "recursive" additional taint step #19098

codeql js taint tracking - write "recursive" additional taint step #19098

DSimsek000 commented Mar 24, 2025

mbg commented Mar 24, 2025

DSimsek000 commented Mar 24, 2025

codeql js taint tracking - write "recursive" additional taint step #19098

codeql js taint tracking - write "recursive" additional taint step #19098

Comments

DSimsek000 commented Mar 24, 2025

mbg commented Mar 24, 2025

DSimsek000 commented Mar 24, 2025