Start tracking malloc #18771

Closed
ArtiomKr opened this issue Feb 13, 2025 · 10 comments
Labels
question Further information is requested

Comments

@ArtiomKr

ArtiomKr commented Feb 13, 2025

How do I start tracking a variable that has been assigned a malloc value?

#include <stdlib.h>
#include <time.h>

int main() {
  char *a = (char *)malloc(sizeof(char)); // Memory allocation
  char *b = a;
  if (a != NULL) {
    free(a); // Free allocated memory
  }
  *b = 'b'; // Use after free
  return 0;
}

In this code I can't track all malloc results. For example, isSource doesn't find char *a = (char *)malloc(sizeof(char)). I want to find all re-assignments of the pointer and then track the use-after-free vulnerability.

/**
 * @name Use after free
 * @kind path-problem
 * @id cpp/use-after-free
 */
import cpp
import semmle.code.cpp.dataflow.DataFlow
import semmle.code.cpp.dataflow.TaintTracking
import Configs::PathGraph

module Config implements DataFlow::ConfigSig {

  predicate isSource(DataFlow::Node arg) {
    exists(Assignment stmt, FunctionCall call |
      arg.asExpr() = stmt.getLValue() and
      call.getTarget().hasGlobalOrStdName("malloc") and
      stmt.getRValue() = call
    )
  }

  predicate isSink(DataFlow::Node sink) {
    dereferenced(sink.asExpr())
  }
}

module Configs = TaintTracking::Global<Config>;

from Configs::PathNode source, Configs::PathNode sink
where Configs::hasFlowPath(source, sink)
select sink, source, sink,
  "Memory is freed here and used here, causing a potential vulnerability.",
  source, "freed here", sink, "used here"

ArtiomKr added the question label on Feb 13, 2025
@jketema
Contributor

jketema commented Feb 13, 2025

Hi @ArtiomKr,

You probably just want to write the source as:

  predicate isSource(DataFlow::Node arg) {
    exists(FunctionCall call |
        arg.asExpr() = call and
        call.getTarget().hasGlobalOrStdName("malloc")
    )
  }

That also handles cases where the return value of malloc is not directly assigned to a variable.
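
The original isSource misses the allocation because char *a = (char *)malloc(...) is a declaration with an initializer, not an Assignment expression. A minimal sketch for listing the variables that directly receive a malloc result, assuming Variable.getAnAssignedValue() from the standard cpp library (it is expected to cover initializers as well as plain assignments, with the explicit cast modeled as a conversion on the call):

```ql
import cpp

from Variable v, FunctionCall call
where
  call.getTarget().hasGlobalOrStdName("malloc") and
  // Assumption: getAnAssignedValue() yields the unconverted initializer or
  // assigned expression, so the (char *) cast does not hide the call.
  v.getAnAssignedValue() = call
select v, "Receives the result of $@.", call, "this malloc call"
```

This is only a way to inspect the database; for the data flow source itself, matching the call expression as shown above is simpler.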

@ArtiomKr
Author

ArtiomKr commented Feb 13, 2025

Hi @jketema
So, it turns out that I won't be able to track the process of assigning a pointer to a new variable? For example, what I want is:

#include <stdlib.h>
#include <time.h>

int main() {
  char *a = (char *)malloc(sizeof(char)); // Memory allocation: start tracking here
  char *b = a; // Alias: start tracking *b as well
  if (a != NULL) {
    free(a); // Free allocated memory: the free happens here
  }
  *b = 'b'; // Use after free: this is the sink
  return 0;
}

I am sorry for asking such questions, but it is really difficult to find this kind of CWE. That is, starting from malloc, I want to search for a possible use-after-free vulnerability, but I don't know exactly how to do it. Would it work if I made both malloc and free sources?

@ArtiomKr
Author

ArtiomKr commented Feb 13, 2025

import cpp
import semmle.code.cpp.dataflow.DataFlow
import semmle.code.cpp.dataflow.TaintTracking
import Configs::PathGraph

module Config implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node arg) {
    exists(FunctionCall call |
        arg.asExpr() = call and
        call.getTarget().hasGlobalOrStdName("malloc")
    )
  }

  predicate isSink(DataFlow::Node sink) {
    dereferenced(sink.asExpr())
  }

  predicate isFreed(DataFlow::Node arg) {
    exists(FunctionCall freeCall |
        arg.asExpr() = freeCall and
        freeCall.getTarget().hasGlobalOrStdName("free")
    )
  }
}
module Configs = TaintTracking::Global<Config>;
from Configs::PathNode source, Configs::PathNode sink
where Configs::hasFlowPath(source, sink) and
      Configs::hasFlowPath(isFreed(source), sink)
      
select sink, source, sink,
  "Memory is freed here and used here, causing a potential vulnerability.",
  source, "freed here", sink, "used here"

I tried this code, but it didn't work: Could not resolve predicate isFreed/1

@jketema
Contributor

jketema commented Feb 13, 2025

So, it turns out that I won't be able to track the process of assigning a pointer to a new variable?

With the dataflow library you're using you cannot track both the variable you're assigning to and the variable that got assigned. The library is a def-use dataflow library, so that's impossible. You're likely going to need at least 3 dataflow configurations:

  • one that tracks malloc to the point of the alias assignment
  • one that tracks the variable assigned to, to some point where it is used
  • one that tracks malloc to a free

On top of that you will need reasoning that shows that all the sinks you find are related, and likely a substantial amount of control flow reasoning that shows that the use actually occurs after the free.
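
As an illustration of the last item in the list, here is a minimal sketch of a single configuration that tracks a malloc result to the argument of free. The module names and the query id are hypothetical, and the sketch assumes the newer semmle.code.cpp.dataflow.new.DataFlow library rather than the one imported earlier in this thread. With the module-based API, the path predicate is flowPath:

```ql
/**
 * @name Malloc result reaches free (sketch)
 * @kind path-problem
 * @id cpp/example/malloc-to-free
 */

import cpp
import semmle.code.cpp.dataflow.new.DataFlow
import MallocToFreeFlow::PathGraph

// Hypothetical configuration name.
module MallocToFreeConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    source.asExpr().(FunctionCall).getTarget().hasGlobalOrStdName("malloc")
  }

  predicate isSink(DataFlow::Node sink) {
    exists(FunctionCall freeCall |
      freeCall.getTarget().hasGlobalOrStdName("free") and
      sink.asExpr() = freeCall.getArgument(0)
    )
  }
}

module MallocToFreeFlow = DataFlow::Global<MallocToFreeConfig>;

from MallocToFreeFlow::PathNode source, MallocToFreeFlow::PathNode sink
where MallocToFreeFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "Memory allocated by $@ is freed here.",
  source.getNode(), "this malloc call"
```

On its own this only reports where allocations are freed; it says nothing yet about later uses.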

@ArtiomKr
Author

ArtiomKr commented Feb 13, 2025

import cpp
import semmle.code.cpp.dataflow.DataFlow
import semmle.code.cpp.dataflow.TaintTracking
import Config1s::PathGraph

module Config1 implements DataFlow::ConfigSig {

  predicate isSource(DataFlow::Node arg) {
    exists(FunctionCall call |
        arg.asExpr() = call and
        call.getTarget().hasGlobalOrStdName("malloc")
    )
  }

  predicate isSink(DataFlow::Node sink) {
    dereferenced(sink.asExpr())
  }
}

import Config2s::PathGraph
 
module Config2 implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node arg) {
    exists(FunctionCall call |
      arg.asDefiningArgument() = call.getArgument(0) and
      call.getTarget().hasGlobalOrStdName("free")
    )
  }

  predicate isSink(DataFlow::Node sink) {
    exists(PointerDereferenceExpr star |
      star.getOperand() = sink.asExpr()
    )
    or
    exists(FormattingFunctionCall call |
      call.getArgument(0) = sink.asExpr()
    )
  }
}

module Config2s = TaintTracking::Global<Config2>;
module Config1s = TaintTracking::Global<Config1>;

from Config1s::PathNode source, Config1s::PathNode sink
where Config1s::hasFlowPath(source, sink)
select sink, source, sink,
  "Memory is freed here and used here, causing a potential vulnerability.",
  source, "freed here", sink, "used here"

It turns out that I need two or three dataflow configurations and have to link the logic of the separate analyses together. The problem is that these analyses have completely different sources and sinks. How should they interact?

@jketema
Contributor

jketema commented Feb 13, 2025

I would dig out the underlying expression from the nodes, e.g., source.getNode().asExpr(). Those you can compare.
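
A rough sketch of that idea, under several assumptions: it uses the newer semmle.code.cpp.dataflow.new.DataFlow library, the configuration names and the isMallocCall helper are hypothetical, and the ordering check is a coarse intraprocedural control-flow reachability test rather than a precise use-after-free analysis. Two independent flows are related by comparing the expressions behind their source nodes:

```ql
import cpp
import semmle.code.cpp.dataflow.new.DataFlow

/** Holds if `e` is a call to `malloc` (hypothetical helper). */
predicate isMallocCall(Expr e) {
  e.(FunctionCall).getTarget().hasGlobalOrStdName("malloc")
}

// Hypothetical configuration: malloc result reaching the argument of free.
module MallocToFreeConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) { isMallocCall(source.asExpr()) }

  predicate isSink(DataFlow::Node sink) {
    exists(FunctionCall freeCall |
      freeCall.getTarget().hasGlobalOrStdName("free") and
      sink.asExpr() = freeCall.getArgument(0)
    )
  }
}

// Hypothetical configuration: malloc result reaching the operand of `*p`.
module MallocToDerefConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) { isMallocCall(source.asExpr()) }

  predicate isSink(DataFlow::Node sink) {
    exists(PointerDereferenceExpr star | star.getOperand() = sink.asExpr())
  }
}

module MallocToFreeFlow = DataFlow::Global<MallocToFreeConfig>;

module MallocToDerefFlow = DataFlow::Global<MallocToDerefConfig>;

from
  DataFlow::Node freeSource, DataFlow::Node freed, DataFlow::Node derefSource,
  DataFlow::Node used, FunctionCall freeCall
where
  MallocToFreeFlow::flow(freeSource, freed) and
  MallocToDerefFlow::flow(derefSource, used) and
  // Relate the two analyses by comparing the underlying source expressions.
  freeSource.asExpr() = derefSource.asExpr() and
  freeCall.getArgument(0) = freed.asExpr() and
  // Coarse ordering check: the use is reachable from the free in the CFG.
  freeCall.getASuccessor+() = used.asExpr()
select used.asExpr(), "Memory from $@ may be used after $@.", freeSource.asExpr(),
  "this allocation", freeCall, "this call to free"
```

The reachability test over-approximates: it ignores reassignments between the free and the use, and it does not order a free and a use that sit in different functions, so expect false positives and missed cases.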

@ArtiomKr
Author

So now I don't understand how it should work...

@jketema
Contributor

jketema commented Feb 13, 2025

At this point I would strongly suggest you start with writing some simpler queries, maybe study some queries that use multiple configurations, and have a look at our use-after-free query: https://github.com/github/codeql/blob/main/cpp/ql/src/Critical/UseAfterFree.ql.

@ArtiomKr
Author

@jketema
Well, is there no example that finds this kind of CWE?

#include <stdlib.h>
#include <time.h>

int main() {
  char *a = (char *)malloc(sizeof(char)); // Memory allocation
  char *b = a;
  if (a != NULL) {
    free(a); // Free allocated memory
  }
  *b = 'b'; // Use after free
  return 0;
}

@ArtiomKr
Author

Thanks, I figured it out.
