Add certain types of indirect function calls to the C++ call graph #7520
Hi @gsingh93, Thanks for the issue. This is indeed something we would like to improve. For this specific case, your current solution is probably the best (or at least the easiest one). I don't think you'll get the behavior you want from The predicate we're currently working on improving is the one you get from importing
import cpp
import semmle.code.cpp.ir.dataflow.ResolveCall

query predicate edges(Function a, Function b) {
  // resolve any calls inside the `a` function.
  resolveCall(any(Call call | call.getEnclosingFunction() = a)) = b
}

predicate getCallGraph(Function start, Function end, string startName, string endName) {
  edges+(start, end) and
  start.hasName(startName) and
  end.hasName(endName)
}

from Function start, Function end
where getCallGraph(start, end, "entry_func", end.getName())
select start, end
There are still some things missing from this library, however. For your example it doesn't give anything useful, since it handles global variable initialization poorly (read: not at all). However, if you move the two function tables into
void foo() {
  uintptr_t func_table[] = {(uintptr_t)&func1, (uintptr_t)&func2};
  for (int i = 0; i < sizeof(func_table) / sizeof(func_table[0]); i++) {
    ((void (*)())func_table[i])();
  }
  table_entry_t nested_func_table[] = {{0, (uintptr_t)&func3},
                                       {1, (uintptr_t)&func4}};
  for (int i = 0; i < sizeof(nested_func_table) / sizeof(nested_func_table[0]);
       i++) {
    ((void (*)())nested_func_table[i].func)();
  }
}
you get this:
i.e., we now resolve the calls in the first loop. The calls in the second loop aren't resolved, as they involve field reads and writes, but you can hack around that by using a dataflow configuration like this:
import cpp
import semmle.code.cpp.ir.dataflow.DataFlow

// A configuration for finding function accesses flowing into function-pointer calls
class Conf extends DataFlow::Configuration {
  Conf() { this = "Conf" }

  override predicate isSource(DataFlow::Node source) { source.asExpr() instanceof FunctionAccess }

  override predicate isSink(DataFlow::Node sink) { sink.asExpr() = any(ExprCall call).getExpr() }
}

query predicate edges(Function a, Function b) {
  exists(FunctionAccess funcAccess, DataFlow::Node sink |
    // Flow from a function access to some sink (which is the expression of some `ExprCall`).
    any(Conf conf).hasFlow(DataFlow::exprNode(funcAccess), sink) and
    // And the call happens inside function `a`
    sink.getEnclosingCallable() = a and
    // And the function pointer is a pointer to function `b`
    b = funcAccess.getTarget()
  )
}

predicate getCallGraph(Function start, Function end, string startName, string endName) {
  edges+(start, end) and
  start.hasName(startName) and
  end.hasName(endName)
}

from Function start, Function end
where getCallGraph(start, end, "entry_func", end.getName())
select start, end
Again, because we're still improving the support for global variables on anything that starts by importing
I hope that helps!
I'm dealing with a codebase that makes a lot of calls to function pointers stored in global/static arrays, which results in the call graph (Function.calls) not being very helpful. Here's a small example:
When I ask CodeQL to get the functions that foo calls, I'd like it to return func1, func2, func3, and func4. I've solved this issue for now by implementing a class modeling function tables defined using array aggregate literals, which I'll show below, but I'm mainly opening this issue to ask whether CodeQL can add better support for these indirect pointers in the built-in call graph.
In the example below, I've defined FunctionTableArrayAggregateLiteral to model function tables, and then I add a custom edges predicate which checks not only for direct function calls with a.calls(b), but also for indirect calls through a function table (technically I don't actually verify whether the function pointer is called, but this hasn't been a problem for me so far):
Of course this can lead to false positives in certain cases (just because one function in a function table is used doesn't mean all of them are), so I don't expect this exact type of solution to be used, but maybe something like it can be considered.
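To make the false-positive concern concrete, here is a minimal C++ sketch. It is not the original example from this issue. It assumes global versions of the two tables that appear in the reply above; the layout of table_entry_t (an id plus a func field), the function bodies, the call_log helper, and call_first are all hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: records which functions actually ran, in order.
static std::string call_log;

void func1() { call_log += "1"; }
void func2() { call_log += "2"; }
void func3() { call_log += "3"; }
void func4() { call_log += "4"; }

// Assumed layout: only the `func` field and a two-element initializer
// are visible in the thread; the `id` field name is made up.
struct table_entry_t {
  int id;
  uintptr_t func;
};

// Global function tables storing function addresses as integers.
static uintptr_t func_table[] = {(uintptr_t)&func1, (uintptr_t)&func2};
static table_entry_t nested_func_table[] = {{0, (uintptr_t)&func3},
                                            {1, (uintptr_t)&func4}};

// Walks both tables, so at run time it reaches func1..func4.
void foo() {
  for (std::size_t i = 0; i < sizeof(func_table) / sizeof(func_table[0]); i++) {
    ((void (*)())func_table[i])();
  }
  for (std::size_t i = 0;
       i < sizeof(nested_func_table) / sizeof(nested_func_table[0]); i++) {
    ((void (*)())nested_func_table[i].func)();
  }
}

// Hypothetical caller that only ever invokes entry 0. A model that treats
// any use of func_table as a call to every entry would also report an
// edge from call_first to func2, which never happens at run time.
void call_first() {
  ((void (*)())func_table[0])();
}
```

Calling foo() runs all four functions, while call_first() runs only func1. The gap between what call_first really calls and what a table-wide approximation would report is the kind of false positive mentioned above.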