External predicate recording multiple values #19140

Open
jghebre opened this issue Mar 28, 2025 · 3 comments
Labels
awaiting-response The CodeQL team is awaiting further input or clarification from the original reporter of this issue. question Further information is requested

Comments

@jghebre

jghebre commented Mar 28, 2025

Hi,

I'm using an external predicate to read in CSV data and apply it in the overridden step predicate of DataFlow::SharedFlowStep. This is nice because it lets me change the entire data flow easily, which is what I want.

This works, but for certain queries I get this error:

Oops! A fatal internal error occurred. Details:
java.lang.IllegalStateException: Tried to record two values for the extensional patchInfo. Old value: {[0, 29, foo, example.js, 5, example.js, false]}. New value: {[0, 29, foo, example.js, 5, example.js, false]}
        at com.semmle.api.backend.ExtensionalValuesBase.put(ExtensionalValuesBase.java:82)
        at com.semmle.cli2.execute.ExecuteQueriesCommand.loadExtensionals(ExecuteQueriesCommand.java:395)
        at com.semmle.cli2.execute.ExecuteQueriesCommand$ExecutionIterator.startImpl(ExecuteQueriesCommand.java:235)
        at com.semmle.cli2.execute.ExecuteQueriesCommand$ExecutionIterator.start(ExecuteQueriesCommand.java:192)
        at com.semmle.cli2.execute.ExecuteQueriesCommand$ExecutionIterator.lambda$next$0(ExecuteQueriesCommand.java:187)
        at com.semmle.util.concurrent.FutureUtils.supplyCompose(FutureUtils.java:248)
        at com.semmle.util.concurrent.Paralleliser.lambda$startMoreJobs$3(Paralleliser.java:109)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)

I'm fairly sure this occurs because another file also uses DataFlow::SharedFlowStep::step, so the external predicate is hit more than once, which causes the error.

I'm not sure if there is a way around this, perhaps by:

  • only loading the external predicate once
  • forcing my class that overrides DataFlow::SharedFlowStep::step to run last

or some other fix.

Thanks!

@jghebre jghebre added the question Further information is requested label Mar 28, 2025
@mbg
Member

mbg commented Mar 28, 2025

Hi @jghebre 👋🏻

Do you have a minimal example for where this happens? (I.e. combination of CSV data and queries.) Also, can you confirm that you are using the most recent version of CodeQL? That would help us troubleshoot this.

@mbg mbg added the awaiting-response The CodeQL team is awaiting further input or clarification from the original reporter of this issue. label Mar 28, 2025
@jghebre
Author

jghebre commented Mar 28, 2025

Hi @mbg ,

Sure, I cut my libraries down to a minimal example. I'm using the latest CodeQL, v2.20.7:

Library for holding the CSV data
PatchData.qll:

/**
 * Module to store patch data.
 */
module PatchData {
  // Some tabular data
  external predicate patchInfo(int id, int info);
}

Library for using the CSV data in data flow
PatchDataflow.qll:

import javascript
private import queries.lib.PatchData

// Use the tabular data to patch data flow
class Example extends DataFlow::SharedFlowStep {
  override predicate step(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
    exists(int id | PatchData::patchInfo(id, _))
  }
}

CWE-020 query with the PatchDataflow library added
UntrustedDataToExternalAPI.ql:

/**
 * @name Untrusted data passed to external API
 * @description Data provided remotely is used in this external API without sanitization, which could be a security risk.
 * @id js/untrusted-data-to-external-api
 * @kind path-problem
 * @precision low
 * @problem.severity error
 * @security-severity 7.8
 * @tags security external/cwe/cwe-20
 */

// Add the library to patch dataflow
import queries.lib.PatchDataflow

import javascript
import semmle.javascript.security.dataflow.ExternalAPIUsedWithUntrustedDataQuery
import ExternalAPIUsedWithUntrustedDataFlow::PathGraph

from
  ExternalAPIUsedWithUntrustedDataFlow::PathNode source,
  ExternalAPIUsedWithUntrustedDataFlow::PathNode sink
where ExternalAPIUsedWithUntrustedDataFlow::flowPath(source, sink)
select sink, source, sink,
  "Call to " + sink.getNode().(Sink).getApiName() + " with untrusted data from $@.", source,
  source.toString()

Example CSV
test.csv:

0,29
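
To rule out the data file itself as the source of the repeated value, a quick check like the one below can confirm that every row has the two integer columns that patchInfo(int id, int info) expects and that no row appears twice. This is only a sketch: check_patch_rows is a hypothetical helper name, not part of CodeQL, and whether duplicate CSV rows could trigger this error at all is an assumption I have not verified.

```python
import csv
import io


def check_patch_rows(text):
    """Parse CSV text for patchInfo(int id, int info).

    Returns (rows, problems): the parsed integer pairs and a list of
    human-readable problems (wrong column count, non-integers, duplicates).
    """
    rows, problems, seen = [], [], set()
    for lineno, fields in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            problems.append(f"line {lineno}: expected 2 columns, got {len(fields)}")
            continue
        try:
            row = (int(fields[0]), int(fields[1]))
        except ValueError:
            problems.append(f"line {lineno}: non-integer value in {fields}")
            continue
        if row in seen:
            problems.append(f"line {lineno}: duplicate row {row}")
        seen.add(row)
        rows.append(row)
    return rows, problems


# The contents of test.csv from this example
rows, problems = check_patch_rows("0,29\n")
print(rows, problems)
```

For the single-row file above this reports no problems, so the CSV content alone does not appear to explain the repeated value.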

Command used to run the query

codeql database analyze --format=sarif-latest --external=patchInfo=test.csv --output test.sarif test-db UntrustedDataToExternalAPI.ql --threads=0 --rerun

error message:

Oops! A fatal internal error occurred. Details:
java.lang.IllegalStateException: Tried to record two values for the extensional patchInfo. Old value: {[0, 29]}. New value: {[0, 29]}
        at com.semmle.api.backend.ExtensionalValuesBase.put(ExtensionalValuesBase.java:82)
        at com.semmle.cli2.execute.ExecuteQueriesCommand.loadExtensionals(ExecuteQueriesCommand.java:395)
        at com.semmle.cli2.execute.ExecuteQueriesCommand$ExecutionIterator.startImpl(ExecuteQueriesCommand.java:235)
        at com.semmle.cli2.execute.ExecuteQueriesCommand$ExecutionIterator.start(ExecuteQueriesCommand.java:192)
        at com.semmle.cli2.execute.ExecuteQueriesCommand$ExecutionIterator.lambda$next$0(ExecuteQueriesCommand.java:187)
        at com.semmle.util.concurrent.FutureUtils.supplyCompose(FutureUtils.java:248)
        at com.semmle.util.concurrent.Paralleliser.lambda$startMoreJobs$3(Paralleliser.java:109)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)
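
One detail worth noting in the message above: the old and new values are identical, which would be consistent with the same CSV being loaded twice rather than two conflicting tables being supplied. A small sketch for checking this across log output follows. The regular expression is based only on the message format shown in this issue, and compare_recorded_values is a hypothetical helper name.

```python
import re

# Pattern derived from the error text in this issue; other CodeQL versions
# may word the message differently (assumption).
PATTERN = re.compile(
    r"Tried to record two values for the extensional (?P<name>\w+)\. "
    r"Old value: \{(?P<old>.*?)\}\. New value: \{(?P<new>.*?)\}"
)


def compare_recorded_values(message):
    """Return (extensional_name, values_are_identical), or None if no match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return match.group("name"), match.group("old") == match.group("new")


message = (
    "java.lang.IllegalStateException: Tried to record two values for the "
    "extensional patchInfo. Old value: {[0, 29]}. New value: {[0, 29]}"
)
print(compare_recorded_values(message))
```

If the second element is True, both loads carried the same rows, which points at the predicate being loaded more than once rather than at the data.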

@mbg
Member

mbg commented Mar 31, 2025

Thanks for that, @jghebre! That's really useful. I can confirm that I get the same error with your minimal example. I'll pass this on to the relevant engineering team.
