Skip to content

Painless functions return wrong output #89602

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Gkleinereva opened this issue Aug 24, 2022 · 8 comments
Closed

Painless functions return wrong output #89602

Gkleinereva opened this issue Aug 24, 2022 · 8 comments
Labels
>bug :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache Team:Core/Infra Meta label for core/infra team

Comments

@Gkleinereva
Copy link

Elasticsearch Version

8.3.1

Installed Plugins

None

Java Version

18.0.1.1

OS Version

Linux ip-172-31-1-195 5.13.0-1017-aws #19~20.04.1-Ubuntu SMP Mon Mar 7 12:53:12 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Problem Description

A simple wrapper for decayNumericGauss() causes the function to stop returning correct values when called.

I originally discovered this issue while writing a script_score query and submitting it via the API or in Kibana dev tools (the issue appears in both places). The reproduction steps below will use the "Painless Lab" editor for simplicity.

Steps to Reproduce

Open the "Painless Lab" editor

Copy in this code snippet

double getScore(double targetVolume, double tolerance, double offset, double decay, double volume) {
  return decayNumericGauss(targetVolume, tolerance, offset, decay, volume);
}

ArrayList messages = new ArrayList();

messages.add(decayNumericGauss(147.8675, 51.753625, 0, 0.5, 150.0));
messages.add(decayNumericGauss(473.176, 165.61159999999998, 0, 0.5, 150.0));
messages.add('\\n');
messages.add(getScore(147.8675, 51.753625, 0, 0.5, 150.0));
messages.add(getScore(473.176, 165.61159999999998, 0, 0.5, 150.0));

String output = '';
int i = 0;
while(i < messages.length) {
  output += messages[i] + '\\n';
  i++;
}
return output;

View results

Notice that the output is different from the direct calls to decayNumericGauss() than it is from calls to the wrapper. It looks like the value is being "cached" somehow. I know that the "Painless Lab" is still in beta, but this seemed to warrant opening a github issue since 1) Script_score queries are also affected and 2) Painless claims to support functions.

Logs (if relevant)

No response

@Gkleinereva Gkleinereva added >bug needs:triage Requires assignment of a team area label labels Aug 24, 2022
@Gkleinereva
Copy link
Author

It looks like it has more to do with block scoping than it does with the function. This snippet also produces very incorrect results:

ArrayList messages = new ArrayList();

int j = 0;
while(j < 5) {
    double score = 
        decayNumericGauss(j, 1, 0, 0.5, j);
    messages.add(score);
    j++;
}

messages.add('\\n');
messages.add(decayNumericGauss(0, 1, 0, 0.5, 0));
messages.add(decayNumericGauss(1, 1, 0, 0.5, 1));
messages.add(decayNumericGauss(2, 1, 0, 0.5, 2));
messages.add(decayNumericGauss(3, 1, 0, 0.5, 3));
messages.add(decayNumericGauss(4, 1, 0, 0.5, 4));

String output = '';
int i = 0;
while(i < messages.length) {
  output += messages[i] + '\\n';
  i++;
}
return output;

@Gkleinereva
Copy link
Author

Looks like the problem might be in this file: https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/script/ScoreScriptUtils.java. I'm not too familiar with Java or the Elasticsearch codebase, but I would guess that the issue is related to caching the "origin" property on the class (and this is consistent with the behavior I'm observing).

@Gkleinereva
Copy link
Author

For now, as a workaround, I borrowed the implementation from that file, and hard-coded it into my script_score query.

@mayya-sharipova mayya-sharipova added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache and removed needs:triage Requires assignment of a team area label labels Aug 24, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label Aug 24, 2022
@thecoop
Copy link
Member

thecoop commented Nov 9, 2022

This is indeed an odd behaviour - the caching is there to implement optimisations in calling the same function parameters with different documents (see https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-score-query.html)

Is there a specific use case that you found this issue with?

@Gkleinereva
Copy link
Author

@thecoop Hey Simon - thanks for your response!

We are using Elasticsearch to search for containers of various sizes. There are scenarios where our users might want to search for "both 12 oz and 16 oz containers". In this scenario, we basically want to evaluate the products according to this pseudocode:

docScore = Math.max(decayNumericGauss(origin = 12), decayNumericGauss(origin = 16));

The current implementation doesn't accommodate this business requirement because the origin (decayNumericGauss's first parameter) is cached.

I don't know if this issue is worth fixing (which would probably hurt performance), or if it would be better to just document this behavior somewhere with a workaround (implementing the function manually wasn't that difficult once I found the root cause of the issue).

@thecoop
Copy link
Member

thecoop commented Nov 10, 2022

@Gkleinereva. That makes sense. It is currently known that these functions don't work in a loop - see #70437. It is not intended that these scoring functions are general-purpose, but only as part of a small score modification script, to take advantage of significant performance optimisations when running across many documents.

The workaround is (as you've done) to implement the function yourself as a more general-purpose algorithm.

If you've got suggestions/ideas for how these functions could be more general whilst still maintaining performance, do feel free to submit an issue for discussion or a PR. But currently these are working as intended.

@thecoop
Copy link
Member

thecoop commented Nov 10, 2022

I've created #91496 to make it clear in the docs these functions don't work in a loop. Thanks for the bug report!

@thecoop thecoop closed this as completed Nov 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache Team:Core/Infra Meta label for core/infra team
Projects
None yet
Development

No branches or pull requests

4 participants