-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Painless functions return wrong output #89602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
It looks like it has more to do with block scoping than it does with the function. This snippet also produces very incorrect results: ArrayList messages = new ArrayList();
int j = 0;
while(j < 5) {
double score =
decayNumericGauss(j, 1, 0, 0.5, j);
messages.add(score);
j++;
}
messages.add('\\n');
messages.add(decayNumericGauss(0, 1, 0, 0.5, 0));
messages.add(decayNumericGauss(1, 1, 0, 0.5, 1));
messages.add(decayNumericGauss(2, 1, 0, 0.5, 2));
messages.add(decayNumericGauss(3, 1, 0, 0.5, 3));
messages.add(decayNumericGauss(4, 1, 0, 0.5, 4));
String output = '';
int i = 0;
while(i < messages.length) {
output += messages[i] + '\\n';
i++;
}
return output; |
Looks like the problem might be in this file: https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/script/ScoreScriptUtils.java. I'm not too familiar with Java or the Elasticsearch codebase, but I would guess that the issue is related to caching the "origin" property on the class (and this is consistent with the behavior I'm observing). |
For now, as a workaround, I borrowed the implementation from that file, and hard-coded it into my script_score query. |
Pinging @elastic/es-core-infra (Team:Core/Infra) |
This is indeed an odd behaviour - the caching is there to implement optimisations in calling the same function parameters with different documents (see https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-score-query.html) Is there a specific use case that you found this issue with? |
@thecoop Hey Simon - thanks for your response! We are using Elasticsearch to search for containers of various sizes. There are scenarios where our users might want to search for "both 12 oz and 16 oz containers". In this scenario, we basically want to evaluate the products according to this pseudocode:
The current implementation doesn't accommodate this business requirement because the origin (decayNumericGauss's first parameter) is cached. I don't know if this issue is worth fixing (which would probably hurt performance), or if it would be better to just document this behavior somewhere with a workaround (implementing the function manually wasn't that difficult once I found the root cause of the issue). |
@Gkleinereva. That makes sense. It is currently known that these functions don't work in a loop - see #70437. It is not intended that these scoring functions are general-purpose, but only as part of a small score modification script, to take advantage of significant performance optimisations when running across many documents. The workaround is (as you've done) to implement the function yourself as a more general-purpose algorithm. If you've got suggestions/ideas for how these functions could be more general whilst still maintaining performance, do feel free to submit an issue for discussion or a PR. But currently these are working as intended. |
I've created #91496 to make it clear in the docs these functions don't work in a loop. Thanks for the bug report! |
Elasticsearch Version
8.3.1
Installed Plugins
None
Java Version
18.0.1.1
OS Version
Linux ip-172-31-1-195 5.13.0-1017-aws #19~20.04.1-Ubuntu SMP Mon Mar 7 12:53:12 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Problem Description
A simple wrapper for decayNumericGauss() causes the function to stop returning correct values when called.
I originally discovered this issue while writing a script_score query and submitting it via the API or in Kibana dev tools (the issue appears in both places). The reproduction steps below will use the "Painless Lab" editor for simplicity.
Steps to Reproduce
Open the "Painless Lab" editor
Copy in this code snippet
View results
Notice that the output is different from the direct calls to decayNumericGauss() than it is from calls to the wrapper. It looks like the value is being "cached" somehow. I know that the "Painless Lab" is still in beta, but this seemed to warrant opening a github issue since 1) Script_score queries are also affected and 2) Painless claims to support functions.
Logs (if relevant)
No response
The text was updated successfully, but these errors were encountered: