Using inner_hits on nested query causes an index_out_of_bounds_exception #25315
Comments
@BLZB0B Are you able to share the 62 documents that matched that query? (Sending them to me privately is fine too.) This would make it easier to reproduce the error. I haven't yet been able to figure out what the cause is.
Hi Martijn, I've sent a link to your gmail account. Regards Phil
@BLZB0B I've taken a look at the document that causes this error and at your mapping, and the reason this happens is because the logic that extracts the relevant nested part from the In your case the I think the nested source extraction logic can be fixed to flatten all the levels that don't use a nested field mapper, but it is going to make this logic more complicated and unfortunately it is already complicated. I currently lean towards throwing a descriptive error (including a hint with a workaround) in the case where nested fields are wrapped by regular object fields. Also, if we decide in the future to change how the source is stored (as is described in #9034), then the extraction of the nested source is no longer needed. There are two workarounds:
Thanks Martijn, I've updated our mapping as suggested and am now getting results back with "inner_hits": { } rather than the error mentioned above. A more useful error message would help, as you suggested. I have a question related to the above but not part of the bug, so I will switch to the forum rather than ask here, if that's OK. Regards Phil
This was discussed in the fix it friday meeting and there was agreement on not making the nested source extraction more complicated than it already is. So we should throw a descriptive error and document the two possible workarounds.
Thanks @martijnvg
…has a object field as parent field and that parent field is defined as an array field in the _source of the document inner hits are being computed for. Closes #25315
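The condition named in this commit message (a nested field whose parent object field appears as an array in the _source) can be checked on a document before it is indexed. What follows is a minimal Python sketch, not Elasticsearch code: the helper `array_parents` and the field names `product` and `prices` are hypothetical, the input is a plain dict standing in for a document's _source, and the sketch assumes every parent segment of the path is mapped as a regular object field, since it does not look at the mapping.

```python
def array_parents(source, nested_path):
    """Return the parent paths of nested_path that hold a list in source.

    source is a dict standing in for a document's _source (an assumption);
    nested_path is a dotted path such as "product.prices" (hypothetical names).
    """
    parts = nested_path.split(".")
    found = []
    current = [source]
    # Walk every parent segment, excluding the nested field itself.
    for depth, name in enumerate(parts[:-1]):
        next_level = []
        for node in current:
            if not isinstance(node, dict) or name not in node:
                continue
            value = node[name]
            if isinstance(value, list):
                # This parent is an array in the source document.
                found.append(".".join(parts[: depth + 1]))
                next_level.extend(value)
            else:
                next_level.append(value)
        current = next_level
    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(found))
```

An empty result means no parent of the nested field is an array in that document; a non-empty result lists the parent paths that would need attention under the assumption above.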
Raised here following this conversation with Martijn.
Config: ES 5.4, AWS Linux, single node (test server), 3 shards, 0 replicas.
OS version (uname -a if on a Unix-like system): Linux 4.9.27-14.31.amzn1.x86_64 #1 SMP Wed May 10 01:58:40 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Plugins installed: [none]
JVM version (java -version): build 1.8.0_131-b11
Description of the problem including expected versus actual behavior:
We have a set of documents which are replicated to Elastic from Couchbase.
Each document contains a "holiday product", and this product has a nested array of "prices". Each price object contains fields such as number of passengers, date, promo code, price etc. For each holiday product, this list of prices can be very long, as there can be hundreds or thousands of permutations.
When running a query against the data, we don't want to carry the thousands of lines of data over the wire (can be >5MB), so we are trying to use inner_hits to return only the rows that match (approx 10KB).
we have a query such as:
this generates the error:
On this test server we are running a single node with 3 shards, so we are not sure why one shard reports as failed.
If we change "inner_hits": {} to "inner_hits": {"_source": false}, we get results but obviously don't get any useful information in the output!
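To make the two inner_hits variants concrete, here is a minimal Python sketch that builds a nested query body with either setting. It is not the query from this report: the helper `nested_query`, the path `prices`, and the field `prices.promo_code` are hypothetical names chosen for illustration, and the body is only printed, not sent to a cluster.

```python
import json


def nested_query(path, inner_query, include_source=True):
    """Build a search body for a nested query with inner_hits.

    include_source=False produces "inner_hits": {"_source": false},
    the variant that returned results without the matching rows.
    """
    inner_hits = {} if include_source else {"_source": False}
    return {
        "query": {
            "nested": {
                "path": path,              # hypothetical nested field
                "query": inner_query,      # any query on fields under path
                "inner_hits": inner_hits,  # ask for the matching nested rows
            }
        }
    }


# Hypothetical usage: match nested price rows by promo code.
body = nested_query(
    "prices",
    {"term": {"prices.promo_code": "SUMMER"}},
    include_source=False,
)
print(json.dumps(body, indent=2))
```

With include_source left at its default, the body carries "inner_hits": {}, which is the form that triggered the exception described here.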
trace log: