Adding max_analyzed_offset option for highlighting #1719

Closed
Tomptez opened this issue Sep 13, 2021 · 1 comment

Tomptez commented Sep 13, 2021

Describe the feature:

I was using highlighting when I came across this error message:
elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'The length of [myfield] field of [37] doc of [my-index] index has exceeded [1000000] - maximum allowed to be analyzed for highlighting. This maximum can be set by changing the [index.highlight.max_analyzed_offset] index level setting. For large texts, indexing with offsets or term vectors is recommended!')

As suggested in the documentation, I tried to set max_analyzed_offset as an option so that highlighting simply stops at a certain point instead of returning an error.

However, doing so returned a different error message:
elasticsearch.exceptions.RequestError: RequestError(400, 'x_content_parse_exception', '[1:329] [highlight] unknown field [max_analyzed_offset]')

It seems like this merge has not been implemented in the Python client. A fix for this would be greatly appreciated.

A suggestion for a temporary workaround would also be appreciated.

@sethmlarson
Contributor

There might be some confusion about the "query parameter" wording used in the documentation you linked to. If I understand correctly, you expected max_analyzed_offset to be a parameter of the search API call itself, but it actually belongs inside the "highlight" object of the request body, like so:

client.search(
    ...,
    body={
        "query": {"match": {"field1": "fox"}},
        "highlight": {
            "type": "plain",
            "fields": {"field1": {}},
            "max_analyzed_offset": 20,
        },
    },
)

Does this make sense? I'm going to close this issue for now and will reopen it if this is not the solution to your issue.
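
For readers who want to check the placement without a running cluster, here is a minimal sketch that only builds the request body. The helper name build_highlight_search_body is hypothetical and not part of the Python client; the field name, query text, and offset value are taken from the example above. The point it illustrates is that max_analyzed_offset is a key inside "highlight", next to "fields", rather than a keyword argument of client.search.

```python
from typing import Any, Dict, Optional


def build_highlight_search_body(
    field: str,
    text: str,
    max_analyzed_offset: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a search body that highlights one field.

    Hypothetical helper for illustration; it is not part of the
    elasticsearch Python client.
    """
    highlight: Dict[str, Any] = {
        "type": "plain",
        "fields": {field: {}},
    }
    if max_analyzed_offset is not None:
        # The option lives directly under "highlight", alongside "fields".
        highlight["max_analyzed_offset"] = max_analyzed_offset
    return {
        "query": {"match": {field: text}},
        "highlight": highlight,
    }


body = build_highlight_search_body("field1", "fox", max_analyzed_offset=20)

# With an already configured client (not created here), the call would be:
# client.search(index="my-index", body=body)
```

If the server still answers with "unknown field [max_analyzed_offset]" for a body shaped like this, the rejection comes from the server's request parser rather than from the client, so it may be worth checking whether the Elasticsearch version in use supports the option at query level. That is an assumption to verify against the documentation for your version.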
