more_like_this query to throw an error if the like fields is not provided #40632

Merged
merged 4 commits into from
Apr 18, 2019

@jimczi jimczi commented Mar 29, 2019

With the removal of the `_all` field, the `mlt` query cannot infer a field name
to use to analyze the provided (un)like text if the `fields` parameter is not
explicitly set in the query and `index.query.default_field` is not changed
in the index settings (by default it is set to `*`). As a result, the like text
is ignored and queries are built only from the provided document ids.
This change fixes the bug by throwing an error if the `fields` option is not set
and `index.query.default_field` is equal to `*`. The error is thrown only
if like or unlike texts are provided in the query.
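
To illustrate the rule described above, here is a minimal standalone sketch of the check. It is modeled on the condition quoted in the review below, but the class name `MltFieldCheck`, the method `validate`, and its signature are hypothetical and not part of the Elasticsearch code base. The error message is shortened to the part visible in this conversation.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical, standalone illustration; not the actual Elasticsearch implementation.
class MltFieldCheck {

    /**
     * Rejects free (un)like text when the only candidate field is the "*" wildcard,
     * because there is then no concrete field whose analyzer could be applied.
     */
    static void validate(List<String> moreLikeFields, String[] likeTexts, String[] unlikeTexts) {
        if (moreLikeFields.size() == 1
                && moreLikeFields.get(0).equals("*")
                && (likeTexts.length > 0 || unlikeTexts.length > 0)) {
            // Shortened message: the full text is not shown in this conversation.
            throw new IllegalArgumentException(
                "[more_like_this] query cannot infer the field to analyze the free text");
        }
    }

    public static void main(String[] args) {
        // Document-id-only queries are unaffected: no free text, no error.
        validate(Collections.singletonList("*"), new String[0], new String[0]);

        // An explicit field makes free text acceptable.
        validate(Collections.singletonList("title"), new String[] {"some text"}, new String[0]);

        // Free text combined with the "*" default is rejected.
        try {
            validate(Collections.singletonList("*"), new String[] {"some text"}, new String[0]);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch a query that only references document ids still passes, which matches the statement that the error is thrown only when like or unlike texts are provided.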

@jimczi jimczi added >bug :Search/Search Search-related issues that do not fall into other categories v8.0.0 v7.2.0 labels Mar 29, 2019
@elasticmachine

Pinging @elastic/es-search

@cbuescher cbuescher left a comment

The change and the implementation make sense to me. I left one question, and some additional tests seem to need adapting as well.

    if (moreLikeFields.size() == 1
            && moreLikeFields.get(0).equals("*")
            && (likeTexts.length > 0 || unlikeTexts.length > 0)) {
        throw new IllegalArgumentException("[more_like_this] query cannot infer the field to analyze the free text, " +

Member

The error seems to be breaking MoreLikeThisQueryBuilderTests#testToQuery and also occasionally #testMustRewrite. Probably something in the test setup needs to be changed.

Contributor Author

Thanks, good catch. I changed the test so that it no longer creates invalid builders now that we have this restriction.

    } finally {
        // Reset the default value
        context.getIndexSettings().updateIndexMetaData(
            newIndexMeta("index",

Member

Just out of curiosity: why does the default need to be changed back? I was under the impression that the context doesn't get reused between different tests. If that is not the case, maybe the context should contain a copy of the index settings so that changes to it don't affect other tests. This might be something to address in another issue, though, if addressing it here is too involved.

Contributor Author

We update the settings of the created index, which is shared among all tests in this class, so we need to reset the value at the end. The query shard context is created for each test method, but the index is static.
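
The reset-in-`finally` pattern described here can be sketched in isolation as follows. This is only an illustration under assumptions: `SharedIndexSettingsExample`, `SHARED_SETTINGS`, and `runWithDefaultField` are hypothetical names, and a plain map stands in for the static index shared by the test class. It does not use the real test framework.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a test class whose index settings are static and shared.
class SharedIndexSettingsExample {

    static final String DEFAULT_FIELD_KEY = "index.query.default_field";

    // Shared by every "test"; a change made by one caller is visible to all others.
    static final Map<String, String> SHARED_SETTINGS = new HashMap<>();

    static {
        SHARED_SETTINGS.put(DEFAULT_FIELD_KEY, "*");
    }

    /** Runs the body with a temporary default field, then restores the previous value. */
    static void runWithDefaultField(String value, Runnable testBody) {
        String previous = SHARED_SETTINGS.get(DEFAULT_FIELD_KEY);
        SHARED_SETTINGS.put(DEFAULT_FIELD_KEY, value);
        try {
            testBody.run();
        } finally {
            // Restore even if the body throws, so later tests see the original value.
            SHARED_SETTINGS.put(DEFAULT_FIELD_KEY, previous);
        }
    }

    public static void main(String[] args) {
        runWithDefaultField("title", () ->
            System.out.println("during: " + SHARED_SETTINGS.get(DEFAULT_FIELD_KEY)));
        System.out.println("after: " + SHARED_SETTINGS.get(DEFAULT_FIELD_KEY));
    }
}
```

Without the `finally` block, a failing assertion inside the body would leave the modified value in place for every test that runs afterwards.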

Member

Interesting. I think it would be good to change the test setup so that we don't have to worry about this when writing tests. Probably worth a separate issue, though.

Contributor Author

I think it's OK to share the same index across all tests. We could restore the default value of the index settings after each test, but it's very uncommon to change a setting in a query builder test, so I don't think it's worth the change.

jimczi commented Apr 2, 2019

@elasticmachine run elasticsearch-ci/1

jimczi commented Apr 2, 2019

@elasticmachine run elasticsearch-ci/2

jimczi commented Apr 16, 2019

@cbuescher I pushed some changes to address your feedback. Can you take another look?

@cbuescher cbuescher left a comment

@jimczi thanks for the update, LGTM

@jimczi jimczi merged commit 17a6ac1 into elastic:master Apr 18, 2019
@jimczi jimczi deleted the more_like_this_default_field branch April 18, 2019 20:29
jimczi added a commit that referenced this pull request Apr 18, 2019
…ided (#40632)

gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
…ided (elastic#40632)
