Support for artificial documents #7530
This adds the ability to the Term Vector API to generate term vectors for artificial documents, that is, for documents not present in the index. Following a syntax similar to the Percolator API, a new 'doc' parameter is used instead of '_id' to specify the document of interest. The parameters '_index' and '_type' determine the mapping, and therefore the analyzers, to apply to each field.
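To make the request shape concrete, here is a minimal sketch of how a client might assemble such a request. The class and helper names are hypothetical, the `twitter`/`tweet` index and type are illustrative only, and the `_termvector` endpoint path is an assumption based on the API's name, not something this description states.

```java
public class ArtificialDocRequest {

    // Hypothetical helper: the index and type select the mapping (and so the
    // analyzers); no document id appears in the path because the document is
    // not in the index. The "_termvector" suffix is an assumption.
    public static String endpoint(String index, String type) {
        return "/" + index + "/" + type + "/_termvector";
    }

    // Hypothetical helper: wrap an already-serialized JSON document in the
    // new 'doc' parameter, which takes the place of '_id'.
    public static String body(String docJson) {
        return "{\"doc\":" + docJson + "}";
    }

    public static void main(String[] args) {
        // Illustrative values only.
        String doc = "{\"text\":\"twitter test test test\"}";
        System.out.println("GET " + endpoint("twitter", "tweet"));
        System.out.println(body(doc));
    }
}
```

The document is passed inline, so nothing needs to be indexed before its term vectors are requested.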
numbers have no meaning in this context. By default, when requesting
term vectors of artificial documents, a shard to get the statistics from
is randomly selected. Use `routing` only to hit a particular shard.
Doesn't the term vectors API currently return statistics that are aggregated across all shards? Documentation suggests so?
Nope it does not.
"The term and field statistics are not accurate. Deleted documents are not taken into account. The information is only retrieved for the shard the requested document resides in."
ok
private String routing;

protected String preference;

private static AtomicInteger randomInt = new AtomicInteger(0);
Can you make it final?
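For illustration, a sketch of what the `final` suggestion could look like in context. The diff does not show how `randomInt` is used; this assumes it supplies a throwaway id whose hash picks the shard when no `routing` is given. The class, the helper names, and the use of `String.hashCode` are all hypothetical and are not Elasticsearch's actual routing logic.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ShardPicker {

    // final: the reference is never reassigned, only the counter value changes.
    private static final AtomicInteger randomInt = new AtomicInteger(0);

    // Hypothetical helper: map a key onto one of numShards shards.
    // floorMod keeps the result non-negative even for negative hash codes.
    public static int shardFor(String key, int numShards) {
        return Math.floorMod(key.hashCode(), numShards);
    }

    // Hypothetical helper: an explicit routing value pins the shard; otherwise
    // a fresh id from the counter spreads requests across shards (assumption).
    public static int pickShard(String routing, int numShards) {
        String key = (routing != null)
                ? routing
                : String.valueOf(randomInt.getAndIncrement());
        return shardFor(key, numShards);
    }

    public static void main(String[] args) {
        System.out.println(pickShard("user-42", 5)); // same shard every call
        System.out.println(pickShard(null, 5));      // may vary between calls
    }
}
```

Because the statistics come from a single shard, passing the same `routing` value is the only way in this sketch to get repeatable statistics for an artificial document.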
@alexksikes I left some comments
@jpountz Thanks for the comments. We should decide whether to allow dynamic mappings, and if not, what the easiest way to implement that would be. I'd be in favor of disabling dynamic mapping and only returning the TVs for the fields found in the original mapping, because there is just too much room for mistakes and unintended behaviors. Maybe @clintongormley has some ideas?
that is for documents not present in the index. The syntax is similar to the
<<search-percolate,percolator>> API. For example, the following request would
return the same results as in example 1. The mapping used is determined by the
`index` and `type`.
Can you leave a note about the fact that it can introduce new mappings?
I left some comments but I think it is close
LGTM
This adds the ability to the Term Vector API to generate term vectors for artificial documents, that is, for documents not present in the index. Following a syntax similar to the Percolator API, a new 'doc' parameter is used instead of '_id' to specify the document of interest. The parameters '_index' and '_type' determine the mapping, and therefore the analyzers, to apply to each field. Closes #7530
…m Vectors: Support for artificial documents #7530'
…m Vectors: Support for artificial documents elastic#7530'