Add support for extracting ranges from percolator queries #19191

Closed · wants to merge 1 commit
5 changes: 4 additions & 1 deletion docs/reference/mapping/types/percolator.asciidoc
@@ -71,11 +71,14 @@ a percolator query does not exist, it will be handled as a default string field
fail.

[float]
==== Important Notes
==== Limitations

Because the `percolate` query processes one document at a time, it doesn't support queries and filters that run
against child documents such as `has_child` and `has_parent`.

The percolator doesn't accept percolator queries containing `range` queries with ranges that are based on the current
time (using `now`).
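
For example, a hypothetical sketch using the Java client (the index name `alerts` and the field `created_at` are made up) of a query that would now be rejected at index time:

[source,java]
----
import org.elasticsearch.client.Client;

public final class NowRangeExample {
    // Assuming the "query" field of the "alerts" index is mapped as `percolator`,
    // indexing this document is expected to fail, because the range is relative
    // to `now` and would produce different results every time it is evaluated.
    public static void indexNowBasedQuery(Client client) {
        String percolatorDoc =
            "{ \"query\": { \"range\": { \"created_at\": { \"gte\": \"now-1h\", \"lte\": \"now\" } } } }";
        client.prepareIndex("alerts", "doc", "1")
              .setSource(percolatorDoc)
              .get(); // expected to throw at mapping/parse time
    }
}
----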

There are a number of queries that fetch data via a get call during query parsing, for example the `terms` query when
using terms lookup, the `template` query when using indexed scripts and the `geo_shape` query when using pre-indexed
shapes. When these queries are indexed into a `percolator` field type, the get call is executed once. So each time the `percolator`
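A hypothetical illustration of that get-call behavior (index and field names are made up): the lookup document is fetched once, when the percolator query is indexed, not on every percolation.

[source,java]
----
import org.elasticsearch.client.Client;

public final class TermsLookupExample {
    // The `terms` lookup against "lists/list/blocked" is resolved by a get call
    // exactly once, when this percolator query is indexed; later updates to the
    // "blocked" document are not reflected in the stored query.
    public static void indexTermsLookupQuery(Client client) {
        String percolatorDoc =
            "{ \"query\": { \"terms\": { \"user\": {"
            + " \"index\": \"lists\", \"type\": \"list\", \"id\": \"blocked\","
            + " \"path\": \"users\" } } } }";
        client.prepareIndex("alerts", "doc", "terms-lookup-1")
              .setSource(percolatorDoc)
              .get();
    }
}
----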
5 changes: 5 additions & 0 deletions docs/reference/migration/migrate_5_0/percolator.asciidoc
@@ -48,6 +48,11 @@ the existing document.

The percolate stats have been removed. This is because the percolator no longer caches the percolator queries.

==== Percolator queries containing range queries with now ranges

The percolator no longer accepts percolator queries containing `range` queries with ranges that are based on the
current time (using `now`).

==== Java client

The percolator is no longer part of the core elasticsearch dependency. It has moved to the percolator module.
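A sketch of what client setup might look like as a result (the `PreBuiltTransportClient` and `PercolatorPlugin` names are assumptions based on the 5.x module layout, not taken from this PR):

[source,java]
----
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.percolator.PercolatorPlugin;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public final class PercolatorClientSetup {
    // With the percolator in its own module, a transport client has to depend on
    // the percolator artifact and register its plugin so that the `percolate`
    // query is available on the client side.
    public static TransportClient build() {
        return new PreBuiltTransportClient(Settings.EMPTY, PercolatorPlugin.class);
    }
}
----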
MultiPercolateAction.java
@@ -21,6 +21,7 @@
import org.elasticsearch.action.Action;
import org.elasticsearch.client.ElasticsearchClient;

@Deprecated
public class MultiPercolateAction extends Action<MultiPercolateRequest, MultiPercolateResponse, MultiPercolateRequestBuilder> {

public static final MultiPercolateAction INSTANCE = new MultiPercolateAction();
PercolateAction.java
@@ -22,6 +22,7 @@
import org.elasticsearch.action.Action;
import org.elasticsearch.client.ElasticsearchClient;

@Deprecated
public class PercolateAction extends Action<PercolateRequest, PercolateResponse, PercolateRequestBuilder> {

public static final PercolateAction INSTANCE = new PercolateAction();
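With these actions deprecated, percolation runs through the regular search API instead; a minimal sketch, assuming the `percolate` query builder from the percolator module (index, field, and type names are made up):

[source,java]
----
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.bytes.BytesArray;
import org.elasticsearch.percolator.PercolateQueryBuilder;

public final class PercolateSearchExample {
    // Instead of PercolateAction/MultiPercolateAction, wrap the document to be
    // percolated in a `percolate` query and run it as an ordinary search.
    public static SearchResponse percolate(Client client) {
        BytesArray document = new BytesArray("{ \"message\": \"hello world\" }");
        return client.prepareSearch("alerts")
                     .setQuery(new PercolateQueryBuilder("query", "doc", document))
                     .get();
    }
}
----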
PercolateQuery.java
@@ -22,138 +22,60 @@
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Scorer;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.TwoPhaseIterator;
import org.apache.lucene.search.Weight;
import org.apache.lucene.util.Accountable;
import org.apache.lucene.util.Bits;
import org.elasticsearch.common.bytes.BytesReference;
import org.elasticsearch.common.lucene.Lucene;
import org.elasticsearch.common.lucene.search.MatchNoDocsQuery;

import java.io.IOException;
import java.util.Objects;
import java.util.Set;

import static org.apache.lucene.search.BooleanClause.Occur.FILTER;

public final class PercolateQuery extends Query implements Accountable {
final class PercolateQuery extends Query implements Accountable {

// cost of matching the query against the document, arbitrary as it would be really complex to estimate
public static final float MATCH_COST = 1000;

public static class Builder {

private final String docType;
private final QueryStore queryStore;
private final BytesReference documentSource;
private final IndexSearcher percolatorIndexSearcher;

private Query queriesMetaDataQuery;
private Query verifiedQueriesQuery = new MatchNoDocsQuery("");
private Query percolateTypeQuery;

/**
* @param docType The type of the document being percolated
* @param queryStore The lookup holding all the percolator queries as Lucene queries.
* @param documentSource The source of the document being percolated
* @param percolatorIndexSearcher The index searcher on top of the in-memory index that holds the document being percolated
*/
public Builder(String docType, QueryStore queryStore, BytesReference documentSource, IndexSearcher percolatorIndexSearcher) {
this.docType = Objects.requireNonNull(docType);
this.queryStore = Objects.requireNonNull(queryStore);
this.documentSource = Objects.requireNonNull(documentSource);
this.percolatorIndexSearcher = Objects.requireNonNull(percolatorIndexSearcher);
}

/**
* Optionally sets a query that reduces the number of queries to percolate based on extracted terms from
* the document to be percolated.
* @param extractedTermsFieldName The name of the field to get the extracted terms from
* @param extractionResultField The field to indicate for a document whether query term extraction was complete,
* partial or failed. If query extraction was complete, the MemoryIndex doesn't
*/
public void extractQueryTermsQuery(String extractedTermsFieldName, String extractionResultField) throws IOException {
// We can only skip the MemoryIndex verification when percolating a single document.
// When the document being percolated contains a nested object field then the MemoryIndex contains multiple
// documents. In this case the term query that indicates whether memory index verification can be skipped
// can incorrectly indicate that non-nested queries would match, while their nested variants would not.
if (percolatorIndexSearcher.getIndexReader().maxDoc() == 1) {
this.verifiedQueriesQuery = new TermQuery(new Term(extractionResultField, ExtractQueryTermsService.EXTRACTION_COMPLETE));
}
this.queriesMetaDataQuery = ExtractQueryTermsService.createQueryTermsQuery(
percolatorIndexSearcher.getIndexReader(), extractedTermsFieldName,
// include extractionResultField:failed, because docs with this term have no extractedTermsField
// and otherwise we would fail to return these docs. Docs that failed query term extraction
// always need to be verified by MemoryIndex:
new Term(extractionResultField, ExtractQueryTermsService.EXTRACTION_FAILED)
);
}

/**
* @param percolateTypeQuery A query that identifies all documents containing percolator queries
*/
public void setPercolateTypeQuery(Query percolateTypeQuery) {
this.percolateTypeQuery = Objects.requireNonNull(percolateTypeQuery);
}

public PercolateQuery build() {
if (percolateTypeQuery != null && queriesMetaDataQuery != null) {
throw new IllegalStateException("Either filter by deprecated percolator type or by query metadata");
}
// The query that selects which percolator queries will be evaluated by MemoryIndex:
BooleanQuery.Builder queriesQuery = new BooleanQuery.Builder();
if (percolateTypeQuery != null) {
queriesQuery.add(percolateTypeQuery, FILTER);
}
if (queriesMetaDataQuery != null) {
queriesQuery.add(queriesMetaDataQuery, FILTER);
}
return new PercolateQuery(docType, queryStore, documentSource, queriesQuery.build(), percolatorIndexSearcher,
verifiedQueriesQuery);
}

}

private final String documentType;
private final QueryStore queryStore;
private final BytesReference documentSource;
private final Query percolatorQueriesQuery;
private final Query verifiedQueriesQuery;
private final Query candidateMatchesQuery;
private final Query verifiedMatchesQuery;
private final IndexSearcher percolatorIndexSearcher;

private PercolateQuery(String documentType, QueryStore queryStore, BytesReference documentSource,
Query percolatorQueriesQuery, IndexSearcher percolatorIndexSearcher, Query verifiedQueriesQuery) {
this.documentType = documentType;
this.documentSource = documentSource;
this.percolatorQueriesQuery = percolatorQueriesQuery;
this.queryStore = queryStore;
this.percolatorIndexSearcher = percolatorIndexSearcher;
this.verifiedQueriesQuery = verifiedQueriesQuery;
PercolateQuery(String documentType, QueryStore queryStore, BytesReference documentSource,
Query candidateMatchesQuery, IndexSearcher percolatorIndexSearcher, Query verifiedMatchesQuery) {
this.documentType = Objects.requireNonNull(documentType);
this.documentSource = Objects.requireNonNull(documentSource);
this.candidateMatchesQuery = Objects.requireNonNull(candidateMatchesQuery);
this.queryStore = Objects.requireNonNull(queryStore);
this.percolatorIndexSearcher = Objects.requireNonNull(percolatorIndexSearcher);
this.verifiedMatchesQuery = Objects.requireNonNull(verifiedMatchesQuery);
}

@Override
public Query rewrite(IndexReader reader) throws IOException {
Query rewritten = percolatorQueriesQuery.rewrite(reader);
if (rewritten != percolatorQueriesQuery) {
Query rewritten = candidateMatchesQuery.rewrite(reader);
if (rewritten != candidateMatchesQuery) {
return new PercolateQuery(documentType, queryStore, documentSource, rewritten, percolatorIndexSearcher,
verifiedQueriesQuery);
verifiedMatchesQuery);
} else {
return this;
}
}

@Override
public Weight createWeight(IndexSearcher searcher, boolean needsScores) throws IOException {
final Weight verifiedQueriesQueryWeight = verifiedQueriesQuery.createWeight(searcher, false);
final Weight innerWeight = percolatorQueriesQuery.createWeight(searcher, needsScores);
final Weight verifiedMatchesWeight = verifiedMatchesQuery.createWeight(searcher, false);
final Weight candidateMatchesWeight = candidateMatchesQuery.createWeight(searcher, false);
return new Weight(this) {
@Override
public void extractTerms(Set<Term> set) {
@@ -183,17 +105,17 @@ public Explanation explain(LeafReaderContext leafReaderContext, int docId) throw

@Override
public float getValueForNormalization() throws IOException {
return innerWeight.getValueForNormalization();
return candidateMatchesWeight.getValueForNormalization();
}

@Override
public void normalize(float v, float v1) {
innerWeight.normalize(v, v1);
candidateMatchesWeight.normalize(v, v1);
}

@Override
public Scorer scorer(LeafReaderContext leafReaderContext) throws IOException {
final Scorer approximation = innerWeight.scorer(leafReaderContext);
final Scorer approximation = candidateMatchesWeight.scorer(leafReaderContext);
if (approximation == null) {
return null;
}
@@ -226,7 +148,7 @@ public float score() throws IOException {
}
};
} else {
Scorer verifiedDocsScorer = verifiedQueriesQueryWeight.scorer(leafReaderContext);
Scorer verifiedDocsScorer = verifiedMatchesWeight.scorer(leafReaderContext);
Bits verifiedDocsBits = Lucene.asSequentialAccessBits(leafReaderContext.reader().maxDoc(), verifiedDocsScorer);
return new BaseScorer(this, approximation, queries, percolatorIndexSearcher) {

@@ -293,7 +215,7 @@ public int hashCode() {
@Override
public String toString(String s) {
return "PercolateQuery{document_type={" + documentType + "},document_source={" + documentSource.utf8ToString() +
"},inner={" + percolatorQueriesQuery.toString(s) + "}}";
"},inner={" + candidateMatchesQuery.toString(s) + "}}";
}

@Override
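The scorer above couples a cheap approximation over candidate matches with per-document verification. A condensed, hypothetical sketch of that two-phase pattern (not the literal class, which also handles scoring, explanations, and the needsScores distinction):

[source,java]
----
import java.io.IOException;

import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TwoPhaseIterator;
import org.apache.lucene.util.Bits;

// Sketch: the approximation iterates percolator queries whose extracted terms
// matched the document (candidates); matches() either trusts the pre-verified
// bitset or replays the stored query against the in-memory index that holds
// the document being percolated.
final class CandidateThenVerifyIterator extends TwoPhaseIterator {

    /** Hypothetical hook standing in for PercolateQuery.QueryStore plus a MemoryIndex search. */
    interface StoredQueryRunner {
        Query getQuery(int docId) throws IOException;             // decode the stored percolator query
        boolean matchesDocument(Query query) throws IOException;  // evaluate it against the percolated doc
    }

    private final Bits verifiedDocs;
    private final StoredQueryRunner runner;

    CandidateThenVerifyIterator(DocIdSetIterator candidates, Bits verifiedDocs, StoredQueryRunner runner) {
        super(candidates);
        this.verifiedDocs = verifiedDocs;
        this.runner = runner;
    }

    @Override
    public boolean matches() throws IOException {
        int docId = approximation.docID();
        if (verifiedDocs.get(docId)) {
            return true; // term extraction already proved the match; skip the MemoryIndex
        }
        Query stored = runner.getQuery(docId);
        return stored != null && runner.matchesDocument(stored);
    }

    @Override
    public float matchCost() {
        return 1000; // arbitrary, mirrors PercolateQuery.MATCH_COST
    }
}
----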
PercolateQueryBuilder.java
@@ -50,14 +50,14 @@
import org.elasticsearch.common.bytes.BytesReference;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.common.lucene.search.MatchNoDocsQuery;
import org.elasticsearch.common.lucene.search.Queries;
import org.elasticsearch.common.xcontent.XContent;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.common.xcontent.XContentHelper;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.IndexSettings;
import org.elasticsearch.index.analysis.FieldNameAnalyzer;
import org.elasticsearch.index.mapper.DocumentMapper;
import org.elasticsearch.index.mapper.DocumentMapperForType;
@@ -406,37 +406,27 @@ protected Analyzer getWrappedAnalyzer(String fieldName) {
docSearcher.setQueryCache(null);
}

IndexSettings indexSettings = context.getIndexSettings();
boolean mapUnmappedFieldsAsString = indexSettings.getValue(PercolatorFieldMapper.INDEX_MAP_UNMAPPED_FIELDS_AS_STRING_SETTING);
return buildQuery(indexSettings.getIndexVersionCreated(), context, docSearcher, mapUnmappedFieldsAsString);
}

Query buildQuery(Version indexVersionCreated, QueryShardContext context, IndexSearcher docSearcher,
boolean mapUnmappedFieldsAsString) throws IOException {
Version indexVersionCreated = context.getIndexSettings().getIndexVersionCreated();
boolean mapUnmappedFieldsAsString = context.getIndexSettings()
.getValue(PercolatorFieldMapper.INDEX_MAP_UNMAPPED_FIELDS_AS_STRING_SETTING);
if (indexVersionCreated.onOrAfter(Version.V_5_0_0_alpha1)) {
MappedFieldType fieldType = context.fieldMapper(field);
if (fieldType == null) {
throw new QueryShardException(context, "field [" + field + "] does not exist");
}

if (!(fieldType instanceof PercolatorFieldMapper.PercolatorFieldType)) {
if (!(fieldType instanceof PercolatorFieldMapper.FieldType)) {
throw new QueryShardException(context, "expected field [" + field +
"] to be of type [percolator], but is of type [" + fieldType.typeName() + "]");
}
PercolatorFieldMapper.PercolatorFieldType pft = (PercolatorFieldMapper.PercolatorFieldType) fieldType;
PercolatorFieldMapper.FieldType pft = (PercolatorFieldMapper.FieldType) fieldType;
PercolateQuery.QueryStore queryStore = createStore(pft, context, mapUnmappedFieldsAsString);
PercolateQuery.Builder builder = new PercolateQuery.Builder(
documentType, queryStore, document, docSearcher
);
builder.extractQueryTermsQuery(pft.getExtractedTermsField(), pft.getExtractionResultFieldName());
return builder.build();
return pft.percolateQuery(documentType, queryStore, document, docSearcher, docMapper);
} else {
Query percolateTypeQuery = new TermQuery(new Term(TypeFieldMapper.NAME, MapperService.PERCOLATOR_LEGACY_TYPE_NAME));
PercolateQuery.Builder builder = new PercolateQuery.Builder(
documentType, createLegacyStore(context, mapUnmappedFieldsAsString), document, docSearcher
);
builder.setPercolateTypeQuery(percolateTypeQuery);
return builder.build();
PercolateQuery.QueryStore queryStore = createLegacyStore(context, mapUnmappedFieldsAsString);
return new PercolateQuery(documentType, queryStore, document, percolateTypeQuery, docSearcher,
new MatchNoDocsQuery("pre 5.0.0-alpha1 index, no verified matches"));
}
}

@@ -477,17 +467,17 @@ public Weight createNormalizedWeight(Query query, boolean needsScores) throws IO
}
}

private static PercolateQuery.QueryStore createStore(PercolatorFieldMapper.PercolatorFieldType fieldType,
private static PercolateQuery.QueryStore createStore(PercolatorFieldMapper.FieldType fieldType,
QueryShardContext context,
boolean mapUnmappedFieldsAsString) {
return ctx -> {
LeafReader leafReader = ctx.reader();
BinaryDocValues binaryDocValues = leafReader.getBinaryDocValues(fieldType.getQueryBuilderFieldName());
BinaryDocValues binaryDocValues = leafReader.getBinaryDocValues(fieldType.queryBuilderField.name());
if (binaryDocValues == null) {
return docId -> null;
}

Bits bits = leafReader.getDocsWithField(fieldType.getQueryBuilderFieldName());
Bits bits = leafReader.getDocsWithField(fieldType.queryBuilderField.name());
return docId -> {
if (bits.get(docId)) {
BytesRef qbSource = binaryDocValues.get(docId);