ES|QL Reranker command #123074

Merged
merged 102 commits into from
Apr 4, 2025
Changes from 66 commits
612dbcc
Reranker grammar.
afoucret Feb 20, 2025
b2c0e11
Adding logical plan for the rerank command.
afoucret Feb 20, 2025
6818567
Rerank command logical plan parsing
afoucret Feb 20, 2025
fae22df
Rerank command analysis and verification.
afoucret Feb 21, 2025
4302a69
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Feb 21, 2025
7787efe
[CI] Auto commit changes from spotless
elasticsearchmachine Feb 21, 2025
9069561
Adding physical plan and operator for Rerank
afoucret Feb 21, 2025
89bdda4
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Feb 21, 2025
5c19b29
Inference execution.
afoucret Feb 21, 2025
b7d75a7
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Feb 21, 2025
025aaf9
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 3, 2025
68426b9
Moving RerankOperator
afoucret Mar 3, 2025
6b1c04c
Make the rerank command SortAgnostic
afoucret Mar 4, 2025
85325d1
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 7, 2025
763ccef
Remove useless exchange for reranker.
afoucret Mar 7, 2025
395db68
Remove trailing debug instructions.
afoucret Mar 7, 2025
e5532f9
Use a LinkedHashMap to preserver order of fields passed to the operator.
afoucret Mar 7, 2025
d311d1b
Refactoring
afoucret Mar 10, 2025
ec8c6f2
Basic CSV tests for the RERANK command.
afoucret Mar 11, 2025
d4ab3fd
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 11, 2025
1e2b125
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 11, 2025
cd9c076
[CI] Auto commit changes from spotless
elasticsearchmachine Mar 11, 2025
816993f
Fix missing import after merge.
afoucret Mar 11, 2025
5d45936
Checkstyle fix.
afoucret Mar 11, 2025
694902e
Add missing writeable for test reranger task settings to the TestInfe…
afoucret Mar 13, 2025
8e000a1
Fix a bug in the TestRerankingServiceExtension task settings serializ…
afoucret Mar 13, 2025
6322546
Better handling of block releases in the RerankOperator
afoucret Mar 13, 2025
e114548
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 13, 2025
4b88842
Fix some failing tests.
afoucret Mar 13, 2025
52fc536
RerankOperator tests
afoucret Mar 14, 2025
84ff359
RerankOperator tests
afoucret Mar 17, 2025
d86f5b2
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 17, 2025
d4d1d08
RERANK is not ready for CCQ yet
afoucret Mar 17, 2025
1fd0b14
Fixing some tests.
afoucret Mar 17, 2025
b1d9f4a
[CI] Auto commit changes from spotless
elasticsearchmachine Mar 17, 2025
ef688f7
Testing that _score is an existing column
afoucret Mar 17, 2025
b0129a3
RereankOperator refactoring
afoucret Mar 17, 2025
54ce1eb
Small test refactoring
afoucret Mar 18, 2025
ca6e972
Fix some tests.
afoucret Mar 18, 2025
bac4f9a
Fix an error in a copy/paste
afoucret Mar 18, 2025
4ef3f7e
Add missing import.
afoucret Mar 18, 2025
7e021f4
Add missing import.
afoucret Mar 18, 2025
78a7efb
[CI] Auto commit changes from spotless
elasticsearchmachine Mar 18, 2025
0e918df
Better handling of the YAML encoding.
afoucret Mar 19, 2025
5eadd6f
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 19, 2025
3444c07
Renaming function in tests.
afoucret Mar 19, 2025
ca295fb
Renaming function in tests.
afoucret Mar 19, 2025
32dbb2c
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 20, 2025
69d551f
Adding the _score column automatically if missing from previous step.
afoucret Mar 20, 2025
01c0529
Fix typo.
afoucret Mar 20, 2025
cd813fd
Add missing assertion in AnalyzerTests
afoucret Mar 20, 2025
84640cc
Continue refactoring of the RowEncoder.
afoucret Mar 20, 2025
587d947
Update docs/changelog/123074.yaml
afoucret Mar 20, 2025
83ba8fb
Rework inference resolution error.
afoucret Mar 20, 2025
2ce8ab1
Rewording.
afoucret Mar 21, 2025
ced6e5f
Improved changelog.
afoucret Mar 21, 2025
70f7ada
Improved XContentRowEncoder to support most element types correctly.
afoucret Mar 21, 2025
8e67202
Fix Javadoc.
afoucret Mar 21, 2025
c56a15e
Fix changelog
afoucret Mar 21, 2025
ee00286
Delete useless interface.
afoucret Mar 21, 2025
cffd345
Spotless fix.
afoucret Mar 21, 2025
d97701b
Ensure RERANK is done on the coordinator node.
afoucret Mar 21, 2025
53d1d2e
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 21, 2025
05e0d45
[CI] Auto commit changes from spotless
elasticsearchmachine Mar 21, 2025
dccaf4a
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 25, 2025
4cdfdfa
Regen grammar after merge.
afoucret Mar 25, 2025
7ff6b27
Update docs/changelog/123074.yaml
afoucret Mar 26, 2025
279291b
Using enum string representation for serioalization.
afoucret Mar 26, 2025
0f10c3f
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 26, 2025
04c7516
[CI] Auto commit changes from spotless
elasticsearchmachine Mar 26, 2025
5ab46d9
Fix bug introduced during merge.
afoucret Mar 26, 2025
4e17324
Reorder the Lexer tokens.
afoucret Mar 26, 2025
c998b38
Fix StatementParserTests for reranking.
afoucret Mar 26, 2025
62e42a5
[CI] Auto commit changes from spotless
elasticsearchmachine Mar 26, 2025
a151971
Improved grammar for the RERANK command.
afoucret Mar 26, 2025
f5c6a4f
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 26, 2025
3de79f1
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 28, 2025
4099079
Fix typo.
afoucret Mar 31, 2025
1a6161b
Adding additional CSV test cases.
afoucret Mar 31, 2025
983ce28
Adding more tests.
afoucret Mar 31, 2025
6627bb1
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Mar 31, 2025
8150a1f
[CI] Auto commit changes from spotless
elasticsearchmachine Mar 31, 2025
79d5581
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Apr 1, 2025
ea134a7
Muting a flaky test.
afoucret Apr 1, 2025
ef82268
Not using an OriginSettingsClient anymore in the InferenceService
afoucret Apr 1, 2025
ce38aa7
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Apr 1, 2025
132825d
InferenceService refactoring (now InferenceRunner)
afoucret Apr 1, 2025
9dee9a5
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Apr 2, 2025
d26cc76
Rerank is not a pipeline breaker anymore.
afoucret Apr 2, 2025
9e47052
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Apr 3, 2025
109a3dc
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Apr 3, 2025
39fe811
Merge branch 'main' into esql-reranker-boostrap
afoucret Apr 3, 2025
d8fe1db
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Apr 3, 2025
37e1d1c
Use double in tests.
afoucret Apr 4, 2025
4cad906
Use rounded scores in tests.
afoucret Apr 4, 2025
cf87127
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Apr 4, 2025
45ebc85
Merge branch 'main' into esql-reranker-boostrap
afoucret Apr 4, 2025
a166961
Fix serverless build.
afoucret Apr 4, 2025
e084062
Use Rest to check capability instead of constant.
afoucret Apr 4, 2025
63d83ec
Merge branch 'main' of https://github.com/elastic/elasticsearch into …
afoucret Apr 4, 2025
65aac16
Temporary deleting IT cause they are failing on serverless.
afoucret Apr 4, 2025
ef0e5c1
Add missing capability in the CSV test
afoucret Apr 4, 2025
6 changes: 6 additions & 0 deletions docs/changelog/123074.yaml
@@ -0,0 +1,6 @@
pr: 123074
summary: Adding ES|QL Reranker command in snapshot builds
area: Ranking
type: feature
issues: [124337]

@@ -62,6 +62,10 @@ public static class Request extends BaseInferenceActionRequest {
public static final ParseField QUERY = new ParseField("query");
public static final ParseField TIMEOUT = new ParseField("timeout");

public static Builder builder(String inferenceEntityId, TaskType taskType) {
return new Builder().setInferenceEntityId(inferenceEntityId).setTaskType(taskType);
}

static final ObjectParser<Request.Builder, Void> PARSER = new ObjectParser<>(NAME, Request.Builder::new);
static {
PARSER.declareStringArray(Request.Builder::setInput, INPUT);
@@ -52,6 +52,7 @@
import static org.elasticsearch.xpack.esql.action.EsqlCapabilities.Cap.JOIN_LOOKUP_V12;
import static org.elasticsearch.xpack.esql.action.EsqlCapabilities.Cap.JOIN_PLANNING_V1;
import static org.elasticsearch.xpack.esql.action.EsqlCapabilities.Cap.METADATA_FIELDS_REMOTE_TEST;
import static org.elasticsearch.xpack.esql.action.EsqlCapabilities.Cap.RERANK;
import static org.elasticsearch.xpack.esql.action.EsqlCapabilities.Cap.UNMAPPED_FIELDS;
import static org.elasticsearch.xpack.esql.qa.rest.EsqlSpecTestCase.Mode.SYNC;
import static org.mockito.ArgumentMatchers.any;
@@ -130,6 +131,8 @@ protected void shouldSkipTest(String testName) throws IOException {
assumeFalse("LOOKUP JOIN not yet supported in CCS", testCase.requiredCapabilities.contains(JOIN_LOOKUP_V12.capabilityName()));
// Unmapped fields require a correct capability response from every cluster, which isn't currently implemented.
assumeFalse("UNMAPPED FIELDS not yet supported in CCS", testCase.requiredCapabilities.contains(UNMAPPED_FIELDS.capabilityName()));
// Need to do additional development to get CCS support for the rerank command

ℹ️ CCS support is tracked as a follow-up in the meta issue (#124337)

assumeFalse("RERANK not yet supported in CCS", testCase.requiredCapabilities.contains(RERANK.capabilityName()));
}

private TestFeatureService remoteFeaturesService() throws IOException {
@@ -66,7 +66,7 @@
import static org.elasticsearch.xpack.esql.CsvTestUtils.loadCsvSpecValues;
import static org.elasticsearch.xpack.esql.CsvTestsDataLoader.availableDatasetsForEs;
import static org.elasticsearch.xpack.esql.CsvTestsDataLoader.clusterHasInferenceEndpoint;
import static org.elasticsearch.xpack.esql.CsvTestsDataLoader.createInferenceEndpoint;
import static org.elasticsearch.xpack.esql.CsvTestsDataLoader.createInferenceEndpoints;
import static org.elasticsearch.xpack.esql.CsvTestsDataLoader.deleteInferenceEndpoint;
import static org.elasticsearch.xpack.esql.CsvTestsDataLoader.loadDataSetIntoEs;
import static org.elasticsearch.xpack.esql.EsqlTestUtils.classpathResources;
@@ -130,7 +130,7 @@ protected EsqlSpecTestCase(
@Before
public void setup() throws IOException {
if (supportsInferenceTestService() && clusterHasInferenceEndpoint(client()) == false) {
createInferenceEndpoint(client());
createInferenceEndpoints(client());
}

boolean supportsLookup = supportsIndexModeLookup();
@@ -357,7 +357,7 @@ private static void loadDataSetIntoEs(
}

/** The semantic_text mapping type requires an inference endpoint that needs to be set up before creating the index. */
public static void createInferenceEndpoint(RestClient client) throws IOException {
public static void createInferenceEndpoints(RestClient client) throws IOException {
Request request = new Request("PUT", "_inference/sparse_embedding/test_sparse_inference");
request.setJsonEntity("""
{
@@ -371,6 +371,21 @@ public static void createInferenceEndpoint(RestClient client) throws IOException
}
""");
client.performRequest(request);

request = new Request("PUT", "_inference/rerank/test_reranker");
request.setJsonEntity("""
{
"service": "test_reranking_service",
"service_settings": {
"model_id": "my_model",
"api_key": "abc64"
},
"task_settings": {
"use_text_length": true

ℹ️ Using the setting use_text_length: true so that the reranker score is predictable in test instances (equal to the inverse of the length of the text sent).

}
}
""");
client.performRequest(request);
}

public static void deleteInferenceEndpoint(RestClient client) throws IOException {
@@ -11,6 +11,7 @@
import org.apache.lucene.sandbox.document.HalfFloatPoint;
import org.apache.lucene.util.BytesRef;
import org.elasticsearch.ExceptionsHelper;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.cluster.metadata.IndexNameExpressionResolver;
import org.elasticsearch.cluster.service.ClusterService;
import org.elasticsearch.common.Strings;
@@ -67,6 +68,8 @@
import org.elasticsearch.xpack.esql.expression.predicate.operator.comparison.LessThanOrEqual;
import org.elasticsearch.xpack.esql.expression.predicate.operator.comparison.NotEquals;
import org.elasticsearch.xpack.esql.index.EsIndex;
import org.elasticsearch.xpack.esql.inference.InferenceResolution;
import org.elasticsearch.xpack.esql.inference.InferenceService;
import org.elasticsearch.xpack.esql.optimizer.LogicalOptimizerContext;
import org.elasticsearch.xpack.esql.parser.QueryParam;
import org.elasticsearch.xpack.esql.plan.logical.Enrich;
@@ -148,6 +151,8 @@
import static org.hamcrest.Matchers.instanceOf;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertNull;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.doAnswer;
import static org.mockito.Mockito.mock;

public final class EsqlTestUtils {
@@ -375,9 +380,21 @@ public static LogicalOptimizerContext unboundLogicalOptimizerContext() {
null,
mock(ClusterService.class),
mock(IndexNameExpressionResolver.class),
null
null,
mockInferenceService()
);

@SuppressWarnings("unchecked")
private static InferenceService mockInferenceService() {
InferenceService inferenceService = mock(InferenceService.class);
doAnswer(i -> {
i.getArgument(1, ActionListener.class).onResponse(emptyInferenceResolution());
return null;
}).when(inferenceService).resolveInferences(any(), any());

return inferenceService;
}

private EsqlTestUtils() {}

public static Configuration configuration(QueryPragmas pragmas, String query) {
@@ -453,6 +470,10 @@ public static EnrichResolution emptyPolicyResolution() {
return new EnrichResolution();
}

public static InferenceResolution emptyInferenceResolution() {
return InferenceResolution.EMPTY;
}

public static SearchStats statsForExistingField(String... names) {
return fieldMatchingExistOrMissing(true, names);
}
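
The mockInferenceService() helper above stubs a listener-based (asynchronous) call so that it answers immediately with an empty resolution. The same idea can be sketched without Mockito. Everything in this block is hypothetical and for illustration only: Resolution and Resolver are stand-ins, not the ES|QL InferenceResolution or InferenceService classes, and a plain Consumer stands in for ActionListener.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

public class InferenceStubSketch {

    /** Hypothetical stand-in for a resolution result (ids of resolved inference endpoints). */
    static final class Resolution {
        static final Resolution EMPTY = new Resolution(Set.of());
        final Set<String> resolvedIds;

        Resolution(Set<String> resolvedIds) {
            this.resolvedIds = resolvedIds;
        }
    }

    /** Hypothetical listener-based API: the result is delivered through a callback, not returned. */
    interface Resolver {
        void resolve(Object plan, Consumer<Resolution> listener);
    }

    /** Test stub: ignores the plan and completes the listener right away with the empty resolution. */
    static Resolver emptyResolver() {
        return (plan, listener) -> listener.accept(Resolution.EMPTY);
    }

    public static void main(String[] args) {
        AtomicReference<Resolution> seen = new AtomicReference<>();
        emptyResolver().resolve("any plan", seen::set);
        // The callback has already run by the time resolve() returns.
        System.out.println("listener received EMPTY: " + (seen.get() == Resolution.EMPTY));
    }
}
```

Completing the listener synchronously keeps tests that do not exercise inference deterministic, since no thread pool or real service is involved.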
@@ -0,0 +1,95 @@
// Note:
// The "test_reranker" service scores the row from the inputText length and does not really score by relevance.
// This makes the output more predictable which is helpful here.


reranker using a single field
required_capability: rerank
required_capability: match_function

FROM books METADATA _score
| WHERE title:"war and peace" AND author:"Tolstoy"
| RERANK "war and peace" ON title WITH "test_reranker"
| KEEP book_no, title, author, _score
;

book_no:keyword | title:text | author:text | _score:double
5327 | War and Peace | Leo Tolstoy | 0.03846153989434242
4536 | War and Peace (Signet Classics) | [John Hockenberry, Leo Tolstoy, Pat Conroy] | 0.02222222276031971
9032 | War and Peace: A Novel (6 Volumes) | Tolstoy Leo | 0.02083333395421505
2776 | The Devil and Other Stories (Oxford World's Classics) | Leo Tolstoy | 0.01515151560306549
;


reranker using multiple fields
required_capability: rerank
required_capability: match_function

FROM books METADATA _score
| WHERE title:"war and peace" AND author:"Tolstoy"
| RERANK "war and peace" ON title, author WITH "test_reranker"
| KEEP book_no, title, author, _score
;

book_no:keyword | title:text | author:text | _score:double
5327 | War and Peace | Leo Tolstoy | 0.02083333395421505
9032 | War and Peace: A Novel (6 Volumes) | Tolstoy Leo | 0.014285714365541935
2776 | The Devil and Other Stories (Oxford World's Classics) | Leo Tolstoy | 0.011363636702299118
4536 | War and Peace (Signet Classics) | [John Hockenberry, Leo Tolstoy, Pat Conroy] | 0.009523809887468815
;


reranker after a limit
required_capability: rerank
required_capability: match_function

FROM books METADATA _score
| WHERE title:"war and peace" AND author:"Tolstoy"
| SORT _score DESC
| LIMIT 3
| RERANK "war and peace" ON title WITH "test_reranker"
| KEEP book_no, title, author, _score
;

book_no:keyword | title:text | author:text | _score:double
5327 | War and Peace | Leo Tolstoy | 0.03846153989434242
4536 | War and Peace (Signet Classics) | [John Hockenberry, Leo Tolstoy, Pat Conroy] | 0.02222222276031971
9032 | War and Peace: A Novel (6 Volumes) | Tolstoy Leo | 0.02083333395421505
;


reranker before a limit
required_capability: rerank
required_capability: match_function

FROM books METADATA _score
| WHERE title:"war and peace" AND author:"Tolstoy"
| RERANK "war and peace" ON title WITH "test_reranker"
| KEEP book_no, title, author, _score
| LIMIT 3
;

book_no:keyword | title:text | author:text | _score:double
5327 | War and Peace | Leo Tolstoy | 0.03846153989434242
4536 | War and Peace (Signet Classics) | [John Hockenberry, Leo Tolstoy, Pat Conroy] | 0.02222222276031971
9032 | War and Peace: A Novel (6 Volumes) | Tolstoy Leo | 0.02083333395421505
;


reranker add the _score column when missing
required_capability: rerank
required_capability: match_function

FROM books

ℹ️ Added a case where the _score field is missing, checking that it is added to the result.

| WHERE title:"war and peace" AND author:"Tolstoy"
| RERANK "war and peace" ON title WITH "test_reranker"
| KEEP book_no, title, author, _score
;


book_no:keyword | title:text | author:text | _score:double
5327 | War and Peace | Leo Tolstoy | 0.03846153989434242
4536 | War and Peace (Signet Classics) | [John Hockenberry, Leo Tolstoy, Pat Conroy] | 0.02222222276031971
9032 | War and Peace: A Novel (6 Volumes) | Tolstoy Leo | 0.02083333395421505
2776 | The Devil and Other Stories (Oxford World's Classics) | Leo Tolstoy | 0.01515151560306549
;
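
The note at the top of the csv-spec file and the review comment on use_text_length both say that the test reranker derives its score from text length rather than relevance. A minimal sketch of that behaviour follows, assuming the score is simply 1 / (length of the text sent). The class and method names are hypothetical, and the exact text the real test service receives for each row is not shown in this diff.

```java
import java.util.ArrayList;
import java.util.List;

public class LengthBasedRerankSketch {

    /** Assumed scoring rule: the inverse of the input length, so shorter texts score higher. */
    static double score(String text) {
        if (text.isEmpty()) {
            throw new IllegalArgumentException("cannot score an empty text");
        }
        return 1.0 / text.length();
    }

    /** Returns a copy of the inputs ordered by descending score. */
    static List<String> rerank(List<String> texts) {
        List<String> sorted = new ArrayList<>(texts);
        sorted.sort((a, b) -> Double.compare(score(b), score(a)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> titles = List.of(
            "War and Peace: A Novel (6 Volumes)",
            "War and Peace",
            "War and Peace (Signet Classics)"
        );
        for (String title : rerank(titles)) {
            System.out.println(score(title) + "  " + title);
        }
    }
}
```

Under this assumption shorter inputs always rank first, which would be consistent with the expected row order changing once author is added to the reranked fields.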