Skip to content

Check for early termination in Driver #118188

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jan 15, 2025

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Dec 6, 2024

This change introduces support for periodically checking for early termination. This enables early exits in the following scenarios:

  1. The query has accumulated sufficient data (e.g., reaching the LIMIT).
  2. The query is stopped (either by users or due to failures).

Other changes will be addressed in follow-up PRs.

@dnhatn dnhatn requested review from nik9000 and smalyshev December 6, 2024 19:37
@dnhatn dnhatn added the auto-backport Automatically create backport pull requests when merged label Dec 6, 2024
@dnhatn dnhatn marked this pull request as ready for review December 6, 2024 19:38
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Dec 6, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@dnhatn dnhatn force-pushed the add-cancel-driver-context branch from 50b14e4 to 27d4a6e Compare December 6, 2024 19:41
Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we'd be better off having the Operator subclasses yield control - if they return null then the driver loop will check for them. But they have to store their intermediate state and that's annoying. This solution is more general though it's more "different" than the rest of the driver stuff works. Where are we looking at putting these?

@dnhatn dnhatn force-pushed the add-cancel-driver-context branch 2 times, most recently from 97bb696 to 376040c Compare January 9, 2025 02:21
@dnhatn dnhatn changed the title Allow check for completion or cancellation inside operator Allow check for early termination in Evaluator Jan 9, 2025
@dnhatn dnhatn force-pushed the add-cancel-driver-context branch 2 times, most recently from 71d42f1 to 6d21d28 Compare January 9, 2025 04:49
@dnhatn dnhatn force-pushed the add-cancel-driver-context branch from 930110c to 1a8f5b4 Compare January 9, 2025 06:35
@smalyshev
Copy link
Contributor

smalyshev commented Jan 9, 2025

Tested the newest version (b9deea7) locally and it seems to do the interruption mid-page properly.

@elasticsearchmachine
Copy link
Collaborator

Hi @dnhatn, I've created a changelog YAML for you.

@dnhatn dnhatn force-pushed the add-cancel-driver-context branch from 3d6d0a8 to 2d4bf70 Compare January 9, 2025 23:07
@dnhatn dnhatn force-pushed the add-cancel-driver-context branch from 2d4bf70 to b63bd27 Compare January 10, 2025 01:47
@dnhatn dnhatn changed the title Allow check for early termination in Evaluator Check for early termination in Evaluator Jan 10, 2025
@dnhatn dnhatn force-pushed the add-cancel-driver-context branch from 4256727 to fd8d5a6 Compare January 10, 2025 06:52
@@ -41,7 +41,7 @@ public static EvalOperator.ExpressionEvaluator.Factory toEvaluator(
return new AutomataMatchEvaluator.Factory(source, field, run, toDot(automaton));
}

@Evaluator
@Evaluator(executionCost = 50)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's just one example - we might not need this for all evaluators, only for the expensive ones whose inputs or outputs are BytesRef.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we almost certainly don't want it for most evaluators, but I can really see the use for bailing early for slow stuff - and, I think, all of those take or produce ByesRef. Automata match, geo stuff. Otherwise I think we're frequently dealing with more overhead than the operation. And we're quite likely to break autovectorization.

I wonder if instead of a "cost" this should be a, maybe simpler, "check for termination every n rows" style. p % <constant> == 0 is going to be pretty fast, especially if the constant is, like, 1 or 2 or 4 or 8. Hell, that feels like something loop unrolling might handle, though I don't think we're really worried about loop unrolling for these cases.

One thing that's really interesting - some of these automata matches are probably quite fast and not worth the checking. But we don't differentiate. And that's fine for now.

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we're really

@@ -41,7 +41,7 @@ public static EvalOperator.ExpressionEvaluator.Factory toEvaluator(
return new AutomataMatchEvaluator.Factory(source, field, run, toDot(automaton));
}

@Evaluator
@Evaluator(executionCost = 50)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we almost certainly don't want it for most evaluators, but I can really see the use for bailing early for slow stuff - and, I think, all of those take or produce ByesRef. Automata match, geo stuff. Otherwise I think we're frequently dealing with more overhead than the operation. And we're quite likely to break autovectorization.

I wonder if instead of a "cost" this should be a, maybe simpler, "check for termination every n rows" style. p % <constant> == 0 is going to be pretty fast, especially if the constant is, like, 1 or 2 or 4 or 8. Hell, that feels like something loop unrolling might handle, though I don't think we're really worried about loop unrolling for these cases.

One thing that's really interesting - some of these automata matches are probably quite fast and not worth the checking. But we don't differentiate. And that's fine for now.

@dnhatn dnhatn force-pushed the add-cancel-driver-context branch from 67a0e48 to cec1749 Compare January 15, 2025 17:16
@dnhatn dnhatn force-pushed the add-cancel-driver-context branch from cec1749 to c6815f4 Compare January 15, 2025 17:19
@dnhatn dnhatn changed the title Check for early termination in Evaluator Check for early termination in Driver Jan 15, 2025
@dnhatn
Copy link
Member Author

dnhatn commented Jan 15, 2025

@nik9000 Thanks for the feedback. I have a previous implementation that unrolls loops to maintain auto-vectorization (for example:

// generate a tight loop to allow vectorization
int maxBatchSize = Math.max(DriverContext.CHECK_FOR_EARLY_TERMINATION_COST_THRESHOLD / 1, 1);
for (int start = 0; start < positionCount; ) {
int end = start + Math.min(positionCount - start, maxBatchSize);
driverContext.checkForEarlyTermination();
for (int p = start; p < end; p++) {
result.appendLong(p, DateTrunc.processDatetime(fieldValVector.getLong(p), this.rounding));
}
start = end;
}
). I removed it to keep this PR small and will spin off the evaluator work to unblock Stas's progress.

@nik9000
Copy link
Member

nik9000 commented Jan 15, 2025

I removed it to keep this PR small and will spin off the evaluator work to unblock Stas's progress.

👍

@dnhatn
Copy link
Member Author

dnhatn commented Jan 15, 2025

@smalyshev @nik9000 Thanks for reviews + feedback!

@dnhatn dnhatn merged commit 1448f12 into elastic:main Jan 15, 2025
16 checks passed
@dnhatn dnhatn deleted the add-cancel-driver-context branch January 15, 2025 20:53
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jan 15, 2025
This change introduces support for periodically checking for early
termination. This enables early exits in the following scenarios:

1. The query has accumulated sufficient data (e.g., reaching the LIMIT).

2. The query is stopped (either by users or due to failures).

Other changes will be addressed in follow-up PRs.
dnhatn added a commit that referenced this pull request Jan 15, 2025
This change introduces support for periodically checking for early
termination. This enables early exits in the following scenarios:

1. The query has accumulated sufficient data (e.g., reaching the LIMIT).

2. The query is stopped (either by users or due to failures).

Other changes will be addressed in follow-up PRs.
@dnhatn
Copy link
Member Author

dnhatn commented Jan 15, 2025

Backported to 8.18 in #120238

@elastic elastic deleted a comment from elasticsearchmachine Jan 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants