Skip to content

Skip cluster state serialization to closed channel #67413

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

DaveCTurner
Copy link
Contributor

Today if a client requests a cluster state and then closes the
connection then we still do all the work of computing and serializing
the cluster state before finally dropping it all on the floor.

With this commit we introduce checks to make sure that the HTTP channel
is still open before starting the serialization process. We also make
the tasks themselves cancellable and abort any ongoing waiting if the
channel is closed (mainly to make the cancellability testing easier).
Finally we introduce a more detailed description of the task to help
identify cases where clients are inefficiently requesting more
components of the cluster state than they need.

Today if a client requests a cluster state and then closes the
connection then we still do all the work of computing and serializing
the cluster state before finally dropping it all on the floor.

With this commit we introduce checks to make sure that the HTTP channel
is still open before starting the serialization process. We also make
the tasks themselves cancellable and abort any ongoing waiting if the
channel is closed (mainly to make the cancellability testing easier).
Finally we introduce a more detailed description of the task to help
identify cases where clients are inefficiently requesting more
components of the cluster state than they need.
@DaveCTurner DaveCTurner added >enhancement :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v8.0.0 v7.12.0 labels Jan 13, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Jan 13, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Copy link
Member

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question, will give the test another read otherwise but otherwise looks nice :)

@@ -138,13 +150,16 @@ public RestResponse buildResponse(final ClusterStateResponse response,
builder.field(Fields.CLUSTER_NAME, response.getClusterName().value());
ToXContent.Params params = new ToXContent.DelegatingMapParams(
singletonMap(Metadata.CONTEXT_MODE_PARAM, Metadata.CONTEXT_MODE_API), request);
response.getState().toXContent(builder, params);
final ClusterState responseState = response.getState();
if (responseState != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only question pretty much on this one: why can this be null all of a sudden?

Copy link
Contributor Author

@DaveCTurner DaveCTurner Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was always thus:

listener.onResponse(new ClusterStateResponse(state.getClusterName(), null, true));

TBF this only happens if it times out waiting for a particular metadata version, which in practice means the client is CCR and therefore not going via the REST layer anyway.

Copy link
Member

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)

@DaveCTurner DaveCTurner merged commit 31cf058 into elastic:master Jan 13, 2021
@DaveCTurner DaveCTurner deleted the 2021-01-13-cancel-cluster-state-requests branch January 13, 2021 13:24
DaveCTurner added a commit that referenced this pull request Jan 13, 2021
Today if a client requests a cluster state and then closes the
connection then we still do all the work of computing and serializing
the cluster state before finally dropping it all on the floor.

With this commit we introduce checks to make sure that the HTTP channel
is still open before starting the serialization process. We also make
the tasks themselves cancellable and abort any ongoing waiting if the
channel is closed (mainly to make the cancellability testing easier).
Finally we introduce a more detailed description of the task to help
identify cases where clients are inefficiently requesting more
components of the cluster state than they need.
DaveCTurner added a commit that referenced this pull request Jan 13, 2021
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Jan 13, 2021
Today if a client requests a cluster state and then closes the
connection then we still do all the work of computing and serializing
the cluster state before finally dropping it all on the floor.

With this commit we introduce checks to make sure that the HTTP channel
is still open before starting the serialization process. We also make
the tasks themselves cancellable and abort any ongoing waiting if the
channel is closed (mainly to make the cancellability testing easier).
Finally we introduce a more detailed description of the task to help
identify cases where clients are inefficiently requesting more
components of the cluster state than they need.

Backport of elastic#67413
DaveCTurner added a commit that referenced this pull request Jan 13, 2021
Today if a client requests a cluster state and then closes the
connection then we still do all the work of computing and serializing
the cluster state before finally dropping it all on the floor.

With this commit we introduce checks to make sure that the HTTP channel
is still open before starting the serialization process. We also make
the tasks themselves cancellable and abort any ongoing waiting if the
channel is closed (mainly to make the cancellability testing easier).
Finally we introduce a more detailed description of the task to help
identify cases where clients are inefficiently requesting more
components of the cluster state than they need.

Backport of #67413
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Feb 16, 2021
A small followup to elastic#67413 and elastic#68965: the underlying actions of the
`GET /_cat/segments` API are now cancellable, so we may as well cancel
them if needed.
DaveCTurner added a commit that referenced this pull request Feb 16, 2021
A small followup to #67413 and #68965: the underlying actions of the
`GET /_cat/segments` API are now cancellable, so we may as well cancel
them if needed.
DaveCTurner added a commit that referenced this pull request Feb 16, 2021
A small followup to #67413 and #68965: the underlying actions of the
`GET /_cat/segments` API are now cancellable, so we may as well cancel
them if needed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.12.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants