Skip to content

Cheaper snapshot-related toString() impls #121283

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

DaveCTurner
Copy link
Contributor

If the MasterService needs to log a create-snapshot task description
then it will call CreateSnapshotTask#toString, which today calls
RepositoryData#toString which is not overridden so ends up calling
RepositoryData#hashCode. This can be extraordinarily expensive in a
large repository. Worse, if there's masses of create-snapshot tasks to
execute then it'll do this repeatedly, because each one only ends up
yielding a short hex string so we don't reach the description length
limit very easily.

With this commit we provide a more efficient implementation of
CreateSnapshotTask#toString and also override
RepositoryData#toString to protect against some other caller running
into the same issue.

If the `MasterService` needs to log a create-snapshot task description
then it will call `CreateSnapshotTask#toString`, which today calls
`RepositoryData#toString` which is not overridden so ends up calling
`RepositoryData#hashCode`. This can be extraordinarily expensive in a
large repository. Worse, if there's masses of create-snapshot tasks to
execute then it'll do this repeatedly, because each one only ends up
yielding a short hex string so we don't reach the description length
limit very easily.

With this commit we provide a more efficient implementation of
`CreateSnapshotTask#toString` and also override
`RepositoryData#toString` to protect against some other caller running
into the same issue.
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination Meta label for Distributed Coordination team label Jan 30, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@elasticsearchmachine
Copy link
Collaborator

Hi @DaveCTurner, I've created a changelog YAML for you.

@DaveCTurner DaveCTurner added auto-backport Automatically create backport pull requests when merged auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) labels Jan 30, 2025
@elasticsearchmachine elasticsearchmachine merged commit aac1409 into elastic:main Jan 30, 2025
17 checks passed
@DaveCTurner DaveCTurner deleted the 2025/01/30/cheaper-snapshot-tostrings branch January 30, 2025 16:01
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Jan 30, 2025
If the `MasterService` needs to log a create-snapshot task description
then it will call `CreateSnapshotTask#toString`, which today calls
`RepositoryData#toString` which is not overridden so ends up calling
`RepositoryData#hashCode`. This can be extraordinarily expensive in a
large repository. Worse, if there's masses of create-snapshot tasks to
execute then it'll do this repeatedly, because each one only ends up
yielding a short hex string so we don't reach the description length
limit very easily.

With this commit we provide a more efficient implementation of
`CreateSnapshotTask#toString` and also override
`RepositoryData#toString` to protect against some other caller running
into the same issue.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Jan 30, 2025
If the `MasterService` needs to log a create-snapshot task description
then it will call `CreateSnapshotTask#toString`, which today calls
`RepositoryData#toString` which is not overridden so ends up calling
`RepositoryData#hashCode`. This can be extraordinarily expensive in a
large repository. Worse, if there's masses of create-snapshot tasks to
execute then it'll do this repeatedly, because each one only ends up
yielding a short hex string so we don't reach the description length
limit very easily.

With this commit we provide a more efficient implementation of
`CreateSnapshotTask#toString` and also override
`RepositoryData#toString` to protect against some other caller running
into the same issue.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Jan 30, 2025
If the `MasterService` needs to log a create-snapshot task description
then it will call `CreateSnapshotTask#toString`, which today calls
`RepositoryData#toString` which is not overridden so ends up calling
`RepositoryData#hashCode`. This can be extraordinarily expensive in a
large repository. Worse, if there's masses of create-snapshot tasks to
execute then it'll do this repeatedly, because each one only ends up
yielding a short hex string so we don't reach the description length
limit very easily.

With this commit we provide a more efficient implementation of
`CreateSnapshotTask#toString` and also override
`RepositoryData#toString` to protect against some other caller running
into the same issue.
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.17
8.16
8.18
8.19 The branch "8.19" is invalid or doesn't exist

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 121283

DaveCTurner added a commit that referenced this pull request Jan 30, 2025
If the `MasterService` needs to log a create-snapshot task description
then it will call `CreateSnapshotTask#toString`, which today calls
`RepositoryData#toString` which is not overridden so ends up calling
`RepositoryData#hashCode`. This can be extraordinarily expensive in a
large repository. Worse, if there's masses of create-snapshot tasks to
execute then it'll do this repeatedly, because each one only ends up
yielding a short hex string so we don't reach the description length
limit very easily.

With this commit we provide a more efficient implementation of
`CreateSnapshotTask#toString` and also override
`RepositoryData#toString` to protect against some other caller running
into the same issue.
DaveCTurner added a commit that referenced this pull request Jan 30, 2025
If the `MasterService` needs to log a create-snapshot task description
then it will call `CreateSnapshotTask#toString`, which today calls
`RepositoryData#toString` which is not overridden so ends up calling
`RepositoryData#hashCode`. This can be extraordinarily expensive in a
large repository. Worse, if there's masses of create-snapshot tasks to
execute then it'll do this repeatedly, because each one only ends up
yielding a short hex string so we don't reach the description length
limit very easily.

With this commit we provide a more efficient implementation of
`CreateSnapshotTask#toString` and also override
`RepositoryData#toString` to protect against some other caller running
into the same issue.
@DaveCTurner
Copy link
Contributor Author

Missing backports:

elasticsearchmachine pushed a commit that referenced this pull request Jan 30, 2025
If the `MasterService` needs to log a create-snapshot task description
then it will call `CreateSnapshotTask#toString`, which today calls
`RepositoryData#toString` which is not overridden so ends up calling
`RepositoryData#hashCode`. This can be extraordinarily expensive in a
large repository. Worse, if there's masses of create-snapshot tasks to
execute then it'll do this repeatedly, because each one only ends up
yielding a short hex string so we don't reach the description length
limit very easily.

With this commit we provide a more efficient implementation of
`CreateSnapshotTask#toString` and also override
`RepositoryData#toString` to protect against some other caller running
into the same issue.
elasticsearchmachine pushed a commit that referenced this pull request Jan 30, 2025
If the `MasterService` needs to log a create-snapshot task description
then it will call `CreateSnapshotTask#toString`, which today calls
`RepositoryData#toString` which is not overridden so ends up calling
`RepositoryData#hashCode`. This can be extraordinarily expensive in a
large repository. Worse, if there's masses of create-snapshot tasks to
execute then it'll do this repeatedly, because each one only ends up
yielding a short hex string so we don't reach the description length
limit very easily.

With this commit we provide a more efficient implementation of
`CreateSnapshotTask#toString` and also override
`RepositoryData#toString` to protect against some other caller running
into the same issue.
elasticsearchmachine pushed a commit that referenced this pull request Jan 30, 2025
If the `MasterService` needs to log a create-snapshot task description
then it will call `CreateSnapshotTask#toString`, which today calls
`RepositoryData#toString` which is not overridden so ends up calling
`RepositoryData#hashCode`. This can be extraordinarily expensive in a
large repository. Worse, if there's masses of create-snapshot tasks to
execute then it'll do this repeatedly, because each one only ends up
yielding a short hex string so we don't reach the description length
limit very easily.

With this commit we provide a more efficient implementation of
`CreateSnapshotTask#toString` and also override
`RepositoryData#toString` to protect against some other caller running
into the same issue.
@DaveCTurner
Copy link
Contributor Author

For the record, I saw a cluster hit this issue with ~10k create-snapshot tasks in its queue thanks to ILM, ~4k snapshots in its repository and ~20k indices. It took about 4-and-a-half hours to generate the log message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed Coordination Meta label for Distributed Coordination team v8.16.4 v8.17.2 v8.18.1 v8.19.0 v9.0.1 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants