Skip to content

Frozen default cache size #71844

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

henningandersen
Copy link
Contributor

@henningandersen henningandersen commented Apr 19, 2021

This commit adds a default cache size to frozen tier of the greater of
90% and total disk size minus 100 GB.

Will follow-up with two other items:

  1. Fail nodes on startup when frozen cache has a size on multiple data paths. The tricky bit here is mainly the test framework. (added in Reject multiple data paths on frozen capable nodes #71896)
  2. Add a frozen specific flood stage setting that has a max headroom of 20GB (added in Add separate flood stage limit for frozen #71855)

This commit adds a default cache size to frozen tier of the greater of
90% and total disk size minus 100 GB.

Additionally, configuring a frozen cache is now warned against on nodes
with multiple data paths.
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Apr 19, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@henningandersen henningandersen requested a review from ywelsch April 19, 2021 15:00
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Apr 19, 2021
Dedicated frozen nodes can survive less headroom than other data nodes.
This commits introduces a separate flood stage threshold for frozen as
well as an accompanying max_headroom setting that caps the amount of
free space necessary on frozen.

Relates elastic#71844
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Apr 19, 2021
Some functionality will no longer work with multiple data paths and in
order to run integration tests for that, we need the capability to
force a single data path for those tests.

Relates elastic#71844
@henningandersen
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/2

henningandersen added a commit that referenced this pull request Apr 20, 2021
Some functionality will no longer work with multiple data paths and in
order to run integration tests for that, we need the capability to
force a single data path for those tests.

Relates #71844
henningandersen added a commit that referenced this pull request Apr 20, 2021
Some functionality will no longer work with multiple data paths and in
order to run integration tests for that, we need the capability to
force a single data path for those tests.

Relates #71844
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Apr 20, 2021
Nodes with a frozen cache no longer supports multiple data paths. This
simplifies cache sizing and avoids the need to support multiple cache
files.

Relates elastic#71844
@henningandersen henningandersen requested a review from tlrx April 20, 2021 07:53
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Apr 20, 2021
Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left only minor comments, o.w. looking good.

<<data-frozen-node,`data_frozen`>> role.

IMPORTANT: You can only configure these settings on nodes that use a single
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps more of a question to the docs team: I wonder if we should have two "important" notices following each other, or whether it should be all folded into one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to find other places with two, but found none and did find some where it could have been split into two. So I collapsed it to one in 1b8ede4

Setting.Property.NodeScope
);

private static boolean isDedicatedFrozen(Settings settings) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised this method did not already exist.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add this method to DiscoveryNode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I centralized this to DiscoveryNode in 616940c

this.cacheSize = SNAPSHOT_CACHE_SIZE_SETTING.get(settings).getBytes();
FsInfo.Path pathInfo;
try {
pathInfo = FsProbe.getFSInfo(environment.nodePaths()[0]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

environment.nodeDataPaths()[0] here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 7280ae8

Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

must have one or more nodes with a shared cache available. By default,
dedicated frozen data tier nodes (nodes with the `data_frozen` role and no other
data roles) have a shared cache configured using the greater of 90% of total
disk space and total disk space subtracted a headroom of 100GB.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if inserting an example here would help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, added examples in 53f5982

<<data-frozen-node,`data_frozen`>> role.

IMPORTANT: You can only configure these settings on nodes that use a single
<<path-settings,data path>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<<path-settings,data path>>
<<path-settings,data path>>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this when turning it into one IMPORTANT note in 1b8ede4

double value = (double) randomIntBetween(1, 100) / 100;
RelativeByteSizeValue parsed = RelativeByteSizeValue.parseRelativeByteSizeValue(Double.toString(value), "test");
assertThat(parsed.getRatio().getAsRatio(),
equalTo(value));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: fits on the same line as the previous one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, see 3160c5f

Setting.Property.NodeScope
);

private static boolean isDedicatedFrozen(Settings settings) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add this method to DiscoveryNode?

@henningandersen henningandersen requested a review from ywelsch April 20, 2021 11:46
Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

try {
totalFsSize = FsProbe.getTotal(Environment.getFileStore(environment.nodeDataPaths()[0]));
} catch (IOException e) {
throw new IllegalStateException("unable to probe size of filesystem [" + environment.nodePaths()[0] + "]");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also change to environment.nodeDataPaths

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, sorry for missing that one 1b3e031

henningandersen added a commit that referenced this pull request Apr 20, 2021
Nodes with a frozen cache no longer supports multiple data paths. This
simplifies cache sizing and avoids the need to support multiple cache
files.

Relates #71844
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Apr 20, 2021
Nodes with a frozen cache no longer supports multiple data paths. This
simplifies cache sizing and avoids the need to support multiple cache
files.

Relates elastic#71844
henningandersen added a commit that referenced this pull request Apr 20, 2021
Dedicated frozen nodes can survive less headroom than other data nodes.
This commits introduces a separate flood stage threshold for frozen as
well as an accompanying max_headroom setting that caps the amount of
free space necessary on frozen.

Relates #71844
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Apr 20, 2021
Dedicated frozen nodes can survive less headroom than other data nodes.
This commits introduces a separate flood stage threshold for frozen as
well as an accompanying max_headroom setting that caps the amount of
free space necessary on frozen.

Relates elastic#71844
henningandersen added a commit that referenced this pull request Apr 20, 2021
Nodes with a frozen cache no longer supports multiple data paths. This
simplifies cache sizing and avoids the need to support multiple cache
files.

Relates #71844
henningandersen added a commit that referenced this pull request Apr 20, 2021
Dedicated frozen nodes can survive less headroom than other data nodes.
This commits introduces a separate flood stage threshold for frozen as
well as an accompanying max_headroom setting that caps the amount of
free space necessary on frozen.

Relates #71844
@henningandersen henningandersen merged commit c57fbe8 into elastic:master Apr 20, 2021
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Apr 20, 2021
This commit adds a default cache size to frozen tier of the greater of
90% and total disk size minus 100 GB.
henningandersen added a commit that referenced this pull request Apr 20, 2021
This commit adds a default cache size to frozen tier of the greater of
90% and total disk size minus 100 GB.
jrodewig added a commit that referenced this pull request May 5, 2021
Changes:

* Renames 'full copy searchable snapshot' to 'fully mounted index.'
* Renames 'shared cache searchable snapshot' to 'partially mounted index.'
* Removes some unneeded cache setup instructions for the frozen tier. We added a default cache size with #71844.
jrodewig added a commit that referenced this pull request May 5, 2021
Changes:

* Renames 'full copy searchable snapshot' to 'fully mounted index.'
* Renames 'shared cache searchable snapshot' to 'partially mounted index.'
* Removes some unneeded cache setup instructions for the frozen tier. We added a default cache size with #71844.
jrodewig added a commit that referenced this pull request May 5, 2021
Changes:

* Renames 'full copy searchable snapshot' to 'fully mounted index.'
* Renames 'shared cache searchable snapshot' to 'partially mounted index.'
* Removes some unneeded cache setup instructions for the frozen tier. We added a default cache size with #71844.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.13.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants