-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Support max_single_primary_size in Resize Action and exposed in ILM #67705
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Pinging @elastic/es-core-features (Team:Core/Features) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for opening this @gaobinlong, I took a look and left some initial thoughts.
I also think we should use max_single_shard_size
, because for this particular application, the difference between primaries and replicas is irrelevant (shrink only deals with primary shards anyway), and it makes it a bit shorter and easier to read.
if (minShardsNum > sourceIndexShardsNum) { | ||
throw new IllegalArgumentException("The target index's shards number[" + minShardsNum + | ||
"] is greater than the source index's shards number[" + sourceIndexShardsNum + "]"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should throw an error, or if it would be better to allow the shrink with the same number of shards as are currently available (basically min(sourceIndexShardsNum, minShardsNum)
)
I am on the fence about this, since it can make the ILM usage almost useless if time-based rollover is used (imagine a period of high ingestion where an index with 2 primary shards has trouble initially rolling over, but finally does at 200gb, and then tries to shrink to 50gb
shards and gets stuck in an error forever because it's trying to "shrink" to 4 shards)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a problem, I have set the target index's shards number to sourceIndexShardsNum
if minShardsNum
is greater than sourceIndexShardsNum
, I have also added a log in the code and some description in the document about this.
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/ShrinkAction.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/ShrinkAction.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/ShrinkAction.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
private CreateIndexRequest targetIndexRequest; | ||
private String sourceIndex; | ||
private ResizeType type = ResizeType.SHRINK; | ||
private Boolean copySettings = true; | ||
private ByteSizeValue maxSinglePrimaryShardSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class is missing serialization of this value (with version checks) in the StreamInput
constructor and writeTo
method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also notice this class is missing a hashCode
and equals
implementation, which means that we can never compare them. We should remedy that, but it can be done in subsequent work (doesn't have to be done in this PR)
(this is not caught because the tests need to be enhanced to check for transport serialization, that can be done as the subsequent work also)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added some serialization code in the StreamInput
constructor and writeTo
method yet, and
I'm glad to open another PR to add the missing hashCode
and equals
method.
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/ShrinkStep.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/action/admin/indices/shrink/TransportResizeAction.java
Outdated
Show resolved
Hide resolved
@dakrone @gaobinlong should we rename this to |
Yes I think we should rename it to |
OK, I will rename the parameter to |
@dakrone ,I've renamed the parameter to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this @gaobinlong, this looks much closer, I left some more comments on this! (hopefully my suggestions compile, I did not test them so they may need to be checked)
The max single primary shard size of the target index, it's used to find an optimum shards number of the target index. | ||
When this parameter is set properly, each shard's storage of the target index will not be greater than the parameter, | ||
while the shards number of the target index still be a factor of the source index's shards number, but if the parameter | ||
is less than the single shard size of the source index, the shards number of the target index will be equal to the source index's shards number. | ||
For example, when this parameter is set to `50GB`, if the source index has `60` primary shards with `100GB` storage, then the | ||
target index will have `2` primary shards, each shard's storage is `50GB`; if the source index has `60` primary shards | ||
with `1000GB` storage, then the target index will have `20` primary shards; if the source index has `60` primary shards | ||
with `4000GB` storage, then the target index will still have `60` primary shards. This parameter is conflict | ||
with `number_of_shards` in the `settings` parameter, either of them can be set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The max single primary shard size of the target index, it's used to find an optimum shards number of the target index. | |
When this parameter is set properly, each shard's storage of the target index will not be greater than the parameter, | |
while the shards number of the target index still be a factor of the source index's shards number, but if the parameter | |
is less than the single shard size of the source index, the shards number of the target index will be equal to the source index's shards number. | |
For example, when this parameter is set to `50GB`, if the source index has `60` primary shards with `100GB` storage, then the | |
target index will have `2` primary shards, each shard's storage is `50GB`; if the source index has `60` primary shards | |
with `1000GB` storage, then the target index will have `20` primary shards; if the source index has `60` primary shards | |
with `4000GB` storage, then the target index will still have `60` primary shards. This parameter is conflict | |
with `number_of_shards` in the `settings` parameter, either of them can be set. | |
The max single primary shard size for the target index. Used to find the optimum number of shards for the target index. | |
When this parameter is set, each shard's storage in the target index will not be greater than the parameter. | |
The shards count of the target index will still be a factor of the source index's shards count, but if the parameter | |
is less than the single shard size in the source index, the shards count for the target index will be equal to the source index's shard count. | |
For example, when this parameter is set to 50gb, if the source index has 60 primary shards with totaling 100gb, then the | |
target index will have 2 primary shards, with each shard size of 50gb; if the source index has 60 primary shards | |
totaling 1000gb, then the target index will have 20 primary shards; if the source index has 60 primary shards | |
totaling 4000gb, then the target index will still have 60 primary shards. This parameter conflicts | |
with `number_of_shards` in the `settings`, only one of them may be set. |
"warm": { | ||
"actions": { | ||
"shrink" : { | ||
"max_single_primary_size": "50GB" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"max_single_primary_size": "50GB" | |
"max_single_primary_size": "50gb" |
The max single primary shard size of the target index, it's used to find an optimum shards number of the target index. | ||
When this parameter is set properly, each shard's storage of the target index will not be greater than the parameter, | ||
while the shards number of the target index still be a factor of the source index's shards number, but if the parameter | ||
is less than the single shard size of the source index, the shards number of the target index will be equal to the source index's shards number. | ||
For example, when this parameter is set to `50GB`, if the source index has `60` primary shards with `100GB` storage, then the | ||
target index will have `2` primary shards, each shard's storage is `50GB`; if the source index has `60` primary shards | ||
with `1000GB` storage, then the target index will have `20` primary shards; if the source index has `60` primary shards | ||
with `4000GB` storage, then the target index will still have `60` primary shards. This parameter is conflict | ||
with `number_of_shards` in the `settings` parameter, either of them can be set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this I think you can copy the changes made in the previous doc (rather than duplicate them here)
boolean hasMaxSinglePrimarySize = maxSinglePrimarySize != null; | ||
out.writeBoolean(hasMaxSinglePrimarySize); | ||
if (hasMaxSinglePrimarySize) { | ||
maxSinglePrimarySize.writeTo(out); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified to
boolean hasMaxSinglePrimarySize = maxSinglePrimarySize != null; | |
out.writeBoolean(hasMaxSinglePrimarySize); | |
if (hasMaxSinglePrimarySize) { | |
maxSinglePrimarySize.writeTo(out); | |
} | |
out.writeOptionalWriteable(maxSinglePrimarySize); |
logger.info("By setting max_single_primary_size to [" + maxSinglePrimarySize.toString() + | ||
"], the target index [" + targetIndexName + "] will contain [" + minShardsNum + | ||
"] shards, which will be greater than [" + sourceIndexShardsNum + | ||
"] shards of the source index [" + sourceMetadata.getIndex().getName() + | ||
"], use [" + sourceIndexShardsNum + "] as the shards number of the target index [" + targetIndexName + "]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be cleaned up using the {}
substitution
logger.info("By setting max_single_primary_size to [" + maxSinglePrimarySize.toString() + | |
"], the target index [" + targetIndexName + "] will contain [" + minShardsNum + | |
"] shards, which will be greater than [" + sourceIndexShardsNum + | |
"] shards of the source index [" + sourceMetadata.getIndex().getName() + | |
"], use [" + sourceIndexShardsNum + "] as the shards number of the target index [" + targetIndexName + "]"); | |
logger.info("By setting max_single_primary_size to [{}], the target index [{}] will contain [{}] shards, which will be greater than [{}] shards in the source index [{}], using [{}] for the shard count of the target index [{}]", | |
maxSinglePrimarySize.toString(), targetIndexName, minShardsNum, sourceIndexShardsNum, sourceMetadata.getIndex().getName(), sourceIndexShardsNum, targetIndexName); |
@elasticmachine test this please |
@elasticmachine ok to test |
Thanks for iterating on this @gaobinlong! It looks good to me! |
@elasticmachine update branch |
7.x backport in #68321 |
…67705) (#68321) Co-authored-by: bellengao <[email protected]>
Adjust ResizeRequest and ShrinkAction serialization now that elastic#68321 has been merged, and re-enable BWC testing. Related to elastic#67705, elastic#68321, and elastic#68329
Relates to #65714.
The main changes are:
max_single_primary_size
parameter in Resize Action, it's used to calculate an appropriate shards number of the target index when executing the Shrink API.max_single_primary_size
parameter in ILM's Shrink Action.