[ML] Handle new `actual_memory_usage_bytes` field in model size stats. #126256
The `actual_memory_usage_bytes` field represents the real, physical memory allocated to the `autodetect` process as reported by the OS. Reporting this value in the model size stats associated with an AD job is useful, especially in OOM situations.
Hi @edsavage, I've created a changelog YAML for you.
Pinging @elastic/ml-core (Team:ML)
LGTM
Please can you add a test that the new field appears in the Get model stats API? Add it to this test:
elasticsearch/x-pack/plugin/src/yamlRestTest/resources/rest-api-spec/test/ml/jobs_get_stats.yml
Line 104 in f1b1983
- gte: { jobs.0.model_size_stats.model_bytes: 0 }
/**
 * Reference to the most recent Ml config version.
 * This should be the Ml config version with the highest id.
 */
- public static final MlConfigVersion CURRENT = V_12;
+ public static final MlConfigVersion CURRENT = V_13;
Is the config version change necessary? Model size stats are considered to be a result type, not a config type, and are stored in the `.ml-anomalies-x` indices.
`ModelSizeStats` has 2 parsers: a strict parser that errors if it does not recognise a field, and a lenient parser that ignores unknown fields. The strict parser is used when reading the output from `autodetect`, and the lenient parser is used when reading the documents from Elasticsearch. This way, in a mixed-version cluster, if an upgraded node stores the stats doc with the `actual_memory_usage_bytes` field, an old node reading the doc won't error when it sees the unknown `actual_memory_usage_bytes` field. This mechanism works well for backwards compatibility and means that most of the time versioning is not required.
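To illustrate the strict/lenient distinction described above, here is a minimal, self-contained sketch. It does not use the real Elasticsearch parser classes. The class `ParserSketch`, the `KNOWN_FIELDS` set, and the `parse` helper are hypothetical names chosen for illustration, and the document is modelled as a plain `Map`. It only shows why an old node that knows nothing about `actual_memory_usage_bytes` can still read a newer stats document when its parser ignores unknown fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class ParserSketch {
    // Hypothetical subset of the fields an "old" node knows about.
    // Note that actual_memory_usage_bytes is deliberately absent.
    static final Set<String> KNOWN_FIELDS = Set.of("job_id", "model_bytes");

    // Copies the known fields out of the document.
    // With ignoreUnknownFields == false (strict), an unknown field is an error.
    // With ignoreUnknownFields == true (lenient), an unknown field is skipped.
    static Map<String, Object> parse(Map<String, Object> doc, boolean ignoreUnknownFields) {
        Map<String, Object> parsed = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : doc.entrySet()) {
            if (KNOWN_FIELDS.contains(entry.getKey())) {
                parsed.put(entry.getKey(), entry.getValue());
            } else if (!ignoreUnknownFields) {
                throw new IllegalArgumentException("unknown field [" + entry.getKey() + "]");
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        // A stats document as an upgraded node might store it.
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("job_id", "my-job");
        doc.put("model_bytes", 1024L);
        doc.put("actual_memory_usage_bytes", 2048L);

        // Lenient parse: the unknown field is silently dropped.
        System.out.println("lenient: " + parse(doc, true));

        // Strict parse: the unknown field is rejected.
        try {
            parse(doc, false);
        } catch (IllegalArgumentException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}
```

In this sketch the lenient path corresponds to reading documents back from the results index, while the strict path corresponds to reading process output, where an unexpected field more likely signals a real mismatch.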
TIL, Thanks Dave!
Relates elastic/ml-cpp#2846