[DOCS] Adds ml-cpp PRs to release notes #52158

Merged: 13 commits, Feb 11, 2020
112 changes: 62 additions & 50 deletions docs/reference/release-notes/7.6.asciidoc
@@ -76,9 +76,11 @@ Features/Ingest::

Machine Learning::
* Implement `precision` and `recall` metrics for classification evaluation {pull}49671[#49671] (issue: {issue}48759[#48759])
* [ML] Explain data frame analytics API {pull}49455[#49455]
* [ML] ML Model Inference Ingest Processor {pull}49052[#49052]
* Implement accuracy metric for multiclass classification {pull}47772[#47772] (issue: {issue}48759[#48759])
* Explain data frame analytics API {pull}49455[#49455]
* Machine learning model inference ingest processor {pull}49052[#49052]
* Implement accuracy metric for multi-class classification {pull}47772[#47772] (issue: {issue}48759[#48759])
* Add feature importance values to classification and regression results (using tree
SHapley Additive exPlanation, or SHAP) {ml-pull}857[#857]

Mapping::
* Add per-field metadata. {pull}49419[#49419] (issue: {issue}33267[#33267])
@@ -222,23 +224,43 @@ License::
* Support "enterprise" license types {pull}49223[#49223] (issue: {issue}48510[#48510])

Machine Learning::
* [ML] Add audit warning for 1000 categories found early in job {pull}51146[#51146] (issue: {issue}50749[#50749])
* [ML] Add num_top_feature_importance_values param to regression and classi… {pull}50914[#50914]
* [ML] Implement force deleting a data frame analytics job {pull}50553[#50553] (issue: {issue}48124[#48124])
* [ML] Delete unused data frame analytics state {pull}50243[#50243]
* Add audit warning for 1000 categories found early in job {pull}51146[#51146] (issue: {issue}50749[#50749])
* Add `num_top_feature_importance_values` param to regression and classification {pull}50914[#50914]
* Implement force deleting a data frame analytics job {pull}50553[#50553] (issue: {issue}48124[#48124])
* Delete unused data frame analytics state {pull}50243[#50243]
* Make each analysis report desired field mappings to be copied {pull}50219[#50219] (issue: {issue}50119[#50119])
* [ML] retry bulk indexing of state docs {pull}50149[#50149] (issue: {issue}50143[#50143])
* [ML] Persist/restore state for DFA classification {pull}50040[#50040]
* [ML] Introduce randomize_seed setting for regression and classification {pull}49990[#49990]
* Retry bulk indexing of state docs {pull}50149[#50149] (issue: {issue}50143[#50143])
* Persist/restore state for data frame analytics classification {pull}50040[#50040]
* Introduce `randomize_seed` setting for regression and classification {pull}49990[#49990]
* Pass `prediction_field_type` to C++ analytics process {pull}49861[#49861] (issue: {issue}49796[#49796])
* [ML] Add optional source filtering during data frame reindexing {pull}49690[#49690] (issue: {issue}49531[#49531])
* [ML] Add default categorization analyzer definition to ML info {pull}49545[#49545]
* [ML] Add graceful retry for anomaly detector result indexing failures {pull}49508[#49508] (issue: {issue}45711[#45711])
* Lower minimum model memory limit value from 1MB to 1kB. {pull}49227[#49227] (issue: {issue}49168[#49168])
* Throw an exception when memory usage estimation endpoint encounters empty data frame. {pull}49143[#49143] (issue: {issue}49140[#49140])
* Change format of MulticlassConfusionMatrix result to be more self-explanatory {pull}48174[#48174] (issue: {issue}46735[#46735])
* Make num_top_classes parameter's default value equal to 2 {pull}48119[#48119] (issue: {issue}46735[#46735])
* [ML] Improve model_memory_limit UX for data frame analytics jobs {pull}44699[#44699]
* Add optional source filtering during data frame reindexing {pull}49690[#49690] (issue: {issue}49531[#49531])
* Add default categorization analyzer definition to ML info {pull}49545[#49545]
* Add graceful retry for anomaly detector result indexing failures {pull}49508[#49508] (issue: {issue}45711[#45711])
* Lower minimum model memory limit value for data frame analytics jobs from 1MB to 1kB {pull}49227[#49227] (issue: {issue}49168[#49168])
* Improve `model_memory_limit` user experience for data frame analytics jobs {pull}44699[#44699]
* Improve performance of boosted tree training for both classification and regression {ml-pull}775[#775]
* Reduce the peak memory used by boosted tree training and fix an overcounting bug
estimating maximum memory usage {ml-pull}781[#781]
* Stratified fractional cross validation for regression {ml-pull}784[#784]
* Added `geo_point` supported output for `lat_long` function records {ml-pull}809[#809], {pull}47050[#47050]
* Use a random bag of the data to compute the loss function derivatives for each
new tree which is trained for both regression and classification {ml-pull}811[#811]
* Emit `prediction_probability` field alongside prediction field in ml results {ml-pull}818[#818]
* Reduce memory usage of {ml} native processes on Windows {ml-pull}844[#844]
* Reduce runtime of classification and regression {ml-pull}863[#863]
* Stop early training a classification and regression forest when the validation
error is no longer decreasing {ml-pull}875[#875]
* Emit `prediction_field_name` in data frame analytics results using the type
provided as `prediction_field_type` parameter {ml-pull}877[#877]
* Improve performance updating quantile estimates {ml-pull}881[#881]
* Migrate to use Bayesian optimisation for initial hyperparameter value line
searches and stop early if the expected improvement is too small {ml-pull}903[#903]
* Stop cross-validation early if the predicted test loss has a small chance of
being smaller than for the best parameter values found so far {ml-pull}915[#915]
* Optimize decision threshold for classification to maximize minimum class recall {ml-pull}926[#926]
* Include categorization memory usage in the `model_bytes` field in
`model_size_stats`, so that it is taken into account in node assignment
decisions {ml-pull}927[#927] (issue:{ml-issue}724[#724])

Mapping::
* Add telemetry for flattened fields. {pull}48972[#48972]
@@ -297,11 +319,11 @@ Store::
* mmap dim files in HybridDirectory {pull}49272[#49272] (issue: {issue}48509[#48509])

Transform::
* [Transform] Improve force stop robustness in case of an error {pull}51072[#51072]
* [Transform] add actual timeout in message {pull}50140[#50140]
* [Transform] automatic deletion of old checkpoints {pull}49496[#49496]
* [Transform] improve error handling of script errors {pull}48887[#48887] (issue: {issue}48467[#48467])
* [ML][Transforms] add wait_for_checkpoint flag to stop {pull}47935[#47935] (issue: {issue}45293[#45293])
* Improve force stop robustness in case of an error {pull}51072[#51072]
* Add actual timeout in message {pull}50140[#50140]
* Automatic deletion of old checkpoints {pull}49496[#49496]
* Improve error handling of script errors {pull}48887[#48887] (issue: {issue}48467[#48467])
* Add `wait_for_checkpoint` flag to stop {pull}47935[#47935] (issue: {issue}45293[#45293])



@@ -447,28 +469,21 @@ Infra/REST API::
* Slash missed in indices.put_mapping url {pull}49468[#49468]

Machine Learning::
* [ML] Fix 2 digit year regex in find_file_structure {pull}51469[#51469]
* [ML] Validate classification dependent_variable cardinality is at lea… {pull}51232[#51232]
* Fix 2 digit year regex in find_file_structure {pull}51469[#51469]
* Validate classification `dependent_variable` cardinality is at least two {pull}51232[#51232]
* Do not copy mapping from dependent variable to prediction field in regression analysis {pull}51227[#51227]
* Handle nested and aliased fields correctly when copying mapping. {pull}50918[#50918] (issue: {issue}50787[#50787])
* [ML] Fix off-by-one error in ml_classic tokenizer end offset {pull}50655[#50655]
* [ML] Improve uniqueness of result document IDs {pull}50644[#50644] (issue: {issue}50613[#50613])
* [7.x] Synchronize processInStream.close() call {pull}50581[#50581] (issue: {issue}49680[#49680])
* Fix accuracy metric {pull}50310[#50310] (issue: {issue}48759[#48759])
* Handle nested and aliased fields correctly when copying mapping {pull}50918[#50918] (issue: {issue}50787[#50787])
* Fix off-by-one error in `ml_classic` tokenizer end offset {pull}50655[#50655]
* Improve uniqueness of result document IDs {pull}50644[#50644] (issue: {issue}50613[#50613])
* Fix accuracy metric in multi-class confusion matrix {pull}50310[#50310] (issue: {issue}48759[#48759])
* Fix race condition when stopping a data frame analytics job immediately after starting it {pull}50276[#50276] (issues: {issue}49680[#49680], {issue}50177[#50177])
* Use query in cardinality check {pull}49939[#49939]
* Make only a part of `stop()` method a critical section. {pull}49756[#49756] (issue: {issue}49680[#49680])
* Fix expired job results deletion audit message {pull}49560[#49560] (issue: {issue}49549[#49549])
* [ML] Apply source query on data frame analytics memory estimation {pull}49517[#49517] (issue: {issue}49454[#49454])
* Stop timing stats failure propagation {pull}49495[#49495]
* [ML] Fix r_squared eval when variance is 0 {pull}49439[#49439]
* Blacklist a number of prediction field names. {pull}49371[#49371] (issue: {issue}48808[#48808])
* Make AnalyticsProcessManager class more robust {pull}49282[#49282] (issue: {issue}49095[#49095])
* [ML] Fixes for stop datafeed edge cases {pull}49191[#49191] (issues: {issue}43670[#43670], {issue}48931[#48931])
* [ML] Avoid NPE when node load is calculated on job assignment {pull}49186[#49186] (issue: {issue}49150[#49150])
* Do not throw exceptions resulting from persisting datafeed timing stats. {pull}49044[#49044] (issue: {issue}49032[#49032])
* [ML] Deduplicate multi-fields for data frame analytics {pull}48799[#48799] (issues: {issue}48756[#48756], {issue}48770[#48770])
* [ML] Prevent fetching multi-field from source {pull}48770[#48770] (issue: {issue}48756[#48756])
* Apply source query on data frame analytics memory estimation {pull}49517[#49517] (issue: {issue}49454[#49454])
* Fix r_squared eval when variance is 0 {pull}49439[#49439]
* Blacklist a number of prediction field names {pull}49371[#49371] (issue: {issue}48808[#48808])
* Make data frame analytics more robust for very short-lived analyses {pull}49282[#49282] (issue: {issue}49095[#49095])
* Fixes potential memory corruption when determining seasonality {ml-pull}852[#852]
* Prevent `prediction_field_name` clashing with other fields in {ml} results {ml-pull}861[#861]
* Include out-of-order as well as in-order terms in categorization reverse searches {ml-pull}950[#950] (issue:{ml-issue}949[#949])

Mapping::
* Ensure that field collapsing works with field aliases. {pull}50722[#50722] (issues: {issue}32648[#32648], {issue}50121[#50121])
@@ -532,11 +547,11 @@ Snapshot/Restore::
* Cleanup Concurrent RepositoryData Loading {pull}48329[#48329] (issue: {issue}48122[#48122])

Transform::
* [Transform] Fix mapping deduction for scaled_float {pull}51990[#51990] (issue: {issue}51780[#51780])
* [Transform] Fix stats can return old state information if security is enabled {pull}51732[#51732] (issue: {issue}51728[#51728])
* [Transform] fail to start/put on missing pipeline {pull}50701[#50701] (issue: {issue}50135[#50135])
* Fix mapping deduction for scaled_float {pull}51990[#51990] (issue: {issue}51780[#51780])
* Fix stats can return old state information if security is enabled {pull}51732[#51732] (issue: {issue}51728[#51728])
* Fail to start/put on missing pipeline {pull}50701[#50701] (issue: {issue}50135[#50135])
* Fix possible audit logging disappearance after rolling upgrade {pull}49731[#49731] (issue: {issue}49730[#49730])
* [Transform] do not fail checkpoint creation due to global checkpoint mismatch {pull}48423[#48423] (issue: {issue}48379[#48379])
* Do not fail checkpoint creation due to global checkpoint mismatch {pull}48423[#48423] (issue: {issue}48379[#48379])



@@ -549,6 +564,3 @@ Engine::

Infra/Packaging::
* Upgrade the bundled JDK to JDK 13.0.2 {pull}51511[#51511]