@@ -50,6 +50,25 @@ IMPORTANT: If you want to use {ml} features in your cluster, you must have
 `xpack.ml.enabled` set to `true` on all master-eligible nodes. This is the
 default behavior.
 
+ `xpack.ml.inference_model.cache_size`::
+ The maximum size allowed for the inference cache. The inference cache exists in
+ the JVM heap on each ingest node and enables faster processing times for the
+ `inference` processor. The value can be a static byte-size value (for example,
+ `2gb`) or a percentage of the total allocated heap. Defaults to `40%`.
+
+ `xpack.ml.inference_model.time_to_live`::
+ The time to live (TTL) for models in the inference model cache. The TTL is
+ calculated from the last access. The `inference` processor attempts to load the
+ model from the cache. If the `inference` processor does not receive any
+ documents for the duration of the TTL, the referenced model is flagged for
+ eviction from the cache. If a document is processed later, the model is loaded
+ into the cache again. Defaults to `5m`.
+
+ `xpack.ml.max_inference_processors` (<<cluster-update-settings,Dynamic>>)::
+ The total number of `inference` type processors allowed across all ingest
+ pipelines. Once this limit is reached, no new `inference` processors can be
+ added to any pipeline. Defaults to `50`.
+
`xpack.ml.max_machine_memory_percent` (<<cluster-update-settings,Dynamic>>)::
The maximum percentage of the machine's memory that {ml} may use for running
analytics processes. (These processes are separate from the {es} JVM.) Defaults to
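Taken together, the settings added in this diff might be configured as follows. This is an illustrative sketch of an `elasticsearch.yml` fragment using non-default values chosen for the example, assuming the standard {es} byte-size and time-unit value formats:

```yaml
# Hypothetical elasticsearch.yml fragment (illustrative values, not defaults)

# Cap the inference model cache at 1 GB of JVM heap on each ingest node;
# a heap percentage such as "30%" is also accepted.
xpack.ml.inference_model.cache_size: 1gb

# Evict a cached model after 10 minutes without any documents referencing it.
xpack.ml.inference_model.time_to_live: 10m

# Allow at most 25 `inference` processors across all ingest pipelines.
xpack.ml.max_inference_processors: 25
```

Note that only `xpack.ml.max_inference_processors` is marked dynamic, so it can also be changed at runtime through the cluster update settings API; the two cache settings are node settings and take effect on node restart.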