@@ -50,6 +50,25 @@ IMPORTANT: If you want to use {ml} features in your cluster, you must have
 `xpack.ml.enabled` set to `true` on all master-eligible nodes. This is the
 default behavior.
 
+ `xpack.ml.inference_model.cache_size`::
+ The maximum size allowed for the inference cache. The inference cache exists in
+ the JVM heap on each ingest node and enables faster processing times for the
+ `inference` processor. The value can be a static byte-size value (for example,
+ `2gb`) or a percentage of the total allocated heap. Defaults to `40%`.
+
+ `xpack.ml.inference_model.time_to_live`::
+ The time to live (TTL) for models in the inference model cache. The TTL is
+ calculated from the last access. The `inference` processor attempts to load the
+ model from the cache. If the `inference` processor does not receive any
+ documents for the duration of the TTL, the referenced model is flagged for
+ eviction from the cache. If a document is processed later, the model is loaded
+ into the cache again. Defaults to `5m`.
+
+ `xpack.ml.max_inference_processors` (<<cluster-update-settings,Dynamic>>)::
+ The total number of `inference` type processors allowed across all ingest
+ pipelines. Once this limit is reached, no new `inference` processors can be
+ added to any pipeline. Defaults to `50`.
+
`xpack.ml.max_machine_memory_percent` (<<cluster-update-settings,Dynamic>>)::
The maximum percentage of the machine's memory that {ml} may use for running
analytics processes. (These processes are separate from the {es} JVM.) Defaults to
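Taken together, the settings added in this diff might be configured as follows. This is an illustrative sketch of an `elasticsearch.yml` fragment using non-default values chosen for the example, assuming the standard {es} byte-size and time-unit value formats:

```yaml
# Hypothetical elasticsearch.yml fragment (illustrative values, not defaults)

# Cap the inference model cache at 1 GB of JVM heap on each ingest node;
# a heap percentage such as "30%" is also accepted.
xpack.ml.inference_model.cache_size: 1gb

# Evict a cached model after 10 minutes without any documents referencing it.
xpack.ml.inference_model.time_to_live: 10m

# Allow at most 25 `inference` processors across all ingest pipelines.
xpack.ml.max_inference_processors: 25
```

Note that only `xpack.ml.max_inference_processors` is marked dynamic, so it can also be changed at runtime through the cluster update settings API; the two cache settings are node settings and take effect on node restart.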