Commit 9bd5ddd

[DOCS] Resolves merge conflicts. (elastic#91611)
1 parent 6678164 commit 9bd5ddd

File tree: 3 files changed, +48 −37 lines

- docs/reference/ml/trained-models/apis/infer-trained-model-deployment.asciidoc
- docs/reference/ml/trained-models/apis/infer-trained-model.asciidoc
- docs/reference/ml/trained-models/apis/start-trained-model-deployment.asciidoc


docs/reference/ml/trained-models/apis/infer-trained-model-deployment.asciidoc

Lines changed: 10 additions & 9 deletions
@@ -46,8 +46,8 @@ Controls the amount of time to wait for {infer} results. Defaults to 10 seconds.
 `docs`::
 (Required, array)
 An array of objects to pass to the model for inference. The objects should
-contain a field matching your configured trained model input. Typically, the field
-name is `text_field`. Currently, only a single value is allowed.
+contain a field matching your configured trained model input. Typically, the
+field name is `text_field`. Currently, only a single value is allowed.

 ////
 [[infer-trained-model-deployment-results]]
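
For reference, the reworded `docs` description corresponds to a request shaped like the following minimal sketch, reusing the `model2` ID from the hunks below; the input text is illustrative:

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer
{
  "docs": [{"text_field": "The movie was awesome!!"}]
}
--------------------------------------------------

Because only a single value is allowed, `docs` holds exactly one object whose key matches the model's configured input field.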
@@ -62,8 +62,8 @@ name is `text_field`. Currently, only a single value is allowed.
 [[infer-trained-model-deployment-example]]
 == {api-examples-title}

-The response depends on the task the model is trained for. If it is a
-text classification task, the response is the score. For example:
+The response depends on the task the model is trained for. If it is a text
+classification task, the response is the score. For example:

 [source,console]
 --------------------------------------------------
@@ -123,8 +123,8 @@ The API returns in this case:
 ----
 // NOTCONSOLE

-Zero-shot classification tasks require extra configuration defining the class labels.
-These labels are passed in the zero-shot inference config.
+Zero-shot classification tasks require extra configuration defining the class
+labels. These labels are passed in the zero-shot inference config.

 [source,console]
 --------------------------------------------------
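
The zero-shot inference config mentioned above carries the candidate class labels. A minimal sketch, assuming a `labels` array under the `zero_shot_classification` task config; labels and document text are illustrative:

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer
{
  "docs": [{"text_field": "This is a very happy person"}],
  "inference_config": {
    "zero_shot_classification": {
      "labels": ["glad", "sad", "rad"]
    }
  }
}
--------------------------------------------------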
@@ -150,7 +150,8 @@ POST _ml/trained_models/model2/deployment/_infer
 --------------------------------------------------
 // TEST[skip:TBD]

-The API returns the predicted label and the confidence, as well as the top classes:
+The API returns the predicted label and the confidence, as well as the top
+classes:

 [source,console-result]
 ----
@@ -204,8 +205,8 @@ POST _ml/trained_models/model2/deployment/_infer
 --------------------------------------------------
 // TEST[skip:TBD]

-When the input has been truncated due to the limit imposed by the model's `max_sequence_length`
-the `is_truncated` field appears in the response.
+When the input has been truncated due to the limit imposed by the model's
+`max_sequence_length`, the `is_truncated` field appears in the response.

 [source,console-result]
 ----
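
A truncated input would then yield a response along these lines; this is a sketch assuming the response wraps results in an `inference_results` array as elsewhere in these APIs, and the predicted value is hypothetical:

[source,console-result]
----
{
  "inference_results": [
    {
      "predicted_value": "POSITIVE",
      "is_truncated": true
    }
  ]
}
----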

docs/reference/ml/trained-models/apis/infer-trained-model.asciidoc

Lines changed: 13 additions & 6 deletions
@@ -6,7 +6,11 @@
 <titleabbrev>Infer trained model</titleabbrev>
 ++++

-Evaluates a trained model. The model may be any supervised model either trained by {dfanalytics} or imported.
+Evaluates a trained model. The model may be any supervised model either trained
+by {dfanalytics} or imported.
+
+NOTE: For model deployments with caching enabled, results may be returned
+directly from the {infer} cache.

 preview::[]

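For orientation, the endpoint this file documents differs from the deployment variant above only in its path. A minimal sketch of a call, reusing the `model2` ID from the hunks below:

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/_infer
{
  "docs": [{"text_field": "The movie was awesome!!"}]
}
--------------------------------------------------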

@@ -102,7 +106,8 @@ include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-fill-mask]
 =====
 `num_top_classes`::::
 (Optional, integer)
-Number of top predicted tokens to return for replacing the mask token. Defaults to `0`.
+Number of top predicted tokens to return for replacing the mask token. Defaults
+to `0`.

 `results_field`::::
 (Optional, string)
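
As a sketch of where `num_top_classes` sits for the fill-mask task (the mask token syntax depends on the model; text and value are illustrative):

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/_infer
{
  "docs": [{"text_field": "The capital of France is [MASK]."}],
  "inference_config": {
    "fill_mask": {
      "num_top_classes": 2
    }
  }
}
--------------------------------------------------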
@@ -272,7 +277,8 @@ The maximum amount of words in the answer. Defaults to `15`.

 `num_top_classes`::::
 (Optional, integer)
-The number the top found answers to return. Defaults to `0`, meaning only the best found answer is returned.
+The number of top found answers to return. Defaults to `0`, meaning only the
+best found answer is returned.

 `question`::::
 (Required, string)
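
These options belong to the question-answering task config; a sketch combining the required `question` with `num_top_classes` (text and values illustrative):

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/_infer
{
  "docs": [{"text_field": "The Amazon rainforest covers most of the Amazon basin of South America."}],
  "inference_config": {
    "question_answering": {
      "question": "Where is the Amazon rainforest?",
      "num_top_classes": 2
    }
  }
}
--------------------------------------------------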
@@ -368,7 +374,8 @@ include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-text-classific

 `num_top_classes`::::
 (Optional, integer)
-Specifies the number of top class predictions to return. Defaults to all classes (-1).
+Specifies the number of top class predictions to return. Defaults to all classes
+(-1).

 `results_field`::::
 (Optional, string)
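
And the same option under the text-classification task config, as a sketch (values illustrative):

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/_infer
{
  "docs": [{"text_field": "This movie was awful."}],
  "inference_config": {
    "text_classification": {
      "num_top_classes": 3
    }
  }
}
--------------------------------------------------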
@@ -771,8 +778,8 @@ POST _ml/trained_models/model2/_infer
 --------------------------------------------------
 // TEST[skip:TBD]

-When the input has been truncated due to the limit imposed by the model's `max_sequence_length`
-the `is_truncated` field appears in the response.
+When the input has been truncated due to the limit imposed by the model's
+`max_sequence_length`, the `is_truncated` field appears in the response.

 [source,console-result]
 ----

docs/reference/ml/trained-models/apis/start-trained-model-deployment.asciidoc

Lines changed: 25 additions & 22 deletions
@@ -30,20 +30,20 @@ in an ingest pipeline or directly in the <<infer-trained-model>> API.
 Scaling inference performance can be achieved by setting the parameters
 `number_of_allocations` and `threads_per_allocation`.

-Increasing `threads_per_allocation` means more threads are used when
-an inference request is processed on a node. This can improve inference speed
-for certain models. It may also result in improvement to throughput.
+Increasing `threads_per_allocation` means more threads are used when an
+inference request is processed on a node. This can improve inference speed for
+certain models. It may also improve throughput.

-Increasing `number_of_allocations` means more threads are used to
-process multiple inference requests in parallel resulting in throughput
-improvement. Each model allocation uses a number of threads defined by
+Increasing `number_of_allocations` means more threads are used to process
+multiple inference requests in parallel, resulting in improved throughput.
+Each model allocation uses a number of threads defined by
 `threads_per_allocation`.

-Model allocations are distributed across {ml} nodes. All allocations assigned
-to a node share the same copy of the model in memory. To avoid
-thread oversubscription which is detrimental to performance, model allocations
-are distributed in such a way that the total number of used threads does not
-surpass the node's allocated processors.
+Model allocations are distributed across {ml} nodes. All allocations assigned to
+a node share the same copy of the model in memory. To avoid thread
+oversubscription, which is detrimental to performance, model allocations are
+distributed in such a way that the total number of used threads does not surpass
+the node's allocated processors.

 [[start-trained-model-deployment-path-params]]
 == {api-path-parms-title}
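
To make the thread accounting concrete: a deployment with `number_of_allocations: 2` and `threads_per_allocation: 4` uses 2 × 4 = 8 threads in total, and the allocations are distributed so that no node's thread usage surpasses its allocated processors. A sketch of such a start request, assuming both options are passed as query parameters (model ID reused from the other files):

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_start?number_of_allocations=2&threads_per_allocation=4
--------------------------------------------------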
@@ -57,33 +57,36 @@ include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

 `cache_size`::
 (Optional, <<byte-units,byte value>>)
-The inference cache size (in memory outside the JVM heap) per node for the model.
-The default value is the same size as the `model_size_bytes`. To disable the cache, `0b` can be provided.
+The inference cache size (in memory outside the JVM heap) per node for the
+model. The default value is the size of the model as reported by the
+`model_size_bytes` field in the <<get-trained-models-stats>>. To disable the
+cache, `0b` can be provided.

 `number_of_allocations`::
 (Optional, integer)
 The total number of allocations this model is assigned across {ml} nodes.
-Increasing this value generally increases the throughput.
-Defaults to 1.
+Increasing this value generally increases the throughput. Defaults to 1.

 `queue_capacity`::
 (Optional, integer)
 Controls how many inference requests are allowed in the queue at a time.
 Every machine learning node in the cluster where the model can be allocated
 has a queue of this size; when the number of requests exceeds the total value,
-new requests are rejected with a 429 error. Defaults to 1024. Max allowed value is 1000000.
+new requests are rejected with a 429 error. Defaults to 1024. Max allowed value
+is 1000000.

 `threads_per_allocation`::
 (Optional, integer)
-Sets the number of threads used by each model allocation during inference. This generally increases
-the speed per inference request. The inference process is a compute-bound process;
-`threads_per_allocations` must not exceed the number of available allocated processors per node.
-Defaults to 1. Must be a power of 2. Max allowed value is 32.
+Sets the number of threads used by each model allocation during inference. This
+generally increases the speed per inference request. The inference process is a
+compute-bound process; `threads_per_allocation` must not exceed the number of
+available allocated processors per node. Defaults to 1. Must be a power of 2.
+Max allowed value is 32.

 `timeout`::
 (Optional, time)
-Controls the amount of time to wait for the model to deploy. Defaults
-to 20 seconds.
+Controls the amount of time to wait for the model to deploy. Defaults to 20
+seconds.

 `wait_for`::
 (Optional, string)
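
Pulling the remaining options together, a sketch of a start request that disables the inference cache, enlarges the queue, and waits up to a minute (values illustrative; options assumed to be query parameters as above):

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_start?cache_size=0b&queue_capacity=2048&timeout=1m
--------------------------------------------------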
