@@ -102,6 +102,7 @@ in {kib} or the {ref}/put-dfanalytics.html[create {dfanalytics-jobs}] API.
[role="screenshot"]
image::images/flights-classification-job-1.jpg["Creating a {dfanalytics-job} in {kib}"]
+ --

.. Choose `kibana_sample_data_flights` as the source index.
.. Choose `classification` as the job type.
@@ -111,6 +112,23 @@ want to predict with the {classanalysis}.
excluded fields. These fields will be excluded from the analysis. It is
recommended to exclude fields that either contain erroneous data or describe the
`dependent_variable`.
+ +
+ --
+ The wizard includes a scatterplot matrix, which enables you to explore the
+ relationships between the numeric fields. The color of each point is affected by
+ the value of the dependent variable for that document, as shown in the legend.
+ You can use this matrix to help you decide which fields to include or exclude
+ from the analysis.
+
+ [role="screenshot"]
+ image::images/flights-classification-scatterplot.png["A scatterplot matrix for three fields in {kib}"]
+
+ If you want these charts to represent data from a larger sample size or from a
+ randomized selection of documents, you can change the default behavior. However,
+ a larger sample size might slow down the performance of the matrix and a
+ randomized selection might put more load on the cluster due to the more
+ intensive query.
+ --
.. Choose a training percent of `10` which means it randomly selects 10% of the
source data for training. While that value is low for this example, for many
large data sets using a small training sample greatly reduces runtime without
@@ -129,8 +147,8 @@ analysis. In {kib}, the index name matches the job ID by default. It will
contain a copy of the source index data where each document is annotated with
the results. If the index does not exist, it will be created automatically.
.. Use default values for all other options.
-
-
+ +
+ --
.API example
[%collapsible]
====