@@ -182,60 +182,44 @@ different from the query's vector, 0 is used for missing dimensions
in the calculations of vector functions.


- [[random-functions]]
- ===== Random functions
- There are two predefined ways to produce random values:
- `randomNotReproducible` and `randomReproducible`.
+ [[random-score-function]]
+ ===== Random score function
+ `random_score` function generates scores that are uniformly distributed
+ from 0 up to but not including 1.

- `randomNotReproducible()` uses `java.util.Random` class
- to generate a random value of the type `long`.
- The generated values are not reproducible between requests' invocations.
+ `randomScore` function has the following syntax:
+ `randomScore(<seed>, <fieldName>)`.
+ It has a required parameter - `seed` as an integer value,
+ and an optional parameter - `fieldName` as a string value.

[source,js]
--------------------------------------------------
"script" : {
- "source" : "randomNotReproducible()"
+ "source" : "randomScore(100, '_seq_no')"
}
--------------------------------------------------
// NOTCONSOLE

-
- `randomReproducible(String seedValue, int seed)` produces
- reproducible random values of type `long`. This function requires
- more computational time and memory than the non-reproducible version.
-
- A good candidate for the `seedValue` is document field values that
- are unique across documents and already pre-calculated and preloaded
- in the memory. For example, values of the document's `_seq_no` field
- is a good candidate, as documents on the same shard have unique values
- for the `_seq_no` field.
+ If the `fieldName` parameter is omitted, the internal Lucene
+ document ids will be used as a source of randomness. This is very efficient,
+ but unfortunately not reproducible since documents might be renumbered
+ by merges.

[source,js]
--------------------------------------------------
"script" : {
- "source" : "randomReproducible(Long.toString(doc['_seq_no'].value), 100)"
+ "source" : "randomScore(100)"
}
--------------------------------------------------
// NOTCONSOLE


- A drawback of using `_seq_no` is that generated values change if
- documents are updated. Another drawback is not absolute uniqueness, as
- documents from different shards with the same sequence numbers
- generate the same random values.
-
- If you need random values to be distinct across different shards,
- you can use a field with unique values across shards,
- such as `_id`, but watch out for the memory usage as all
- these unique values need to be loaded into memory.
-
- [source,js]
- --------------------------------------------------
- "script" : {
- "source" : "randomReproducible(doc['_id'].value, 100)"
- }
- --------------------------------------------------
- // NOTCONSOLE
+ Note that documents that are within the same shard and have the
+ same value for field will get the same score, so it is usually desirable
+ to use a field that has unique values for all documents across a shard.
+ A good default choice might be to use the `_seq_no`
+ field, whose only drawback is that scores will change if the document is
+ updated since update operations also update the value of the `_seq_no` field.


[[decay-functions]]
@@ -349,8 +333,8 @@ the following script:

===== `random_score`

- Use `randomReproducible` and `randomNotReproducible` functions
- as described in <<random-functions, random functions>>.
+ Use `randomScore` function
+ as described in <<random-score-function, random score function>>.


===== `field_value_factor`
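To illustrate the reproducibility note in the added text (same seed and same field value give the same score; an update that changes `_seq_no` changes the score), here is a minimal Python sketch. It is not Elasticsearch's implementation: `random_score_sketch` is a hypothetical helper, and SHA-256 is an assumed stand-in for whatever hash the engine actually uses. It only shows how a seed plus a per-document value can be mapped deterministically into the range from 0 up to but not including 1.

```python
import hashlib


def random_score_sketch(seed: int, field_value: str) -> float:
    """Hypothetical stand-in for a seeded, field-based random score.

    Assumption: SHA-256 is used here only for illustration; it is not
    claimed to be the hash that Elasticsearch uses internally.
    """
    digest = hashlib.sha256(f"{seed}:{field_value}".encode("utf-8")).digest()
    # Keep 53 bits so the integer is exactly representable as a float,
    # then scale into [0, 1).
    value = int.from_bytes(digest[:8], "big") >> 11
    return value / (1 << 53)


# Same seed and same field value: the score is reproducible.
first = random_score_sketch(100, "42")
second = random_score_sketch(100, "42")

# A document update that bumps the field value (as happens with `_seq_no`)
# yields a different score.
after_update = random_score_sketch(100, "43")
```

In this sketch two documents with the same field value necessarily collide on score, which mirrors why the added text recommends a field with unique values across a shard.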