Commit 54064a1

Unsigned long 64bits (#62892)

Introduce 64-bit unsigned long field type

This field type supports
- indexing of integer values from [0, 18446744073709551615]
- precise queries (term, range)
- precise sort and terms aggregations
- other aggregations are based on conversion of long values to double and can be imprecise for large values.

Backport for #60050
Closes #32434
1 parent a43f29c commit 54064a1

33 files changed: +2603 -30 lines changed

docs/reference/mapping/types/numeric.asciidoc

+4 -1
@@ -15,6 +15,7 @@ The following numeric types are supported:
 `float`:: A single-precision 32-bit IEEE 754 floating point number, restricted to finite values.
 `half_float`:: A half-precision 16-bit IEEE 754 floating point number, restricted to finite values.
 `scaled_float`:: A floating point number that is backed by a `long`, scaled by a fixed `double` scaling factor.
+`unsigned_long`:: An unsigned 64-bit integer with a minimum value of 0 and a maximum value of +2^64^-1+.
 
 Below is an example of configuring a mapping with numeric fields:
 
@@ -115,7 +116,7 @@ The following parameters are accepted by numeric types:
 <<coerce,`coerce`>>::
 
     Try to convert strings to numbers and truncate fractions for integers.
-    Accepts `true` (default) and `false`.
+    Accepts `true` (default) and `false`. Not applicable for `unsigned_long`.
 
 <<mapping-boost,`boost`>>::
 
@@ -169,3 +170,5 @@ The following parameters are accepted by numeric types:
     sorting) will behave as if the document had a value of +2.3+. High values
     of `scaling_factor` improve accuracy but also increase space requirements.
     This parameter is required.
+
+include::unsigned_long.asciidoc[]
@@ -0,0 +1,116 @@
+[role="xpack"]
+[testenv="basic"]
+
+[[unsigned-long]]
+=== Unsigned long data type
+Unsigned long is a numeric field type that represents an unsigned 64-bit
+integer with a minimum value of 0 and a maximum value of +2^64^-1+
+(from 0 to 18446744073709551615 inclusive).
+
+[source,console]
+--------------------------------------------------
+PUT my_index
+{
+  "mappings": {
+    "properties": {
+      "my_counter": {
+        "type": "unsigned_long"
+      }
+    }
+  }
+}
+--------------------------------------------------
+
+Unsigned long can be indexed in a numeric or string form,
+representing integer values in the range [0, 18446744073709551615].
+They can't have a decimal part.
+
+[source,console]
+--------------------------------
+POST /my_index/_bulk?refresh
+{"index":{"_id":1}}
+{"my_counter": 0}
+{"index":{"_id":2}}
+{"my_counter": 9223372036854775808}
+{"index":{"_id":3}}
+{"my_counter": 18446744073709551614}
+{"index":{"_id":4}}
+{"my_counter": 18446744073709551615}
+--------------------------------
+//TEST[continued]
+
+Term queries accept any numbers in a numeric or string form.
+
+[source,console]
+--------------------------------
+GET /my_index/_search
+{
+  "query": {
+    "term" : {
+      "my_counter" : 18446744073709551615
+    }
+  }
+}
+--------------------------------
+//TEST[continued]
+
+Range query terms can contain values with decimal parts.
+In this case {es} converts them to integer values:
+`gte` and `gt` terms are converted to the nearest integer up inclusive,
+and `lt` and `lte` ranges are converted to the nearest integer down inclusive.
+
+It is recommended to pass ranges as strings to ensure they are parsed
+without any loss of precision.
+
+[source,console]
+--------------------------------
+GET /my_index/_search
+{
+  "query": {
+    "range" : {
+      "my_counter" : {
+        "gte" : "9223372036854775808.5",
+        "lte" : "18446744073709551615"
+      }
+    }
+  }
+}
+--------------------------------
+//TEST[continued]
+
+
+For queries with sort on an `unsigned_long` field,
+for a particular document {es} returns a sort value of the type `long`
+if the value of this document is within the range of long values,
+or of the type `BigInteger` if the value exceeds this range.
+
+NOTE: REST clients need to be able to handle big integer values
+in JSON to support this field type correctly.
+
+[source,console]
+--------------------------------
+GET /my_index/_search
+{
+  "query": {
+    "match_all" : {}
+  },
+  "sort" : {"my_counter" : "desc"}
+}
+--------------------------------
+//TEST[continued]
+
+Similarly to sort values, script values of an `unsigned_long` field
+return a `Number` representing a `Long` or `BigInteger`.
+The same values: `Long` or `BigInteger` are used for `terms` aggregations.
+
+==== Queries with mixed numeric types
+
+Searches with mixed numeric types one of which is `unsigned_long` are
+supported, except queries with sort. Thus, a sort query across two indexes
+where the same field name has an `unsigned_long` type in one index,
+and `long` type in another, doesn't produce correct results and must
+be avoided. If there is a need for such kind of sorting, script based sorting
+can be used instead.
+
+Aggregations across several numeric types one of which is `unsigned_long` are
+supported. In this case, values are converted to the `double` type.
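
The rounding rule for range bounds described in the new documentation can be illustrated outside of Elasticsearch. The following is only a sketch under stated assumptions, not code from this commit: the class `RangeBoundRounding` and its methods are hypothetical names, and the bound is assumed to arrive as a decimal string. Lower bounds (`gte`, `gt`) with a decimal part round up and upper bounds (`lte`, `lt`) round down. The last line of `main` shows why the documentation recommends strings: parsing the same bound through a `double` loses the fractional part.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

// Hypothetical helper for illustration; not part of Elasticsearch.
final class RangeBoundRounding {

    // Lower bound with a decimal part: round up to the next integer.
    static BigInteger roundLowerBound(String bound) {
        return new BigDecimal(bound).setScale(0, RoundingMode.CEILING).toBigIntegerExact();
    }

    // Upper bound with a decimal part: round down to the previous integer.
    static BigInteger roundUpperBound(String bound) {
        return new BigDecimal(bound).setScale(0, RoundingMode.FLOOR).toBigIntegerExact();
    }

    // What happens if the bound is parsed as a double first (lossy for large values).
    static BigInteger viaDouble(String bound) {
        return new BigDecimal(Double.parseDouble(bound)).toBigInteger();
    }

    public static void main(String[] args) {
        System.out.println(roundLowerBound("9223372036854775808.5")); // 9223372036854775809
        System.out.println(roundUpperBound("18446744073709551615"));  // 18446744073709551615
        System.out.println(viaDouble("9223372036854775808.5"));       // 9223372036854775808
    }
}
```

With exact decimal parsing, the `gte` bound from the documented range query excludes 9223372036854775808; the `double` detour would wrongly include it.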

modules/lang-painless/src/main/java/org/elasticsearch/painless/Def.java

+10 -3
@@ -29,6 +29,7 @@
 import java.lang.invoke.MethodHandle;
 import java.lang.invoke.MethodHandles;
 import java.lang.invoke.MethodType;
+import java.math.BigInteger;
 import java.time.ZonedDateTime;
 import java.util.BitSet;
 import java.util.Collections;
@@ -734,6 +735,8 @@ public static double defTodoubleImplicit(final Object value) {
             return (float)value;
         } else if (value instanceof Double) {
             return (double)value;
+        } else if (value instanceof BigInteger) {
+            return ((BigInteger)value).doubleValue();
         } else {
             throw new ClassCastException("cannot implicitly cast " +
                     "def [" + PainlessLookupUtility.typeToUnboxedType(value.getClass()).getCanonicalName() + "] to " +
@@ -866,7 +869,8 @@ public static double defTodoubleExplicit(final Object value) {
                 value instanceof Integer ||
                 value instanceof Long ||
                 value instanceof Float ||
-                value instanceof Double
+                value instanceof Double ||
+                value instanceof BigInteger
         ) {
             return ((Number)value).doubleValue();
         } else {
@@ -1004,7 +1008,9 @@ public static Double defToDoubleImplicit(final Object value) {
         } else if (value instanceof Float) {
             return (double)(float)value;
         } else if (value instanceof Double) {
-            return (Double)value;
+            return (Double) value;
+        } else if (value instanceof BigInteger) {
+            return ((BigInteger)value).doubleValue();
         } else {
             throw new ClassCastException("cannot implicitly cast " +
                     "def [" + PainlessLookupUtility.typeToUnboxedType(value.getClass()).getCanonicalName() + "] to " +
@@ -1151,7 +1157,8 @@ public static Double defToDoubleExplicit(final Object value) {
                 value instanceof Integer ||
                 value instanceof Long ||
                 value instanceof Float ||
-                value instanceof Double
+                value instanceof Double ||
+                value instanceof BigInteger
         ) {
             return ((Number)value).doubleValue();
         } else {
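
The `BigInteger` branches added to the `def`-to-`double` casts all go through `BigInteger.doubleValue()`. A `double` carries 53 significant bits, so values near the top of the unsigned 64-bit range cannot all be represented. The sketch below is a standalone illustration with a hypothetical class name (`BigIntegerToDouble`); it is not Painless code and only mirrors the conversion call that appears in the diff.

```java
import java.math.BigInteger;

// Hypothetical standalone illustration; not part of Painless.
final class BigIntegerToDouble {

    // Same conversion call as the new cast branches: BigInteger.doubleValue().
    static double toDouble(Object value) {
        if (value instanceof BigInteger) {
            return ((BigInteger) value).doubleValue();
        }
        throw new ClassCastException("unsupported type in this sketch: " + value.getClass());
    }

    public static void main(String[] args) {
        BigInteger max = new BigInteger("18446744073709551615"); // 2^64 - 1
        BigInteger belowMax = max.subtract(BigInteger.ONE);      // 2^64 - 2

        // Two distinct unsigned_long values collapse to the same double (2^64).
        System.out.println(toDouble(max) == toDouble(belowMax)); // true

        // Values that fit in 53 bits survive the conversion unchanged.
        System.out.println(toDouble(BigInteger.valueOf(1L << 53)) == 9007199254740992.0); // true
    }
}
```

This is the same effect the commit message warns about for aggregations that convert values to `double`.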

modules/lang-painless/src/test/java/org/elasticsearch/painless/DefCastTests.java

+1
@@ -166,6 +166,7 @@ public void testdefTodoubleImplicit() {
         assertEquals((double)0, exec("def d = Long.valueOf(0); double b = d; b"));
         assertEquals((double)0, exec("def d = Float.valueOf(0); double b = d; b"));
         assertEquals((double)0, exec("def d = Double.valueOf(0); double b = d; b"));
+        assertEquals((double)0, exec("def d = BigInteger.valueOf(0); double b = d; b"));
         expectScriptThrows(ClassCastException.class, () -> exec("def d = new ArrayList(); double b = d;"));
     }
 

server/src/main/java/org/elasticsearch/action/search/SearchPhaseController.java

+31
@@ -427,6 +427,7 @@ ReducedQueryPhase reducedQueryPhase(Collection<? extends SearchPhaseResult> quer
         if (queryResults.isEmpty()) {
             throw new IllegalStateException(errorMsg);
         }
+        validateMergeSortValueFormats(queryResults);
         final QuerySearchResult firstResult = queryResults.stream().findFirst().get().queryResult();
         final boolean hasSuggest = firstResult.suggest() != null;
         final boolean hasProfileResults = firstResult.hasProfileResults();
@@ -486,6 +487,36 @@ private static InternalAggregations reduceAggs(InternalAggregation.ReduceContext
             performFinalReduce ? aggReduceContextBuilder.forFinalReduction() : aggReduceContextBuilder.forPartialReduction());
     }
 
+    /**
+     * Checks that query results from all shards have consistent unsigned_long format.
+     * Sort queries on a field that has long type in one index, and unsigned_long in another index
+     * don't work correctly. Throw an error if this kind of sorting is detected.
+     * //TODO: instead of throwing error, find a way to sort long and unsigned_long together
+     */
+    private static void validateMergeSortValueFormats(Collection<? extends SearchPhaseResult> queryResults) {
+        boolean[] ulFormats = null;
+        boolean firstResult = true;
+        for (SearchPhaseResult entry : queryResults) {
+            DocValueFormat[] formats = entry.queryResult().sortValueFormats();
+            if (formats == null) return;
+            if (firstResult) {
+                firstResult = false;
+                ulFormats = new boolean[formats.length];
+                for (int i = 0; i < formats.length; i++) {
+                    ulFormats[i] = formats[i] == DocValueFormat.UNSIGNED_LONG_SHIFTED ? true : false;
+                }
+            } else {
+                for (int i = 0; i < formats.length; i++) {
+                    // if the format is unsigned_long in one shard, and something different in another shard
+                    if (ulFormats[i] ^ (formats[i] == DocValueFormat.UNSIGNED_LONG_SHIFTED)) {
+                        throw new IllegalArgumentException("Can't do sort across indices, as a field has [unsigned_long] type " +
+                            "in one index, and different type in another index!");
+                    }
+                }
+            }
+        }
+    }
+
     /*
      * Returns the size of the requested top documents (from + size)
      */
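
The consistency check in `validateMergeSortValueFormats` can be exercised in isolation. The sketch below is a simplified, self-contained rendering of the same idea and makes several assumptions: the `Format` enum is a hypothetical stand-in for `DocValueFormat`, each shard is reduced to its array of sort formats, and the class name `SortFormatCheck` is invented for illustration. It records which sort positions are `unsigned_long` on the first shard and rejects any later shard that disagrees.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the per-shard sort format check.
final class SortFormatCheck {

    // Stand-in for DocValueFormat; only identity comparison matters here.
    enum Format { RAW, UNSIGNED_LONG_SHIFTED }

    static void validate(List<Format[]> perShardFormats) {
        boolean[] unsignedAt = null;
        for (Format[] formats : perShardFormats) {
            if (formats == null) {
                return; // mirrors the early return in the commit when a shard reports no sort formats
            }
            if (unsignedAt == null) {
                // First shard: remember which sort positions use the unsigned_long format.
                unsignedAt = new boolean[formats.length];
                for (int i = 0; i < formats.length; i++) {
                    unsignedAt[i] = formats[i] == Format.UNSIGNED_LONG_SHIFTED;
                }
            } else {
                // Later shards: every position must agree with the first shard.
                for (int i = 0; i < formats.length; i++) {
                    boolean isUnsigned = formats[i] == Format.UNSIGNED_LONG_SHIFTED;
                    if (unsignedAt[i] != isUnsigned) {
                        throw new IllegalArgumentException(
                            "sort position " + i + " is unsigned_long on one shard and a different type on another");
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        Format[] unsignedShard = { Format.UNSIGNED_LONG_SHIFTED };
        Format[] longShard = { Format.RAW };

        validate(Arrays.asList(unsignedShard, unsignedShard)); // consistent, no exception
        try {
            validate(Arrays.asList(unsignedShard, longShard));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Comparing two booleans with `!=` is equivalent to the `^` used in the commit.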

server/src/main/java/org/elasticsearch/common/io/stream/StreamInput.java

+8
@@ -51,6 +51,7 @@
 import java.io.FilterInputStream;
 import java.io.IOException;
 import java.io.InputStream;
+import java.math.BigInteger;
 import java.nio.file.AccessDeniedException;
 import java.nio.file.AtomicMoveNotSupportedException;
 import java.nio.file.DirectoryNotEmptyException;
@@ -329,6 +330,11 @@ public Long readOptionalLong() throws IOException {
         return null;
     }
 
+    public BigInteger readBigInteger() throws IOException {
+        return new BigInteger(readString());
+    }
+
+
     @Nullable
     public Text readOptionalText() throws IOException {
         int length = readInt();
@@ -741,6 +747,8 @@ public Object readGenericValue() throws IOException {
                 return readCollection(StreamInput::readGenericValue, LinkedHashSet::new, Collections.emptySet());
             case 25:
                 return readCollection(StreamInput::readGenericValue, HashSet::new, Collections.emptySet());
+            case 26:
+                return readBigInteger();
             default:
                 throw new IOException("Can't read unknown type [" + type + "]");
         }

server/src/main/java/org/elasticsearch/common/io/stream/StreamOutput.java

+6
@@ -51,6 +51,7 @@
 import java.io.FileNotFoundException;
 import java.io.IOException;
 import java.io.OutputStream;
+import java.math.BigInteger;
 import java.nio.file.AccessDeniedException;
 import java.nio.file.AtomicMoveNotSupportedException;
 import java.nio.file.DirectoryNotEmptyException;
@@ -803,6 +804,11 @@ public final void writeOptionalInstant(@Nullable Instant instant) throws IOExcep
             }
             o.writeCollection((Set<?>) v, StreamOutput::writeGenericValue);
         });
+        // TODO: improve serialization of BigInteger
+        writers.put(BigInteger.class, (o, v) -> {
+            o.writeByte((byte) 26);
+            o.writeString(v.toString());
+        });
         WRITERS = Collections.unmodifiableMap(writers);
     }
 
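
The generic-value writer and reader above serialize a `BigInteger` as a type tag followed by its decimal string. The sketch below reproduces that round trip with plain JDK streams. It is an approximation under stated assumptions: `DataOutputStream.writeUTF` and `DataInputStream.readUTF` stand in for `StreamOutput#writeString` and `StreamInput#readString`, whose wire format differs, and the class name `BigIntegerWire` is hypothetical. It also prints the size of `BigInteger.toByteArray()` for comparison, as one possible direction for the TODO in the diff; the commit itself does not say which encoding is intended.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.math.BigInteger;

// Hypothetical round-trip sketch using JDK streams instead of StreamInput/StreamOutput.
final class BigIntegerWire {

    // Same tag value the commit registers for BigInteger generic values.
    static final byte BIG_INTEGER_TAG = 26;

    static byte[] write(BigInteger value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(BIG_INTEGER_TAG);
            out.writeUTF(value.toString()); // stand-in for writeString(v.toString())
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static BigInteger read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte tag = in.readByte();
            if (tag != BIG_INTEGER_TAG) {
                throw new IOException("Can't read unknown type [" + tag + "]");
            }
            return new BigInteger(in.readUTF()); // stand-in for new BigInteger(readString())
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BigInteger max = new BigInteger("18446744073709551615");
        byte[] encoded = write(max);
        System.out.println(read(encoded).equals(max));                       // true
        System.out.println("string form bytes: " + encoded.length);
        System.out.println("two's-complement bytes: " + max.toByteArray().length);
    }
}
```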

server/src/main/java/org/elasticsearch/common/lucene/Lucene.java

+9
@@ -95,6 +95,7 @@
 import org.elasticsearch.index.fielddata.IndexFieldData;
 
 import java.io.IOException;
+import java.math.BigInteger;
 import java.text.ParseException;
 import java.util.ArrayList;
 import java.util.Arrays;
@@ -369,6 +370,8 @@ public static FieldDoc readFieldDoc(StreamInput in) throws IOException {
                         cFields[j] = in.readBoolean();
                     } else if (type == 9) {
                         cFields[j] = in.readBytesRef();
+                    } else if (type == 10) {
+                        cFields[j] = new BigInteger(in.readString());
                     } else {
                         throw new IOException("Can't match type [" + type + "]");
                     }
@@ -398,6 +401,8 @@ public static Comparable readSortValue(StreamInput in) throws IOException {
             return in.readBoolean();
         } else if (type == 9) {
             return in.readBytesRef();
+        }else if (type == 10) {
+            return new BigInteger(in.readString());
         } else {
             throw new IOException("Can't match type [" + type + "]");
         }
@@ -517,6 +522,10 @@ public static void writeSortValue(StreamOutput out, Object field) throws IOExcep
         } else if (type == BytesRef.class) {
             out.writeByte((byte) 9);
             out.writeBytesRef((BytesRef) field);
+        } else if (type == BigInteger.class) {
+            //TODO: improve serialization of BigInteger
+            out.writeByte((byte) 10);
+            out.writeString(field.toString());
         } else {
             throw new IOException("Can't handle sort field value of type [" + type + "]");
         }
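
The new type tag `10` exists because, as the documentation in this commit states, a sort value for an `unsigned_long` field is a `long` when it fits and a `BigInteger` otherwise. The following sketch shows one way such a value could be produced from 64 raw bits. It is an illustration only: the class `UnsignedSortValue` is hypothetical, and it assumes the unsigned value is held in a Java `long` bit pattern, which may differ from how the field actually stores its doc values (that code is not part of this excerpt).

```java
import java.math.BigInteger;

// Hypothetical illustration of the Long-or-BigInteger sort value rule.
final class UnsignedSortValue {

    // Interprets the 64 bits of 'bits' as an unsigned integer.
    static Number toSortValue(long bits) {
        if (bits >= 0) {
            return Long.valueOf(bits); // fits in a signed long
        }
        // Top bit set: the unsigned value exceeds Long.MAX_VALUE.
        return new BigInteger(Long.toUnsignedString(bits));
    }

    public static void main(String[] args) {
        Number small = toSortValue(Long.parseUnsignedLong("9223372036854775807"));
        Number large = toSortValue(Long.parseUnsignedLong("18446744073709551615"));
        System.out.println(small.getClass().getSimpleName() + " " + small); // Long 9223372036854775807
        System.out.println(large.getClass().getSimpleName() + " " + large); // BigInteger 18446744073709551615
    }
}
```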
